SlideShare a Scribd company logo
THE PATH FORWARD 
TITAN 1.0 + TINKERPOP3 
MARKO A. RODRIGUEZ 
http://THINKAURELIUS.COM 
http://TINKERPOP.COM
PART 1: 
INTRODUCTION TO GRAPHS
VERTEX
software
software 
LABEL
name:gremlin 
software
PROPERTY 
KEY VALUE 
name:gremlin 
software
name:gremlin 
use:queries 
software
name:gremlin 
use:queries 
software 
name:titan 
use:database 
software
name:gremlin 
use:queries 
software 
name:titan 
use:database 
software 
dependsOn
name:gremlin 
use:queries 
software 
name:titan 
use:database 
software 
EDGE dependsOn
name:gremlin 
use:queries 
software 
name:titan 
use:database 
software 
dependsOn 
since:0.1
GRAPH
COMPUTING 
PROCESS STRUCTURE
COMPUTING 
PROCESS STRUCTURE 
TRAVERSAL GRAPH
TRAVERSAL GRAPH 
PROCESS STRUCTURE 
COMPUTING GRAPH-BASED 
COMPUTING
WhY GRAPH-BASED COMPUTING?
WhY GRAPH-BASED COMPUTING? 
INTUITIVE MODELING
WhY GRAPH-BASED COMPUTING? 
INTUITIVE MODELING 
EXPRESSIVE QUERYING
WhY GRAPH-BASED COMPUTING? 
INTUITIVE MODELING 
EXPRESSIVE QUERYING 
NUMEROUS ANALYSES 
Mixing Patterns 
Centrality 
Ranking 
Path Expressions 
Geodesics 
Inference 
Motifs 
Scoring
ANALYSES ARE THE 
EPIPHENOMENA OF TRAVERSAL 
f( )→
WHAT IS THE SIGNIFICANCE OF 
GRAPH ANALYSIS?
ANALYSES YIELD 
INSIGHTS ABOUT THE MODEL 
= 
DATA 
PRODUCTS 
DATA-DRIVEN 
DECISION SUPPORT
RECOMMENDATION 
People you may know. 
Products you might like. 
Movies you should watch and 
the friends you should watch them with. 
SOCIAL GRAPH 
RATINGS GRAPH 
SOCIAL+RATINGS 
GRAPH
WHO ELSE MIGHT HERCULES KNOW? 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
gremlin> hercules 
==>v[0]
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
gremlin> hercules.out('knows') 
==>v[1] 
==>v[2] 
==>v[3]
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
gremlin> hercules.out('knows').out('knows') 
==>v[4] 
==>v[5] 
==>v[5] 
==>v[6] 
==>v[5]
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
gremlin> hercules.out('knows').out('knows').groupCount() 
==>v[4]=1 
==>v[5]=3 
==>v[6]=1
HERCULES PROBABLY KNOWS NEPTUNE 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
knows
HERCULES PROBABLY KNOWS NEPTUNE 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
...PROBABLY MORE SO WHEN OTHER 
TYPES OF EDGES ARE ANALYZED
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
likes
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
likes 
SOCIAL GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
SOCIAL GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
likes 
likes 
SOCIAL GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
likes 
likes 
tartarus 
8 
SOCIAL GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
likes 
likes 
tartarus 
8 
likes likes 
dislikes 
SOCIAL GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
likes 
likes 
tartarus 
8 
likes likes 
dislikes 
SOCIAL GRAPH 
RATINGS GRAPH
human flesh 
hercules 
1 
0 2 
3 
4 
5 
6 
cerberus 
nemean 
hydra 
knows 
knows 
knows 
pluto 
neptune 
jupiter 
knows 
knows 
knows 
knows 
knows 
father 
brother 
7 
likes 
likes 
likes 
composedOf 
tartarus 
8 
likes likes 
dislikes 
NEMEAN MIGHT LIKE TARTARUS 
smellsOf 
SOCIAL GRAPH 
RATINGS GRAPH 
PRODUCT GRAPH 
* Collaborative Filtering + Content-Based Recommendation
PATH FINDING 
How is this person related to this film? 
Which authors of this book also 
wrote a New York Times bestseller? 
Which movies are based on a book by a 
New York Times bestseller? 
MOVIE GRAPH 
BOOK GRAPH 
MOVIE+BOOK 
GRAPH
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
gremlin> hercules.match('hercules', 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie'), 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie'), 
g.of().as('movie').out('hasActor').as('actor'), 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie'), 
g.of().as('movie').out('hasActor').as('actor'), 
g.of().as('actor').out('role').as('hercules'), 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie'), 
g.of().as('movie').out('hasActor').as('actor'), 
g.of().as('actor').out('role').as('hercules'), 
g.of().as('actor').out('actor').as('star')) 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie'), 
g.of().as('movie').out('hasActor').as('actor'), 
g.of().as('actor').out('role').as('hercules'), 
g.of().as('actor').out('actor').as('star')) 
.select(['movie','star']) 
==>[movie:v[7], star:v[9]] 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
WHO PLAYED HERCULES 
IN WHAT MOVIE? 
hercules 
0 
arnold 
schwarzenegger 
role 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
gremlin> hercules.match('hercules', 
g.of().as('hercules').out('depictedIn').as('movie') 
g.of().as('movie').out('hasActor').as('actor') 
g.of().as('actor').out('role').as('hercules') 
g.of().as('actor').out('actor').as('star')) 
.select(['movie','star']){it.name} 
==>[movie:hercules in new york, star:arnold schwarzenegger] 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
depictedIn
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy
albuquerque 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy 
14 
livesIn
albuquerque 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy 
14 
livesIn 
15 
25-North 
santa fe
albuquerque 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
marko 
rodriguez 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy 
14 
livesIn 
15 
25-North 
santa fe 
16 
livesIn
albuquerque 
hercules 
0 
arnold 
schwarzenegger 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
marko 
rodriguez 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy 
14 
livesIn 
15 
25-North 
santa fe 
16 
livesIn 
thinksHeIs
albuquerque 
hercules 
0 
arnold 
schwarzenegger 
BOOK GRAPH 
hasActor 
jupiter 
10 7 
8 
depictedIn 
hercules in 
new york 
actor 
role 
9 
hasActor 
6 
role 
marko 
rodriguez 
ernest 
graves 
actor 
11 
depictedIn 
12 the arms of 
hercules 
fred 
saberhagen 
13 
depictedIn 
writtenBy 
14 
livesIn 
15 
25-North 
santa fe 
16 
livesIn 
thinksHeIs 
MOVIE GRAPH 
TRANSPORTATION GRAPH 
PROFILE 
GRAPH
SOCIAL INFLUENCE 
Who are the most influential people in 
java, mathematics, art, surreal art, politics, ...? 
Which region of the social graph will propagate this 
advertisement this furthest? 
Which 3 experts should review this submitted article? 
Which people should I talk to at the upcoming 
conference and what topics should 
I talk to them about? 
SOCIAL + COMMUNICATION + EXPERTISE + EVENT GRAPH
PATTERN IDENTIFICATION 
This connectivity pattern is a sign of financial fraud. 
When this motif is found, a red flag will be raised. 
TRANSACTION GRAPH 
Healthy discourse is typified by a discussion board 
with a branch factor in this range and a concept 
clique score in this range. 
DISCUSSION GRAPH
KNOWLEDGE DISCOVERY 
The terms "ice", "fans", "stanley cup," 
are classified as "sports" 
Given that all identified birds fly, 
it can be deduced that all birds fly. 
If contrary evidence is provided, 
then this "fact" can be retracted. 
WIKIPEDIA GRAPH 
EVIDENTIAL LOGIC GRAPH
WORLD MODEL
WORLD PROCESSES 
WORLD MODEL
WORLD PROCESSES 
WORLD MODEL 
A single world model and various types of traversers 
moving through that model to solve problems.
TRAVERSAL GRAPH 
PROCESS STRUCTURE 
COMPUTING GRAPH-BASED 
COMPUTING
PART 2: 
THE PATH TO TITAN 1.0
WhY CREATE TITAN? 
A number of Aurelius' clients... 
...need to represent and process 
graphs at the 100+ billion edge 
scale w/ thousands of concurrent 
transactions. 
...need both local graph traversals 
(OLTP) and batch graph 
processing (OLAP). 
...desire a free, open source 
distributed graph database.
TITAN's KEY FEATURES 
Titan provides... 
..."infinite size" graphs and 
"unlimited" users by means of a 
distributed storage engine. 
...real-time local traversals (OLTP) 
and support for global batch 
processing via Hadoop (OLAP). 
...distribution via the liberal, free, 
open source Apache2 license.
-OR-BACKEND 
AGNOSTIC
TITAN DISTRIBUTED 
VIA CASSANDRA 
titan$ bin/gremlin.sh 
,,,/ 
(o o) 
-----oOOo-(_)-oOOo----- 
gremlin> conf = new BaseConfiguration(); 
==>org.apache.commons.configuration.BaseConfiguration@763861e6 
gremlin> conf.setProperty("storage.backend","cassandra"); 
gremlin> conf.setProperty("storage.hostname","77.77.77.77"); 
gremlin> g = TitanFactory.open(conf); 
==>titangraph[cassandra:77.77.77.77] 
gremlin>
INHERITED FEATURES 
Continuously available with no single point of failure. 
No write bottlenecks to the graph as there is no master/slave architecture. 
Built-in replication ensures data is available during machine failure. 
Caching layer ensures that continuously accessed data is available in memory. 
Elastic scalability allows for the introduction and removal of machines. 
Cassandra available at http://cassandra.apache.org/
TITAN DISTRIBUTED 
titan$ bin/gremlin.sh 
,,,/ 
(o o) 
VIA HBASE 
-----oOOo-(_)-oOOo----- 
gremlin> conf = new BaseConfiguration(); 
==>org.apache.commons.configuration.BaseConfiguration@763861e6 
gremlin> conf.setProperty("storage.backend","hbase"); 
gremlin> conf.setProperty("storage.hostname","77.77.77.77"); 
gremlin> g = TitanFactory.open(conf); 
==>titangraph[hbase:77.77.77.77] 
gremlin>
INHERITED FEATURES 
Strictly consistent reads and writes. 
Linear scalability with the addition of machines. 
Base classes for backing Hadoop MapReduce jobs with HBase tables. 
HDFS-based data replication. 
Generally good integration with the tools in the Hadoop ecosystem. 
HBase available at http://hbase.apache.org/
Titan is all about ...
Titan is all about numerous concurrent users...
Titan is all about numerous concurrent users... 
high availability....
Titan is all about numerous concurrent users... 
high availability.... 
dynamic scalability...
VERTEX-CENTRIC INDICES 
THE SUPER NODE PROBLEM 
Natural, real-world graphs contain 
vertices of high degree. 
Even if rare, their degree ensures that 
they exist on many paths. 
Traversing a high degree vertex 
means touching numerous incident 
edges and potentially touching most 
of the graph in only a few steps.
VERTEX-CENTRIC INDICES 
A SUPER NODE SOLUTION 
A "super node" only exists from the 
vantage point of classic "textbook 
style" graphs. 
In the world of property graphs, 
intelligent disk-level filtering can 
interpret a "super node" as a more 
manageable low-degree vertex. 
Vertex-centric querying utilizes sort 
orders for speedy lookup of incident 
edges with particular qualities.
VERTEX-CENTRIC INDICES 
PUSHDOWN PREDICATES 
stars:5 
vertex.bothE() 
likes likes 
stars:2 
stars:2 
knows knows 
likes likes 
knows 
likes 
stars:3 stars:3 
8 edges
VERTEX-CENTRIC INDICES 
PUSHDOWN PREDICATES 
stars:5 
likes likes 
likes 
knows knows 
stars:3 stars:3 
likes likes 
stars:2 
stars:2 
vertex.outE() 
7 edges
VERTEX-CENTRIC INDICES 
PUSHDOWN PREDICATES 
stars:5 
likes likes 
likes 
stars:3 stars:3 
likes likes 
stars:2 
stars:2 
vertex.outE("likes") 
5 edges
VERTEX-CENTRIC INDICES 
PUSHDOWN PREDICATES 
stars:5 
likes 
1 edge 
vertex.outE("likes").has("stars",5)
VERTEX-CENTRIC INDICES 
DISK-LEVEL SORTING/INDEXING 
likes 
likes 
likes 
knows 
knows 
stars:1 
stars:2 
stars:5
VERTEX-CENTRIC INDICES 
DISK-LEVEL SORTING/INDEXING 
likes 
likes 
likes 
knows 
knows 
likes 
knows 
stars:1 
stars:2 
stars:5
VERTEX-CENTRIC INDICES 
DISK-LEVEL SORTING/INDEXING 
likes 
likes 
likes 
knows 
knows 
likes w/ time 1-3 
knows 
stars:1 
stars:2 
stars:5 
likes w/ time 4-5
VERTEX-CENTRIC INDICES 
DISK-LEVEL SORTING/INDEXING 
likes 
likes 
likes 
knows 
knows 
likes w/ time 1-3 
knows 
schema = g.openManagementSystem() 
stars = schema 
.makePropertyKey("stars") 
.dataType(Integer.class) 
.make() 
likes = schema 
.makeEdgeLabel('likes') 
.make() 
schema.buildEdgeIndex(likes, 
'timeIdx',BOTH,DESC,stars) 
stars:1 
stars:2 
stars:5 
likes w/ time 4-5
THE TITAN OLAP STORY 
0 
1 
2 
3 
A 
B 
A 
C 
a:b 
c:d 
e:f 
g:h 
i:j 
4 
5 
6 
7 
vertex 
id props 
incoming edges outgoing edges 
edge 
id props label id 
edge 
id props label id 
edge 
id label id 
edge 
id label id 
1 e:f 4 c:d A 2 5 B 0 6 g:h A 3 7 C 3
0 
1 
3 
4 
5 
6 
7 
8 
9 
10 
11 
AN ADJACENCY LIST
AN ADJACENCY LIST 
+ 
CLUSTER 
127.0.0.2 127.0.0.3 127.0.0.4 
0 
1 
3 
4 
5 
6 
7 
8 
9 
10 
11
A DISTRIBUTED ADJACENCY LIST 
0 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
127.0.0.2 127.0.0.3 127.0.0.4
HADOOP 
Hadoop is a distributed computing platform composed of two key components: 
HDFS: 
A distributed file system that stores arbitrarily large files within a cluster. 
MapReduce: 
A parallel functional computing model for key/value pair data. 
http://hadoop.apache.org
FAUNUS AND HADOOP 
0 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
Process 
Structure 
127.0.0.2 127.0.0.3 127.0.0.4 
Faunus provides graph input/output formats (structure) 
and a traversal language for graphs (process).
THE TITAN OLAP STORY 
Serial Key/Value Data Structure Indexed Key/Indexed Value Data Structure 
0 
1 
3 
4 
5 
6 
7 
8 
9 
10 
11 
0 
1 
3 
4 
5 
6 
7 
8 
9 
10 
11
Adam Jacobs. 2009. The Pathologies of Big Data. Communications of the ACM 52, 8 (August 2009), 36-44. 
doi:10.1145/1536616.1536632 http://doi.acm.org/10.1145/1536616.1536632
INTRA-CLUSTER CONFIGURATION 
Titan/Cassandra Faunus/Hadoop 
Data is processed on the machine where it is located. 
Limited network communication.
INTER-CLUSTER CONFIGURATION 
Graph data is offloaded to another cluster. 
Repeated analysis does not interfere with production graph database.
THE TITAN OLAP STORY 
Titan currently provides a Hadoop OLAP engine 
Titan-Hadoop (nicknamed Faunus) 
Titan 1.0 will also provide an in-memory, off head, single-machine OLAP engine. 
Titan-Memory (nicknamed Fulgora)
PART 3: 
THE PATH TO TINKERPOP3
TINKERPOP3 FEATURES 
A standard, vendor-agnostic graph query language. 
Both OLTP and OLAP engines for evaluating queries. 
A way for developers to create custom, domain specific variants. 
A server system for pushing queries to the server for evaluation. 
A vertex-centric computing framework for distributed graph algorithms.
gremlin>
gremlin> gremlin = g.addVertex(label,'software','name','gremlin') 
==>v[0] 
name:gremlin 
0 
software
gremlin> gremlin = g.addVertex(label,'software','name','gremlin') 
==>v[0] 
gremlin> titan = g.addVertex(label,'software','name','titan') 
==>v[2] 
name:titan name:gremlin 
0 
software 
2 
software
gremlin> gremlin = g.addVertex(label,'software','name','gremlin') 
==>v[0] 
gremlin> titan = g.addVertex(label,'software','name','titan') 
==>v[2] 
gremlin> titan.addEdge('dependsOn',gremlin,'since','0.1') 
==>e[4][2-dependsOn->0] 
name:titan name:gremlin 
0 
software 
2 
software 
dependsOn 
since:0.1
Gremlin is a style of graph traversal that is 
heavily influenced by functional programming. 
gremlin> gremlin = g.addVertex(label,'software','name','gremlin') 
==>v[0] 
gremlin> titan = g.addVertex(label,'software','name','titan') 
==>v[2] 
gremlin> titan.addEdge('dependsOn',gremlin,'since','0.1') 
==>e[4][2-dependsOn->0] 
////////////////////////////////////////////////////////////////// 
gremlin> g.V 
==>v[0] 
==>v[2] 
gremlin> g.V.has('name','titan') 
==>v[2] 
gremlin> g.V.has('name','titan').outE 
==>e[4][2-dependsOn->0] 
gremlin> g.V.has('name','titan').outE.since 
==>0.1 
gremlin> g.V.has('name','titan').out.name 
==>gremlin
g.V.has('name','titan').out.name 
What are the names of the vertices outgoing adjacent to Titan?
g.V.has('name','titan').out.name 
name=gremlin 
0 
2 
name=titan 
For the graph 'g'... 
Traversals are defined using fluent, function composition.
g.V.has('name','titan').out.name 
name=gremlin 
0 
2 
name=titan 
…get all the vertices... 
Objects flow through the traversal in a lazy evaluation manner.
g.V.has('name','titan').out.name 
2 
name=titan 
…filter all vertices that don't have the name 'titan'… 
The general functions are map(), flatMap(), filter(), sideEffect(), and branch().
g.V.has('name','titan').out.name 
name=gremlin 
0 
…get all outgoing adjacent vertices… 
Any programming language can provide a Gremlin implementation.
g.V.has('name','titan').out.name 
gremlin 
…get the names of those vertices... 
Any graph system vendor can provide a binding to Gremlin.
THE GENERIC STEPS 
? 
S filter S S map E S 
flatMap 
sideEffect branch 
A 
E 
E E 
E 
E 
E 
S S S 
S 
... 
S 
? 
? 
has() 
retain() 
except() 
interval() 
simplePath() 
dedup() 
timeLimit() 
... 
back() 
fold() 
valueMap() 
match() 
orderBy() 
shuffle() 
... 
out() 
outE() 
inV() 
both() 
values() 
unfold() 
... 
aggregate() 
store() 
groupCount() 
groupBy() 
tree() 
count() 
subgraph() 
... 
jump() 
until() 
choose() 
union() 
...
TINKERPOP'S OLAP STORY 
Execute parallel, distributed Gremlin traversals. 
Execute vertex-centric message passing algorithms.
TINKERPOP'S OLAP STORY 
Execute parallel, distributed Gremlin traversals. 
Execute vertex-centric message passing algorithms. 
traversers are the messages!
OLAP 
1 
1 
1 
1 
2 
2 
2 
3 
1 
2 
3 
OLTP 
✓ Real-time queries. 
✓ Touching a subset of the graph. 
✓ Numerous concurrent queries. 
✓ Good for local graph mutations. 
✓ Long running queries. 
✓ Touching the entire of the graph. 
✓ Single user. 
✓ Good for bulk loading/mutations.
g.compute() 
GraphComputer 
Vertex 
Program 
program 
mapReduce 
distributed 
to all workers 
MapReduce 
MapReduce 
MapReduce 
worker 1 
worker 2 
worker 3 
1. Logically distribute the vertex program to all the vertices in the graph. 
2. The vertices execute the vertex program sending and receiving messages on each iteration. 
3. Any number of MapReduce jobs can follow to aggregate the computed data in the graph.
gremlin> result = g.compute().program(PageRankVertexProgram).submit() 
==>result[tinkergraph[vertices:6 edges:6],memory[size:0]] 
gremlin> result.memory.iteration 
==>30 
gremlin> result.graph 
==>tinkergraph[vertices:6 edges:6] 
gremlin> result.graph.V.map{ [it.id,it.pageRank]} 
==>[1, 0.15000000000000002] 
==>[2, 0.19250000000000003] 
==>[3, 0.4018125] 
==>[4, 0.19250000000000003] 
==>[5, 0.23181250000000003] 
==>[6, 0.15000000000000002] 
* Minor tweaks to example to make it fit nicely on a single line.
gremlin> g.V.both.both.name.groupCount() 
==>[ripple:3, peter:3, vadas:3, josh:7, lop:7, marko:7] 
gremlin> g.V.both.both.name.groupCount().submit(g.compute()) 
==>[ripple:3, peter:3, vadas:3, josh:7, lop:7, marko:7] 
OLTP 
OLAP
MULTIPLE GRAPH COMPUTERS 
FOR THE SAME GRAPH 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
gremlin> g.compute(FaunusGraphComputer) 
.program(PageRankVertexProgram) 
.submit() 
==>result[faunusgraph,memory[size:0]] 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
gremlin> g.compute(FulgoraGraphComputer) 
.program(PageRankVertexProgram) 
.submit() 
==>result[fulgoragraph,memory[size:0]]
GREMLIN SERVER
GREMLIN SERVER 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
localhost
GREMLIN SERVER 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
gremlin> g.V.has('name','titan') 
==>v[2] 
localhost
GREMLIN SERVER 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
gremlin> g.V.has('name','titan') 
==>v[2] 
gremlin> g.V.has('name','titan').out 
localhost 
1. get the titan vertex 
2. get titan's adjacents
GREMLIN SERVER 
gremlin> g = TitanFactory.open(conf) 
==>titangraph[66.77.88.99] 
gremlin> g.V.has('name','titan') 
==>v[2] 
gremlin> g.V.has('name','titan').out 
localhost 
1. get the titan vertex 
2. get titan's adjacents 
CHATTY 
Each step in the traversal is a round trip
GREMLIN SERVER
GREMLIN SERVER
GREMLIN SERVER 
gremlin> :remote connect tinkerpop.server titan.yaml 
==>connected - 66.77.88.99:8182,11.22.33.44:8182,... 
localhost
GREMLIN SERVER 
gremlin> :remote connect tinkerpop.server titan.yaml 
==>connected - 66.77.88.99:8182,11.22.33.44:8182,... 
gremlin> :> g.V.has('name','titan') 
==>v[2] 
localhost 
g.V.has('name','titan')
GREMLIN SERVER 
gremlin> :remote connect tinkerpop.server titan.yaml 
==>connected - 66.77.88.99:8182,11.22.33.44:8182,... 
gremlin> :> g.V.has('name','titan') 
==>v[2] 
gremlin> :> g.V.has('name','titan').out 
localhost 
g.V.has('name','titan').out 
LESS CHATTY 
1. Gremlin Server client knows the hash function 
to query the machine with the Titan vertex 
2. Push the traversal to the server and 
get a WebSockets iterator back
GREMLIN SERVER 
Allows non-JVM languages to interact with a remote graph. 
Supports stored procedures -- MyAlgorithms.class 
Supports WebSockets, REST, and various serialization mechanisms.
CREDITS 
PRESENTED BY 
MARKO A. RODRIGUEZ 
MANY THANKS TO 
MATTHIAS BRöCHELER 
STEPHEN MALLETTE 
DAN LAROCQUE 
DANIEL KUPPITZ 
BOB BRIODY 
BRYN COOKE 
JOSH SHINAVIER 
PAVEL YASKEVICH 
AURELIUS COMMUNITY 
TINKERPOP COMMUNITY 
KETRINA YIM

More Related Content

Viewers also liked

Graph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DBGraph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DB
Mohamed Taher Alrefaie
 
Gremlin: A Graph-Based Programming Language
Gremlin: A Graph-Based Programming LanguageGremlin: A Graph-Based Programming Language
Gremlin: A Graph-Based Programming Language
Marko Rodriguez
 
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
DataWorks Summit/Hadoop Summit
 
Solving Problems with Graphs
Solving Problems with GraphsSolving Problems with Graphs
Solving Problems with Graphs
Marko Rodriguez
 
Quantum Processes in Graph Computing
Quantum Processes in Graph ComputingQuantum Processes in Graph Computing
Quantum Processes in Graph Computing
Marko Rodriguez
 
Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3
Matthias Broecheler
 
Humanitarian Logistics
Humanitarian LogisticsHumanitarian Logistics
Humanitarian Logistics
ihab tarek
 
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Marko Rodriguez
 
Standard Treasury Series A Pitch Deck
Standard Treasury Series A Pitch DeckStandard Treasury Series A Pitch Deck
Standard Treasury Series A Pitch Deck
Zachary Townsend
 
Vertica And Spark: Connecting Computation And Data
Vertica And Spark: Connecting Computation And DataVertica And Spark: Connecting Computation And Data
Vertica And Spark: Connecting Computation And Data
Spark Summit
 
ACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and LanguageACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and Language
Marko Rodriguez
 

Viewers also liked (11)

Graph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DBGraph databases: Tinkerpop and Titan DB
Graph databases: Tinkerpop and Titan DB
 
Gremlin: A Graph-Based Programming Language
Gremlin: A Graph-Based Programming LanguageGremlin: A Graph-Based Programming Language
Gremlin: A Graph-Based Programming Language
 
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Networks
 
Solving Problems with Graphs
Solving Problems with GraphsSolving Problems with Graphs
Solving Problems with Graphs
 
Quantum Processes in Graph Computing
Quantum Processes in Graph ComputingQuantum Processes in Graph Computing
Quantum Processes in Graph Computing
 
Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3Titan: Scaling Graphs and TinkerPop3
Titan: Scaling Graphs and TinkerPop3
 
Humanitarian Logistics
Humanitarian LogisticsHumanitarian Logistics
Humanitarian Logistics
 
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...
 
Standard Treasury Series A Pitch Deck
Standard Treasury Series A Pitch DeckStandard Treasury Series A Pitch Deck
Standard Treasury Series A Pitch Deck
 
Vertica And Spark: Connecting Computation And Data
Vertica And Spark: Connecting Computation And DataVertica And Spark: Connecting Computation And Data
Vertica And Spark: Connecting Computation And Data
 
ACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and LanguageACM DBPL Keynote: The Graph Traversal Machine and Language
ACM DBPL Keynote: The Graph Traversal Machine and Language
 

More from Marko Rodriguez

mm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic Machinemm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic Machine
Marko Rodriguez
 
mm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data Typemm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data Type
Marko Rodriguez
 
Open Problems in the Universal Graph Theory
Open Problems in the Universal Graph TheoryOpen Problems in the Universal Graph Theory
Open Problems in the Universal Graph Theory
Marko Rodriguez
 
Gremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM DialGremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM Dial
Marko Rodriguez
 
Faunus: Graph Analytics Engine
Faunus: Graph Analytics EngineFaunus: Graph Analytics Engine
Faunus: Graph Analytics Engine
Marko Rodriguez
 
The Pathology of Graph Databases
The Pathology of Graph DatabasesThe Pathology of Graph Databases
The Pathology of Graph Databases
Marko Rodriguez
 
Traversing Graph Databases with Gremlin
Traversing Graph Databases with GremlinTraversing Graph Databases with Gremlin
Traversing Graph Databases with Gremlin
Marko Rodriguez
 
The Path-o-Logical Gremlin
The Path-o-Logical GremlinThe Path-o-Logical Gremlin
The Path-o-Logical Gremlin
Marko Rodriguez
 
The Gremlin in the Graph
The Gremlin in the GraphThe Gremlin in the Graph
The Gremlin in the Graph
Marko Rodriguez
 
Memoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to RedemptionMemoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to RedemptionMarko Rodriguez
 
Graph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of DataGraph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of Data
Marko Rodriguez
 
A Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network ScienceA Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network Science
Marko Rodriguez
 
The Graph Traversal Programming Pattern
The Graph Traversal Programming PatternThe Graph Traversal Programming Pattern
The Graph Traversal Programming Pattern
Marko Rodriguez
 
The Network Data Structure in Computing
The Network Data Structure in ComputingThe Network Data Structure in Computing
The Network Data Structure in ComputingMarko Rodriguez
 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly CommunityMarko Rodriguez
 
General-Purpose, Internet-Scale Distributed Computing with Linked Process
General-Purpose, Internet-Scale Distributed Computing with Linked ProcessGeneral-Purpose, Internet-Scale Distributed Computing with Linked Process
General-Purpose, Internet-Scale Distributed Computing with Linked Process
Marko Rodriguez
 
Collective Decision Making Systems: From the Ideal State to Human Eudaimonia
Collective Decision Making Systems: From the Ideal State to Human EudaimoniaCollective Decision Making Systems: From the Ideal State to Human Eudaimonia
Collective Decision Making Systems: From the Ideal State to Human Eudaimonia
Marko Rodriguez
 
Distributed Graph Databases and the Emerging Web of Data
Distributed Graph Databases and the Emerging Web of DataDistributed Graph Databases and the Emerging Web of Data
Distributed Graph Databases and the Emerging Web of Data
Marko Rodriguez
 
An Overview of Data Management Paradigms: Relational, Document, and Graph
An Overview of Data Management Paradigms: Relational, Document, and GraphAn Overview of Data Management Paradigms: Relational, Document, and Graph
An Overview of Data Management Paradigms: Relational, Document, and Graph
Marko Rodriguez
 
Graph Databases and the Future of Large-Scale Knowledge Management
Graph Databases and the Future of Large-Scale Knowledge ManagementGraph Databases and the Future of Large-Scale Knowledge Management
Graph Databases and the Future of Large-Scale Knowledge Management
Marko Rodriguez
 

More from Marko Rodriguez (20)

mm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic Machinemm-ADT: A Virtual Machine/An Economic Machine
mm-ADT: A Virtual Machine/An Economic Machine
 
mm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data Typemm-ADT: A Multi-Model Abstract Data Type
mm-ADT: A Multi-Model Abstract Data Type
 
Open Problems in the Universal Graph Theory
Open Problems in the Universal Graph TheoryOpen Problems in the Universal Graph Theory
Open Problems in the Universal Graph Theory
 
Gremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM DialGremlin 101.3 On Your FM Dial
Gremlin 101.3 On Your FM Dial
 
Faunus: Graph Analytics Engine
Faunus: Graph Analytics EngineFaunus: Graph Analytics Engine
Faunus: Graph Analytics Engine
 
The Pathology of Graph Databases
The Pathology of Graph DatabasesThe Pathology of Graph Databases
The Pathology of Graph Databases
 
Traversing Graph Databases with Gremlin
Traversing Graph Databases with GremlinTraversing Graph Databases with Gremlin
Traversing Graph Databases with Gremlin
 
The Path-o-Logical Gremlin
The Path-o-Logical GremlinThe Path-o-Logical Gremlin
The Path-o-Logical Gremlin
 
The Gremlin in the Graph
The Gremlin in the GraphThe Gremlin in the Graph
The Gremlin in the Graph
 
Memoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to RedemptionMemoirs of a Graph Addict: Despair to Redemption
Memoirs of a Graph Addict: Despair to Redemption
 
Graph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of DataGraph Databases: Trends in the Web of Data
Graph Databases: Trends in the Web of Data
 
A Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network ScienceA Perspective on Graph Theory and Network Science
A Perspective on Graph Theory and Network Science
 
The Graph Traversal Programming Pattern
The Graph Traversal Programming PatternThe Graph Traversal Programming Pattern
The Graph Traversal Programming Pattern
 
The Network Data Structure in Computing
The Network Data Structure in ComputingThe Network Data Structure in Computing
The Network Data Structure in Computing
 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly Community
 
General-Purpose, Internet-Scale Distributed Computing with Linked Process
General-Purpose, Internet-Scale Distributed Computing with Linked ProcessGeneral-Purpose, Internet-Scale Distributed Computing with Linked Process
General-Purpose, Internet-Scale Distributed Computing with Linked Process
 
Collective Decision Making Systems: From the Ideal State to Human Eudaimonia
Collective Decision Making Systems: From the Ideal State to Human EudaimoniaCollective Decision Making Systems: From the Ideal State to Human Eudaimonia
Collective Decision Making Systems: From the Ideal State to Human Eudaimonia
 
Distributed Graph Databases and the Emerging Web of Data
Distributed Graph Databases and the Emerging Web of DataDistributed Graph Databases and the Emerging Web of Data
Distributed Graph Databases and the Emerging Web of Data
 
An Overview of Data Management Paradigms: Relational, Document, and Graph
An Overview of Data Management Paradigms: Relational, Document, and GraphAn Overview of Data Management Paradigms: Relational, Document, and Graph
An Overview of Data Management Paradigms: Relational, Document, and Graph
 
Graph Databases and the Future of Large-Scale Knowledge Management
Graph Databases and the Future of Large-Scale Knowledge ManagementGraph Databases and the Future of Large-Scale Knowledge Management
Graph Databases and the Future of Large-Scale Knowledge Management
 

Recently uploaded

Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 

Recently uploaded (20)

Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 

The Path Forward

  • 1. THE PATH FORWARD TITAN 1.0 + TINKERPOP3 MARKO A. RODRIGUEZ http://THINKAURELIUS.COM http://TINKERPOP.COM
  • 7. PROPERTY KEY VALUE name:gremlin software
  • 9. name:gremlin use:queries software name:titan use:database software
  • 10. name:gremlin use:queries software name:titan use:database software dependsOn
  • 11. name:gremlin use:queries software name:titan use:database software EDGE dependsOn
  • 12. name:gremlin use:queries software name:titan use:database software dependsOn since:0.1
  • 13. GRAPH
  • 15. COMPUTING PROCESS STRUCTURE TRAVERSAL GRAPH
  • 16. TRAVERSAL GRAPH PROCESS STRUCTURE COMPUTING GRAPH-BASED COMPUTING
  • 18. WhY GRAPH-BASED COMPUTING? INTUITIVE MODELING
  • 19. WhY GRAPH-BASED COMPUTING? INTUITIVE MODELING EXPRESSIVE QUERYING
  • 20. WhY GRAPH-BASED COMPUTING? INTUITIVE MODELING EXPRESSIVE QUERYING NUMEROUS ANALYSES Mixing Patterns Centrality Ranking Path Expressions Geodesics Inference Motifs Scoring
  • 21. ANALYSES ARE THE EPIPHENOMENA OF TRAVERSAL f( )→
  • 22. WHAT IS THE SIGNIFICANCE OF GRAPH ANALYSIS?
  • 23. ANALYSES YIELD INSIGHTS ABOUT THE MODEL = DATA PRODUCTS DATA-DRIVEN DECISION SUPPORT
  • 24. RECOMMENDATION People you may know. Products you might like. Movies you should watch and the friends you should watch them with. SOCIAL GRAPH RATINGS GRAPH SOCIAL+RATINGS GRAPH
  • 25. WHO ELSE MIGHT HERCULES KNOW? hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows
  • 26. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows gremlin> hercules ==>v[0]
  • 27. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows gremlin> hercules.out('knows') ==>v[1] ==>v[2] ==>v[3]
  • 28. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows gremlin> hercules.out('knows').out('knows') ==>v[4] ==>v[5] ==>v[5] ==>v[6] ==>v[5]
  • 29. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows gremlin> hercules.out('knows').out('knows').groupCount() ==>v[4]=1 ==>v[5]=3 ==>v[6]=1
  • 30. HERCULES PROBABLY KNOWS NEPTUNE hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows knows
  • 31. HERCULES PROBABLY KNOWS NEPTUNE hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother ...PROBABLY MORE SO WHEN OTHER TYPES OF EDGES ARE ANALYZED
  • 32. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother
  • 33. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother likes
  • 34. hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother likes SOCIAL GRAPH
  • 35. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes SOCIAL GRAPH
  • 36. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes likes likes SOCIAL GRAPH
  • 37. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes likes likes tartarus 8 SOCIAL GRAPH
  • 38. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes likes likes tartarus 8 likes likes dislikes SOCIAL GRAPH
  • 39. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes likes likes tartarus 8 likes likes dislikes SOCIAL GRAPH RATINGS GRAPH
  • 40. human flesh hercules 1 0 2 3 4 5 6 cerberus nemean hydra knows knows knows pluto neptune jupiter knows knows knows knows knows father brother 7 likes likes likes composedOf tartarus 8 likes likes dislikes NEMEAN MIGHT LIKE TARTARUS smellsOf SOCIAL GRAPH RATINGS GRAPH PRODUCT GRAPH * Collaborative Filtering + Content-Based Recommendation
  • 41. PATH FINDING How is this person related to this film? Which authors of this book also wrote a New York Times bestseller? Which movies are based on a book by a New York Times bestseller? MOVIE GRAPH BOOK GRAPH MOVIE+BOOK GRAPH
  • 42. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 43. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york gremlin> hercules.match('hercules', actor role 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 44. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie'), 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 45. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie'), g.of().as('movie').out('hasActor').as('actor'), 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 46. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie'), g.of().as('movie').out('hasActor').as('actor'), g.of().as('actor').out('role').as('hercules'), 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 47. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie'), g.of().as('movie').out('hasActor').as('actor'), g.of().as('actor').out('role').as('hercules'), g.of().as('actor').out('actor').as('star')) 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 48. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie'), g.of().as('movie').out('hasActor').as('actor'), g.of().as('actor').out('role').as('hercules'), g.of().as('actor').out('actor').as('star')) .select(['movie','star']) ==>[movie:v[7], star:v[9]] 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 49. WHO PLAYED HERCULES IN WHAT MOVIE? hercules 0 arnold schwarzenegger role hasActor jupiter 10 7 8 depictedIn hercules in new york actor gremlin> hercules.match('hercules', g.of().as('hercules').out('depictedIn').as('movie') g.of().as('movie').out('hasActor').as('actor') g.of().as('actor').out('role').as('hercules') g.of().as('actor').out('actor').as('star')) .select(['movie','star']){it.name} ==>[movie:hercules in new york, star:arnold schwarzenegger] 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 50. hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 51. hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn
  • 52. hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn 12 the arms of hercules depictedIn
  • 53. hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy
  • 54. albuquerque hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy 14 livesIn
  • 55. albuquerque hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy 14 livesIn 15 25-North santa fe
  • 56. albuquerque hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role marko rodriguez ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy 14 livesIn 15 25-North santa fe 16 livesIn
  • 57. albuquerque hercules 0 arnold schwarzenegger hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role marko rodriguez ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy 14 livesIn 15 25-North santa fe 16 livesIn thinksHeIs
  • 58. albuquerque hercules 0 arnold schwarzenegger BOOK GRAPH hasActor jupiter 10 7 8 depictedIn hercules in new york actor role 9 hasActor 6 role marko rodriguez ernest graves actor 11 depictedIn 12 the arms of hercules fred saberhagen 13 depictedIn writtenBy 14 livesIn 15 25-North santa fe 16 livesIn thinksHeIs MOVIE GRAPH TRANSPORTATION GRAPH PROFILE GRAPH
  • 59. SOCIAL INFLUENCE Who are the most influential people in java, mathematics, art, surreal art, politics, ...? Which region of the social graph will propagate this advertisement this furthest? Which 3 experts should review this submitted article? Which people should I talk to at the upcoming conference and what topics should I talk to them about? SOCIAL + COMMUNICATION + EXPERTISE + EVENT GRAPH
  • 60. PATTERN IDENTIFICATION This connectivity pattern is a sign of financial fraud. When this motif is found, a red flag will be raised. TRANSACTION GRAPH Healthy discourse is typified by a discussion board with a branch factor in this range and a concept clique score in this range. DISCUSSION GRAPH
  • 61. KNOWLEDGE DISCOVERY The terms "ice", "fans", "stanley cup," are classified as "sports" Given that all identified birds fly, it can be deduced that all birds fly. If contrary evidence is provided, then this "fact" can be retracted. WIKIPEDIA GRAPH EVIDENTIAL LOGIC GRAPH
  • 64. WORLD PROCESSES WORLD MODEL A single world model and various types of traversers moving through that model to solve problems.
  • 65. TRAVERSAL GRAPH PROCESS STRUCTURE COMPUTING GRAPH-BASED COMPUTING
  • 66. PART 2: THE PATH TO TITAN 1.0
  • 67. WhY CREATE TITAN? A number of Aurelius' clients... ...need to represent and process graphs at the 100+ billion edge scale w/ thousands of concurrent transactions. ...need both local graph traversals (OLTP) and batch graph processing (OLAP). ...desire a free, open source distributed graph database.
  • 68. TITAN's KEY FEATURES Titan provides... ..."infinite size" graphs and "unlimited" users by means of a distributed storage engine. ...real-time local traversals (OLTP) and support for global batch processing via Hadoop (OLAP). ...distribution via the liberal, free, open source Apache2 license.
  • 70. TITAN DISTRIBUTED VIA CASSANDRA titan$ bin/gremlin.sh ,,,/ (o o) -----oOOo-(_)-oOOo----- gremlin> conf = new BaseConfiguration(); ==>org.apache.commons.configuration.BaseConfiguration@763861e6 gremlin> conf.setProperty("storage.backend","cassandra"); gremlin> conf.setProperty("storage.hostname","77.77.77.77"); gremlin> g = TitanFactory.open(conf); ==>titangraph[cassandra:77.77.77.77] gremlin>
  • 71. INHERITED FEATURES Continuously available with no single point of failure. No write bottlenecks to the graph as there is no master/slave architecture. Built-in replication ensures data is available during machine failure. Caching layer ensures that continuously accessed data is available in memory. Elastic scalability allows for the introduction and removal of machines. Cassandra available at http://cassandra.apache.org/
  • 72. TITAN DISTRIBUTED titan$ bin/gremlin.sh ,,,/ (o o) VIA HBASE -----oOOo-(_)-oOOo----- gremlin> conf = new BaseConfiguration(); ==>org.apache.commons.configuration.BaseConfiguration@763861e6 gremlin> conf.setProperty("storage.backend","hbase"); gremlin> conf.setProperty("storage.hostname","77.77.77.77"); gremlin> g = TitanFactory.open(conf); ==>titangraph[hbase:77.77.77.77] gremlin>
  • 73. INHERITED FEATURES Strictly consistent reads and writes. Linear scalability with the addition of machines. Base classes for backing Hadoop MapReduce jobs with HBase tables. HDFS-based data replication. Generally good integration with the tools in the Hadoop ecosystem. HBase available at http://hbase.apache.org/
  • 74. Titan is all about ...
  • 75. Titan is all about numerous concurrent users...
  • 76. Titan is all about numerous concurrent users... high availability....
  • 77. Titan is all about numerous concurrent users... high availability.... dynamic scalability...
  • 78. VERTEX-CENTRIC INDICES THE SUPER NODE PROBLEM Natural, real-world graphs contain vertices of high degree. Even if rare, their degree ensures that they exist on many paths. Traversing a high degree vertex means touching numerous incident edges and potentially touching most of the graph in only a few steps.
  • 79. VERTEX-CENTRIC INDICES A SUPER NODE SOLUTION A "super node" only exists from the vantage point of classic "textbook style" graphs. In the world of property graphs, intelligent disk-level filtering can interpret a "super node" as a more manageable low-degree vertex. Vertex-centric querying utilizes sort orders for speedy lookup of incident edges with particular qualities.
  • 80. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES stars:5 vertex.bothE() likes likes stars:2 stars:2 knows knows likes likes knows likes stars:3 stars:3 8 edges
  • 81. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES stars:5 likes likes likes knows knows stars:3 stars:3 likes likes stars:2 stars:2 vertex.outE() 7 edges
  • 82. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES stars:5 likes likes likes stars:3 stars:3 likes likes stars:2 stars:2 vertex.outE("likes") 5 edges
  • 83. VERTEX-CENTRIC INDICES PUSHDOWN PREDICATES stars:5 likes 1 edge vertex.outE("likes").has("stars",5)
  • 84. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING likes likes likes knows knows stars:1 stars:2 stars:5
  • 85. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING likes likes likes knows knows likes knows stars:1 stars:2 stars:5
  • 86. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING likes likes likes knows knows likes w/ time 1-3 knows stars:1 stars:2 stars:5 likes w/ time 4-5
  • 87. VERTEX-CENTRIC INDICES DISK-LEVEL SORTING/INDEXING likes likes likes knows knows likes w/ time 1-3 knows schema = g.openManagementSystem() stars = schema .makePropertyKey("stars") .dataType(Integer.class) .make() likes = schema .makeEdgeLabel('likes') .make() schema.buildEdgeIndex(likes, 'timeIdx',BOTH,DESC,stars) stars:1 stars:2 stars:5 likes w/ time 4-5
  • 88. THE TITAN OLAP STORY 0 1 2 3 A B A C a:b c:d e:f g:h i:j 4 5 6 7 vertex id props incoming edges outgoing edges edge id props label id edge id props label id edge id label id edge id label id 1 e:f 4 c:d A 2 5 B 0 6 g:h A 3 7 C 3
  • 89. 0 1 3 4 5 6 7 8 9 10 11 AN ADJACENCY LIST
  • 90. AN ADJACENCY LIST + CLUSTER 127.0.0.2 127.0.0.3 127.0.0.4 0 1 3 4 5 6 7 8 9 10 11
  • 91. A DISTRIBUTED ADJACENCY LIST 0 1 2 3 4 5 6 7 8 9 10 11 127.0.0.2 127.0.0.3 127.0.0.4
  • 92. HADOOP Hadoop is a distributed computing platform composed of two key components: HDFS: A distributed file system that stores arbitrarily large files within a cluster. MapReduce: A parallel functional computing model for key/value pair data. http://hadoop.apache.org
  • 93. FAUNUS AND HADOOP 0 1 2 3 4 5 6 7 8 9 10 11 Process Structure 127.0.0.2 127.0.0.3 127.0.0.4 Faunus provides graph input/output formats (structure) and a traversal language for graphs (process).
  • 94. THE TITAN OLAP STORY Serial Key/Value Data Structure Indexed Key/Indexed Value Data Structure 0 1 3 4 5 6 7 8 9 10 11 0 1 3 4 5 6 7 8 9 10 11
  • 95. Adam Jacobs. 2009. The Pathologies of Big Data. Communications of the ACM 52, 8 (August 2009), 36-44. doi:10.1145/1536616.1536632 http://doi.acm.org/10.1145/1536616.1536632
  • 96. INTRA-CLUSTER CONFIGURATION Titan/Cassandra Faunus/Hadoop Data is processed on the machine where it is located. Limited network communication.
  • 97. INTER-CLUSTER CONFIGURATION Graph data is offloaded to another cluster. Repeated analysis does not interfere with production graph database.
  • 98. THE TITAN OLAP STORY Titan currently provides a Hadoop OLAP engine Titan-Hadoop (nicknamed Faunus) Titan 1.0 will also provide an in-memory, off head, single-machine OLAP engine. Titan-Memory (nicknamed Fulgora)
  • 99. PART 3: THE PATH TO TINKERPOP3
  • 100. TINKERPOP3 FEATURES A standard, vendor-agnostic graph query language. Both OLTP and OLAP engines for evaluating queries. A way for developers to create custom, domain specific variants. A server system for pushing queries to the server for evaluation. A vertex-centric computing framework for distributed graph algorithms.
  • 102. gremlin> gremlin = g.addVertex(label,'software','name','gremlin') ==>v[0] name:gremlin 0 software
  • 103. gremlin> gremlin = g.addVertex(label,'software','name','gremlin') ==>v[0] gremlin> titan = g.addVertex(label,'software','name','titan') ==>v[2] name:titan name:gremlin 0 software 2 software
  • 104. gremlin> gremlin = g.addVertex(label,'software','name','gremlin') ==>v[0] gremlin> titan = g.addVertex(label,'software','name','titan') ==>v[2] gremlin> titan.addEdge('dependsOn',gremlin,'since','0.1') ==>e[4][2-dependsOn->0] name:titan name:gremlin 0 software 2 software dependsOn since:0.1
  • 105. Gremlin is a style of graph traversal that is heavily influenced by functional programming. gremlin> gremlin = g.addVertex(label,'software','name','gremlin') ==>v[0] gremlin> titan = g.addVertex(label,'software','name','titan') ==>v[2] gremlin> titan.addEdge('dependsOn',gremlin,'since','0.1') ==>e[4][2-dependsOn->0] ////////////////////////////////////////////////////////////////// gremlin> g.V ==>v[0] ==>v[2] gremlin> g.V.has('name','titan') ==>v[2] gremlin> g.V.has('name','titan').outE ==>e[4][2-dependsOn->0] gremlin> g.V.has('name','titan').outE.since ==>0.1 gremlin> g.V.has('name','titan').out.name ==>gremlin
  • 106. g.V.has('name','titan').out.name What are the names of the vertices outgoing adjacent to Titan?
  • 107. g.V.has('name','titan').out.name name=gremlin 0 2 name=titan For the graph 'g'... Traversals are defined using fluent, function composition.
  • 108. g.V.has('name','titan').out.name name=gremlin 0 2 name=titan …get all the vertices... Objects flow through the traversal in a lazy evaluation manner.
  • 109. g.V.has('name','titan').out.name 2 name=titan …filter all vertices that don't have the name 'titan'… The general functions are map(), flatMap(), filter(), sideEffect(), and branch().
  • 110. g.V.has('name','titan').out.name name=gremlin 0 …get all outgoing adjacent vertices… Any programming language can provide a Gremlin implementation.
  • 111. g.V.has('name','titan').out.name gremlin …get the names of those vertices... Any graph system vendor can provide a binding to Gremlin.
  • 112. THE GENERIC STEPS ? S filter S S map E S flatMap sideEffect branch A E E E E E E S S S S ... S ? ? has() retain() except() interval() simplePath() dedup() timeLimit() ... back() fold() valueMap() match() orderBy() shuffle() ... out() outE() inV() both() values() unfold() ... aggregate() store() groupCount() groupBy() tree() count() subgraph() ... jump() until() choose() union() ...
  • 113. TINKERPOP'S OLAP STORY Execute parallel, distributed Gremlin traversals. Execute vertex-centric message passing algorithms.
  • 114. TINKERPOP'S OLAP STORY Execute parallel, distributed Gremlin traversals. Execute vertex-centric message passing algorithms. traversers are the messages!
  • 115. OLAP 1 1 1 1 2 2 2 3 1 2 3 OLTP ✓ Real-time queries. ✓ Touching a subset of the graph. ✓ Numerous concurrent queries. ✓ Good for local graph mutations. ✓ Long running queries. ✓ Touching the entire of the graph. ✓ Single user. ✓ Good for bulk loading/mutations.
  • 116. g.compute() GraphComputer Vertex Program program mapReduce distributed to all workers MapReduce MapReduce MapReduce worker 1 worker 2 worker 3 1. Logically distribute the vertex program to all the vertices in the graph. 2. The vertices execute the vertex program sending and receiving messages on each iteration. 3. Any number of MapReduce jobs can follow to aggregate the computed data in the graph.
  • 117. gremlin> result = g.compute().program(PageRankVertexProgram).submit() ==>result[tinkergraph[vertices:6 edges:6],memory[size:0]] gremlin> result.memory.iteration ==>30 gremlin> result.graph ==>tinkergraph[vertices:6 edges:6] gremlin> result.graph.V.map{ [it.id,it.pageRank]} ==>[1, 0.15000000000000002] ==>[2, 0.19250000000000003] ==>[3, 0.4018125] ==>[4, 0.19250000000000003] ==>[5, 0.23181250000000003] ==>[6, 0.15000000000000002] * Minor tweaks to example to make it fit nicely on a single line.
  • 118. gremlin> g.V.both.both.name.groupCount() ==>[ripple:3, peter:3, vadas:3, josh:7, lop:7, marko:7] gremlin> g.V.both.both.name.groupCount().submit(g.compute()) ==>[ripple:3, peter:3, vadas:3, josh:7, lop:7, marko:7] OLTP OLAP
  • 119. MULTIPLE GRAPH COMPUTERS FOR THE SAME GRAPH gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] gremlin> g.compute(FaunusGraphComputer) .program(PageRankVertexProgram) .submit() ==>result[faunusgraph,memory[size:0]] gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] gremlin> g.compute(FulgoraGraphComputer) .program(PageRankVertexProgram) .submit() ==>result[fulgoragraph,memory[size:0]]
  • 121. GREMLIN SERVER gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] localhost
  • 122. GREMLIN SERVER gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] gremlin> g.V.has('name','titan') ==>v[2] localhost
  • 123. GREMLIN SERVER gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] gremlin> g.V.has('name','titan') ==>v[2] gremlin> g.V.has('name','titan').out localhost 1. get the titan vertex 2. get titan's adjacents
  • 124. GREMLIN SERVER gremlin> g = TitanFactory.open(conf) ==>titangraph[66.77.88.99] gremlin> g.V.has('name','titan') ==>v[2] gremlin> g.V.has('name','titan').out localhost 1. get the titan vertex 2. get titan's adjacents CHATTY Each step in the traversal is a round trip
  • 127. GREMLIN SERVER gremlin> :remote connect tinkerpop.server titan.yaml ==>connected - 66.77.88.99:8182,11.22.33.44:8182,... localhost
  • 128. GREMLIN SERVER gremlin> :remote connect tinkerpop.server titan.yaml ==>connected - 66.77.88.99:8182,11.22.33.44:8182,... gremlin> :> g.V.has('name','titan') ==>v[2] localhost g.V.has('name','titan')
  • 129. GREMLIN SERVER gremlin> :remote connect tinkerpop.server titan.yaml ==>connected - 66.77.88.99:8182,11.22.33.44:8182,... gremlin> :> g.V.has('name','titan') ==>v[2] gremlin> :> g.V.has('name','titan').out localhost g.V.has('name','titan').out LESS CHATTY 1. Gremlin Server client knows the hash function to query the machine with the Titan vertex 2. Push the traversal to the server and get a WebSockets iterator back
  • 130. GREMLIN SERVER Allows non-JVM languages to interact with a remote graph. Supports stored procedures -- MyAlgorithms.class Supports WebSockets, REST, and various serialization mechanisms.
  • 131. CREDITS PRESENTED BY MARKO A. RODRIGUEZ MANY THANKS TO MATTHIAS BRöCHELER STEPHEN MALLETTE DAN LAROCQUE DANIEL KUPPITZ BOB BRIODY BRYN COOKE JOSH SHINAVIER PAVEL YASKEVICH AURELIUS COMMUNITY TINKERPOP COMMUNITY KETRINA YIM