The Graph Traversal Programming Pattern

                Marko A. Rodriguez
              Graph Systems Architect
           http://markorodriguez.com
           http://twitter.com/twarko




      WindyCityDB - Chicago, Illinois – June 26, 2010

                     June 25, 2010
Abstract
A graph is a structure composed of a set of vertices (i.e. nodes, dots)
connected to one another by a set of edges (i.e. links, lines). The concept
of a graph has been around since the late 19th century; however, only in
recent decades has there been a strong resurgence in the development of
both graph theories and applications. In applied computing, since the late
1960s, the interlinked table structure of the relational database has been
the predominant information storage and retrieval paradigm. With the
growth of graph/network-based data and the need to efficiently process
such data, new data management systems have been developed. In
contrast to the index-intensive, set-theoretic operations of relational
databases, graph databases make use of index-free traversals. This
presentation will discuss the graph traversal programming pattern and its
application to problem-solving with graph databases.
Outline

• Graph Structures

• Graph Databases

• Graph Traversals
   Artificial Example
   Real-World Examples
Dots and Lines




There are dots and there are lines.
Let's call them vertices and edges, respectively.
Constructions from Dots and Lines




It's possible to arrange the dots and lines into various
configurations.
Let's call such configurations graphs.
Dots and Lines Make a Graph
The Undirected Graph

          1. Vertices
             • All vertices denote the same
               type of object.

          2. Edges
             • All edges denote the same type
               of relationship.
             • All edges denote a symmetric
               relationship.
Denoting an Undirected Structure in the Real World




A collaborator graph is an undirected graph.
A road graph is an undirected graph.
A Line with a Triangle




Dots and lines are boring.
Let's add a triangle to one side of each line.
However, let's call a triangle-tipped line a directed edge.
The Directed Graph

         1. Vertices
            • All vertices denote the same
              type of object.

         2. Edges
            • All edges denote the same type
              of relationship.
            • All edges denote an
              asymmetric relationship.
Denoting a Directed Structure in the Real World




The Twitter follow graph is a directed graph.
The Web href-citation graph is a directed graph.
Single-Relational Structures

• Without a way to demarcate edges, all edges have the same
  meaning/type. Such structures are called single-relational graphs.

• Single-relational graphs are perhaps the most common graph type
  in graph theory and network science.
How Do You Model a World with Multiple Structures?
[Figure: a graph mixing several kinds of structure. Labels shown: I-25, I-40, lives_in, is, follows, created, cites.]
The Limitations of the Single-Relational Graph

• A single-relational graph is only able to express a single type of vertex
  (e.g. person, city, user, webpage). [1]

• A single-relational graph is only able to express a single type of edge
  (e.g. collaborator, road, follows, citation). [2]

• For modelers, these are very limiting graph types.

[1] This is not completely true. All n-partite single-relational graphs allow for the division of the vertex set
into n subsets, where $V = \bigcup_{i}^{n} A_i : A_i \cap A_j = \emptyset$ for $i \neq j$. Thus, it's possible to implicitly type the vertices.
[2] This is not completely true. There exists an injective, information-preserving function that maps any
multi-relational graph to a single-relational graph, where edge types are denoted by topological structures.
Rodriguez, M.A., "Mapping Semantic Networks to Undirected Networks," International Journal of Applied
Mathematics and Computer Sciences, 5(1), pp. 39–42, 2009. [http://arxiv.org/abs/0804.0277]
The Gains of the Multi-Relational Graph
• A multi-relational graph allows for the explicit typing of edges
  (e.g. "follows," "cites," etc.).

• By labeling edges, edges can have different meanings and vertices
  can have different types.
    follows : user → user
    created : user → webpage
    cites : webpage → webpage
    ...


Increasing Expressivity with Multi-Relational Graphs
[Figure: a multi-relational graph whose edges are labeled created, cites, and follows.]
The Flexibility of the Property Graph
• A property graph extends a multi-relational graph by allowing for both
  vertices and edges to maintain a key/value property map.

• These properties are useful for expressing non-relational data (i.e. data
  not needing to be graphed).

• This allows for the further refinement of the meaning of an edge.
    Peter Neubauer created the Neo4j webpage in 2007/10.

    (name=peterneubauer) --created [date=2007/10]--> (name=neo4j, views=56781)
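
The slide's example can be sketched in a few lines of Python. This is only an illustration of the property graph idea, not the API of any graph database: the class names `Vertex` and `Edge`, the attribute names, and the helper `add_edge` are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


# eq=False / repr=False keep identity semantics and avoid recursing
# through the vertex <-> edge cycles when comparing or printing.
@dataclass(eq=False, repr=False)
class Vertex:
    properties: dict = field(default_factory=dict)  # key/value property map
    out_edges: list = field(default_factory=list)   # edges leaving this vertex
    in_edges: list = field(default_factory=list)    # edges arriving at this vertex


@dataclass(eq=False, repr=False)
class Edge:
    out_vertex: Vertex  # tail of the edge
    label: str          # edge type, e.g. "created"
    in_vertex: Vertex   # head of the edge
    properties: dict = field(default_factory=dict)  # key/value property map


def add_edge(out_vertex, label, in_vertex, **properties):
    """Create a labeled edge and register it on both of its endpoints."""
    edge = Edge(out_vertex, label, in_vertex, properties)
    out_vertex.out_edges.append(edge)
    in_vertex.in_edges.append(edge)
    return edge


# "Peter Neubauer created the Neo4j webpage in 2007/10."
peter = Vertex({"name": "peterneubauer"})
neo4j = Vertex({"name": "neo4j", "views": 56781})
created = add_edge(peter, "created", neo4j, date="2007/10")
```

The date lives on the edge rather than on either vertex, which is what lets the property map refine the meaning of that particular relationship.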
Increasing Expressivity with Property Graphs
[Figure: a property graph with edges labeled created, cites, and follows. Properties shown on vertices: name=neo4j, views=56781; page_rank=0.023; name=tenderlove, gender=male; name=peterneubauer; name=graph_blog, views=1000; name=ahzf; name=twarko, age=30. Property shown on a created edge: date=2007/10.]
Property Graph Instance Schema/Ontology

    user (vertex):     name=<string>, age=<integer>, gender=<string>
    webpage (vertex):  name=<string>, views=<integer>, page_rank=<double>
    created (edge):    user → webpage, date=<string>
    follows (edge):    user → user
    cites (edge):      webpage → webpage



There is no standard convention, but in general, specify the types of vertices and edges and the
properties they may contain. Look into the world of RDFS and OWL for more rigorous,
expressive specifications of graph-based schemas.
Property Graphs Can Model Other Graph Types
[Figure: a diagram deriving other graph types from the property graph. Graph types shown: weighted graph, labeled graph, semantic graph, directed graph, rdf graph, multi-graph, simple graph, undirected graph. Operations shown on the arrows: add weight attribute; remove attributes; remove edge labels; make labels URIs; remove directionality; remove loops, directionality, and multiple edges; no op.]

NOTE: Given that a property graph is a binary edge graph, it is difficult to model an n-ary edge graph (i.e. a hypergraph).
Outline

• Graph Structures

• Graph Databases

• Graph Traversals
   Artificial Example
   Real-World Examples
Persisting a Graph Data Structure

• A graph is a relatively simple data structure. It can be
  seen as the most fundamental data structure: something is
  related to something else.

• Almost every database can model a graph. [3]

[3] For the sake of simplicity, the following examples are with respect to a directed, single-relational graph.
However, note that property graphs can be easily modeled by such databases as well.
Representing a Graph in a Relational Database

outV | inV
-----+-----
  A  |  B
  A  |  C
  C  |  D
  D  |  A

[Figure: the directed graph A → B, A → C, C → D, D → A.]
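
As a rough sketch of how this table behaves in practice, the snippet below loads the same four rows into an in-memory SQLite database using Python's standard `sqlite3` module. The table name `edges` and the helper names are assumptions made for illustration. One traversal step is a lookup on `outV`, and two steps require a self-JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (outV TEXT, inV TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("A", "B"), ("A", "C"), ("C", "D"), ("D", "A")],
)


def out_neighbors(vertex):
    """One step: every inV whose row has the given outV."""
    rows = conn.execute(
        "SELECT inV FROM edges WHERE outV = ? ORDER BY inV", (vertex,)
    )
    return [row[0] for row in rows]


def two_steps(vertex):
    """Two steps: join the edge table to itself on e1.inV = e2.outV."""
    rows = conn.execute(
        "SELECT e2.inV FROM edges AS e1 "
        "JOIN edges AS e2 ON e1.inV = e2.outV "
        "WHERE e1.outV = ? ORDER BY e2.inV",
        (vertex,),
    )
    return [row[0] for row in rows]
```

Each additional step adds another self-JOIN, which is the cost the later experiment in this deck measures.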
Representing a Graph in a JSON Database

{
  "A": { "out": ["B", "C"], "in": ["D"] },
  "B": { "in": ["A"] },
  "C": { "out": ["D"], "in": ["A"] },
  "D": { "out": ["A"], "in": ["C"] }
}

[Figure: the same directed graph over A, B, C, and D.]
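
A small sketch of working with this document from Python, using the standard `json` module. The helper names are assumptions for illustration. It also shows that the "in" lists are redundant with the "out" lists, so a document store has to keep the two in sync on every write.

```python
import json

document = """
{
  "A": {"out": ["B", "C"], "in": ["D"]},
  "B": {"in": ["A"]},
  "C": {"out": ["D"], "in": ["A"]},
  "D": {"out": ["A"], "in": ["C"]}
}
"""
graph = json.loads(document)


def out_neighbors(vertex):
    """Vertices reachable over one outgoing edge (empty if no "out" key)."""
    return graph[vertex].get("out", [])


def in_lists_from_out_lists(graph):
    """Rebuild every "in" list from the "out" lists alone."""
    rebuilt = {vertex: [] for vertex in graph}
    for vertex, record in graph.items():
        for neighbor in record.get("out", []):
            rebuilt[neighbor].append(vertex)
    return rebuilt
```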
Representing a Graph in an XML Database

<graphml>
  <graph>
    <node id="A" />
    <node id="B" />
    <node id="C" />
    <node id="D" />
    <edge source="A" target="B" />
    <edge source="A" target="C" />
    <edge source="C" target="D" />
    <edge source="D" target="A" />
  </graph>
</graphml>

[Figure: the same directed graph over A, B, C, and D.]
Defining a Graph Database

"If any database can represent a graph, then what is a graph database?"
Defining a Graph Database

   A graph database is any storage system that
         provides index-free adjacency. [4][5]

[4] There is no "official" definition of what makes a database a graph database. The one provided is my
definition. However, hopefully the following argument will convince you that this is a necessary definition.
[5] There is adjacency between the elements of an index, but if the index is not the primary data structure
of concern (to the developer), then there is indirect/implicit adjacency, not direct/explicit adjacency. A
graph database exposes the graph as an explicit data structure (not an implicit data structure).
Defining a Graph Database

• Every element (i.e. vertex or edge) has a direct pointer to
  its adjacent elements.

• No O(log2(n)) index lookup is required to determine which
  vertex is adjacent to which other vertex.

• If the graph is connected, the graph as a whole is a single
  atomic data structure.
Defining a Graph Database by Example

[Figure: the toy graph (vertices A, B, C, D, E) and the Gremlin (stuntman) that will traverse it.]
Graph Databases and Index-Free Adjacency
[Figure: the toy graph (vertices A, B, C, D, E) with the gremlin at vertex A.]

• Our gremlin is at vertex A.
• In a graph database, vertex A has direct references to its adjacent vertices.
• The cost of moving from A to B and C is constant with respect to the size of the graph. It depends
  only on the number of edges emanating from vertex A (local).
Graph Databases and Index-Free Adjacency


[Figure: the toy graph, labeled "The Graph (explicit)".]
Non-Graph Databases and Index-Based Adjacency

[Figure: an index (entries A: B,C; B: E; C: D,E; D; E) next to the toy graph, with the gremlin at vertex A.]

• Our gremlin is at vertex A.
• In a non-graph database, the gremlin needs to look at an index to determine what
  is adjacent to A.
• The cost of moving to B and C is O(log2(n)). It depends on the total number of
  vertices and edges in the database (global).
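
The contrast between the two kinds of adjacency can be sketched in Python. The edge list below is read off the index in the figure (A: B,C; B: E; C: D,E). Everything else, including the sorted-list index, the `Vertex` class, and the function names, is an assumption made for illustration and not how any particular database is implemented.

```python
from bisect import bisect_left

EDGES = [("A", "B"), ("A", "C"), ("B", "E"), ("C", "D"), ("C", "E")]

# Index-based adjacency: one global, sorted edge index shared by all vertices.
index = sorted(EDGES)


def neighbors_via_index(vertex):
    """Binary-search the global index; the search cost grows with len(index)."""
    position = bisect_left(index, (vertex, ""))
    found = []
    while position < len(index) and index[position][0] == vertex:
        found.append(index[position][1])
        position += 1
    return found


# Index-free adjacency: each vertex object references its neighbors directly.
class Vertex:
    def __init__(self, name):
        self.name = name
        self.out = []  # direct references to adjacent Vertex objects


# The dict is only used to build the graph, not to traverse it.
vertices = {name: Vertex(name) for name in "ABCDE"}
for tail, head in EDGES:
    vertices[tail].out.append(vertices[head])


def neighbors_via_references(vertex):
    """Follow the vertex's own references; cost depends only on its degree."""
    return [neighbor.name for neighbor in vertex.out]
```

Both functions return the same neighbors. The difference is that the first consults a structure whose size is the whole edge set, while the second touches only the current vertex.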
Non-Graph Databases and Index-Based Adjacency


[Figure: the same index and toy graph, labeled "The Index (explicit)" and "The Graph (implicit)".]
Index-Free Adjacency

• While any database can implicitly represent a graph, only a
  graph database makes the graph structure explicit.

• In a graph database, each vertex serves as a "mini index"
  of its adjacent elements. [6]

• Thus, as the graph grows in size, the cost of a local step
  remains the same. [7]

[6] Each vertex can be interpreted as a "parent node" in an index with its children being its adjacent
elements. In this sense, traversing a graph is analogous in many ways to traversing an index, although the
graph is not necessarily an acyclic connected graph (tree).
[7] A graph, in many ways, is like a distributed index.
Graph Databases Make Use of Indices




[Figure: an index of vertices (by id) drawn above the graph it points into.]

• There is more to a graph database than the explicit graph structure.

• Indices index the vertices by their properties (e.g. ids).
Graph Databases and Endogenous Indices

• Many indices are trees. [8]

• A tree is a type of constrained graph. [9]

• You can represent a tree with a graph. [10]

[8] Even an "index" that is simply an O(n) container can be represented as a graph (e.g. linked list).
[9] A tree is an acyclic connected graph with each vertex having at most one parent.
[10] This follows as a consequence of a tree being a graph.
Graph Databases and Endogenous Indices
• Graph databases allow you to explicitly model indices
  endogenous to your domain model. Your indices and
  domain model are one atomic entity: a graph. [11]

• This has benefits in designing special-purpose index
  structures for your data.
        Think about all the numerous types of indices in the
        geo-spatial community. [12]
        Think about all the indices that you have yet to think
        about.

[11] Originally, Neo4j used itself as its own indexing system before moving to Lucene.
[12] Craig Taverner explores the use of graph databases in GIS-based applications.
Graph Databases and Endogenous Indices
[Figure: a name property index, a views property index, and a gender property index drawn around the property graph of users and webpages (name=neo4j, name=tenderlove, name=peterneubauer, name=graph_blog, name=ahzf, name=twarko, with created, cites, and follows edges).]
Graph Databases and Endogenous Indices
[Figure: the same diagram of property indices and property graph, now labeled as a whole "The Graph Dataset".]
Outline

• Graph Structures

• Graph Databases

• Graph Traversals
   Artificial Example
   Real-World Examples
Graph Traversals as the Foundation

• Question: Once I have my data represented as a
  graph, what can I do with it?

• Answer: You can traverse over the graph to
  solve problems.
Outline

• Graph Structures

• Graph Databases

• Graph Traversals
   Artificial Example
   Real-World Examples
Graph Database vs. Relational Database

• While any database can represent a graph, it takes time to
  make what is implicit explicit.

• The graph database represents an explicit graph.

• The experiment that follows demonstrates the problem with
  using lots of table JOINs to accomplish the effect of a
  graph traversal. [13]

[13] Though not presented in this lecture, similar results were seen with JSON document databases.
Neo4j vs. MySQL – Generating a Large Graph

• Generated a 1 million vertex/4 million edge graph with "natural statistics." [14]
• Loaded the graph into both Neo4j and MySQL in order to empirically evaluate the
  effect of index-free adjacency.

[14] What is diagrammed is a small subset of this 1 million vertex graph.
Neo4j vs. MySQL – The Experiment

• For each run of the experiment, a traverser (gremlin) is
  placed on a single vertex.

• For each step, the traverser moves to its adjacent
  vertices.
        Neo4j (graph database): the adjacent vertices are provided by the
        current vertex. [15]
        MySQL (relational database): the adjacent vertices are provided by a
        table JOIN.

• For the experiment, this process goes up to 5 steps.

[15] You can think of a graph traversal, in a graph database, as a local neighborhood JOIN.
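
The stepping procedure described above can be sketched as a short Python function. This is not the code used in the experiment; the adjacency dictionary (the toy graph from earlier in the deck) and the function name `traverse` are assumptions for illustration. Each loop iteration plays the role that one table JOIN plays in the relational version.

```python
# Toy graph from earlier in the deck: A -> B, C; B -> E; C -> D, E.
ADJACENCY = {"A": ["B", "C"], "B": ["E"], "C": ["D", "E"], "D": [], "E": []}


def traverse(adjacency, root, steps):
    """Return the vertices reached after exactly `steps` hops, one entry per path."""
    frontier = [root]
    for _ in range(steps):
        # Every traverser in the frontier moves to all of its adjacent vertices.
        frontier = [
            neighbor for vertex in frontier for neighbor in adjacency[vertex]
        ]
    return frontier
```

A vertex that is reachable along several paths appears several times in the result, so the frontier can grow quickly with each step on a dense graph.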
Neo4j vs. MySQL – The Experiment (Zoom-In Subset)
Neo4j vs. MySQL – The Experiment (Step 1)
Neo4j vs. MySQL – The Experiment (Step 2)
Neo4j vs. MySQL – The Experiment (Step 3)
Neo4j vs. MySQL – The Experiment (Step 4)
Neo4j vs. MySQL – The Experiment (Step 5)
Neo4j vs. MySQL – The Results
[Figure: "total running time (ms) for traversals of length n", plotting time (ms) against traversal length in steps (1 to 4) for mysql and neo4j, averaged over the 250 most dense vertices as root of the traversal. Annotations at lengths 1 through 4: 4.5x faster, 1.9x faster, 2.6x faster, 2.3x faster.]

• At step 5, Neo4j completed the traversal in 14 minutes.
• At step 5, MySQL was still running after 2 hours (process stopped).
Neo4j vs. MySQL – More Information
For more information on this experiment, please visit
http://markorodriguez.com/Blarko/Entries/2010/3/29_MySQL_vs._Neo4j_on_a_Large-Scale_Graph_Traversal.html
Why Use a Graph Database? – Data Locality

• If the solution to your problem can be represented as a local
  process within a larger global data structure, then a graph
  database may be the optimal solution to your problem.

• If the solution to your problem can be represented as being
  with respect to a set of root elements, then a graph
  database may be the optimal solution to your problem.

• If the solution to your problem does not require a global
  analysis of your data, then a graph database may be the
  optimal solution to your problem.
Outline

• Graph Structures

• Graph Databases

• Graph Traversals
   Artificial Example
   Real-World Examples
Some Graph Traversal Use Cases
• Local searches: "What is in the neighborhood around A?" [16]

• Local recommendations: "Given A, what should A
  include in their neighborhood?" [17]

• Local ranks: "Given A, how would you rank B relative to
  A?" [18]

[16] A can be an individual vertex or a set of vertices. This set is known as the root vertex set.
[17] Recommendation can be seen as trying to increase the connectivity of the graph by recommending
vertices (e.g. items) for another vertex (e.g. person) to extend an edge to (e.g. purchased).
[18] In this presentation, there will be no examples provided for this use case. However, note that searching,
ranking, and recommendation are presented in the WindyCityDB OpenLab Graph Database Tutorial. Other
terms for local rank are "rank with priors" or "relative rank."
Graph Traversals with Gremlin Programming Language




                                   Gremlin            G = (V, E)

                       http://gremlin.tinkerpop.com

The examples to follow are as basic as possible to get the idea across. Note that
numerous variations to the themes presented can be created. Such variations are driven
by the richness of the underlying graph data set and the desired speed of evaluation.
Graph Traversals with Gremlin Programming Language



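[Figure: a three-vertex graph. Vertex 1 --created--> vertex 3; vertex 1 --knows--> vertex 4; vertex 4 --created--> vertex 3. Vertex 4 has the properties name = peter and age = 37.]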
Graph Traversals with Gremlin Programming Language
[Figure: the same three-vertex graph annotated with terminology: the out edges of vertex 1, the in edges of vertex 3, an edge label (created), an edge's out vertex (1) and in vertex (3), and the id (4) and properties (name = peter, age = 37) of vertex 4.]
Graph Traversal in Property Graphs
[Figure: a property graph of five people (name=emil, name=alex, name=johan, name=tobias, name=peter) and five webpages (name=A, name=B, name=C, name=D, name=E) connected by follows and created edges.]




   Red vertices are people and blue vertices are webpages.
Local Search: "Who are the followers of Emil Eifrem?"
[Figure: the traversal starts at name=emil (1) and crosses the two follows edges to name=alex and name=johan (2).]
Result: name=alex, name=johan

./outE[@label='follows']/inV
Local Search: "What webpages did Emil's followers create?"
[Figure: from name=emil (1), across the follows edges (2), then across the created edges (3) of name=alex and name=johan.]
Result: name=B, name=A

./outE[@label='follows']/inV/
  outE[@label='created']/inV
Local Search: "What webpages did Emil's followers' followers create?"
[Figure: from name=emil (1), across two successive follows steps (2, 3), then across the created edges (4).]
Result: name=C, name=D, name=E, name=E

./outE[@label='follows']/inV/
  outE[@label='follows']/inV/
  outE[@label='created']/inV
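
The three local searches can be mimicked in plain Python to make the step-by-step logic concrete. The edge list below is an assumption: it is inferred from the result lists on the slides, not copied from the original figure. The helper `out` is a hypothetical stand-in for one `outE[@label=...]/inV` step. It scans a flat edge list for readability, so it illustrates the logic of the traversal rather than index-free adjacency.

```python
# (out vertex, edge label, in vertex); inferred from the slide results.
EDGES = [
    ("emil", "follows", "alex"), ("emil", "follows", "johan"),
    ("alex", "follows", "tobias"), ("johan", "follows", "peter"),
    ("alex", "created", "B"), ("johan", "created", "A"),
    ("tobias", "created", "C"), ("tobias", "created", "E"),
    ("peter", "created", "D"), ("peter", "created", "E"),
]


def out(vertices, label):
    """One step: cross every outgoing edge with this label from each vertex."""
    return [
        head
        for vertex in vertices
        for tail, edge_label, head in EDGES
        if tail == vertex and edge_label == label
    ]


# ./outE[@label='follows']/inV
step1 = out(["emil"], "follows")
# ./outE[@label='follows']/inV/outE[@label='created']/inV
step2 = out(step1, "created")
# ./outE[@label='follows']/inV/outE[@label='follows']/inV/outE[@label='created']/inV
step3 = out(out(step1, "follows"), "created")
```

Webpage E shows up twice in the last result because two different paths reach it. Those repeated arrivals are what later traversals can count to rank results.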
Local Recommendation: ā€œIf you like webpage E, you
                  may also like...ā€
                                           name=tobias

                                follows

                                               2           created   name=C
                    name=alex

                                 created
                                                         created
                                                                              3
       name=emil   follows                  name=B                                         name=C

                                                                              name=E

                                            name=A
                                                                     1                     name=D
                   follows
                                                         created
                                 created                                      3
                   name=johan                                        name=D
                                               2           created

                                follows
     ./inE[@label='created']/outV/
         outE[@label='created']/inV[g:except($_)]


Assumption: if you like a webpage by X, you will like other webpages that X has created.
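
The traversal above can be sketched in plain Python. This is a minimal illustration, not Gremlin and not the author's implementation: the edge list, the helper names (`in_vertices`, `out_vertices`, `also_like`), and the toy data are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy data as (tail, label, head) triples.
# This is NOT the graph drawn on the slide.
EDGES = [
    ("johan", "created", "D"),
    ("johan", "created", "E"),
    ("peter", "created", "C"),
    ("peter", "created", "E"),
    ("alex", "created", "B"),
]


def in_vertices(edges, vertex, label):
    """Follow incoming edges with `label` back to their tail vertices."""
    return [tail for (tail, lbl, head) in edges if head == vertex and lbl == label]


def out_vertices(edges, vertex, label):
    """Follow outgoing edges with `label` forward to their head vertices."""
    return [head for (tail, lbl, head) in edges if tail == vertex and lbl == label]


def also_like(edges, page):
    """Count the other pages reachable via page <-created- creator -created-> other."""
    counts = Counter()
    for creator in in_vertices(edges, page, "created"):
        for other in out_vertices(edges, creator, "created"):
            if other != page:  # plays the role of g:except($_)
                counts[other] += 1
    return counts


recommendations = also_like(EDGES, "E")
```

Each count is the number of distinct paths that reach a page, so a page reached through several shared creators would rank higher.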
Local Recommendation: “If you like Johan, you may also like...”
[Figure: example graph of users (tobias, alex, emil, johan, peter) and webpages (A to E) joined by follows and created edges, with the traversal steps from johan numbered 1 to 3.]

     ./inE[@label='follows']/outV/
         outE[@label='follows']/inV


Assumption: if many people follow the same two people, then those two may be similar.
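
A rough Python sketch of this co-follow idea follows. The `FOLLOWS` pairs and the function name `similar_by_followers` are hypothetical. The sketch also drops the starting user from the result, which is an added assumption that the expression on the slide does not make.

```python
from collections import Counter

# Hypothetical (follower, followed) pairs; not the graph from the slide.
FOLLOWS = [
    ("emil", "johan"),
    ("emil", "peter"),
    ("alex", "johan"),
    ("alex", "peter"),
    ("alex", "tobias"),
    ("tobias", "peter"),
]


def similar_by_followers(follows, user):
    """Count how often each other user is followed by a follower of `user`."""
    followers = {follower for (follower, followed) in follows if followed == user}
    counts = Counter()
    for follower, followed in follows:
        # Step from each follower out along their own 'follows' edges.
        if follower in followers and followed != user:
            counts[followed] += 1
    return counts


similar = similar_by_followers(FOLLOWS, "johan")
```

Users who share more followers with the starting user receive higher counts, which is one simple way to read "may be similar".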
Assortment of Other Specific Graph Traversal Use Cases
• Missing friends: Find all the friends of person A. Then find all the
  friends of the friends of person A that are not also person A’s friends.[19]
       ./outE[@label='friend']/inV[g:assign('$x')]/
           outE[@label='friend']/inV[g:except($x)]

• Collaborative filtering: Find all the items that person A likes. Then
  find all the people who like those same items. Then find which items
  those people like that are not already liked by person A.[20]
       ./outE[@label='likes']/inV[g:assign('$x')]/
           inE[@label='likes']/outV/outE[@label='likes']/inV[g:except($x)]

  [19] This algorithm is based on the notion of trying to close “open triangles” in the friendship graph. If
many of person A’s friends are friends with person B, then it’s likely that A and B know each other.
  [20] This is the most concise representation of collaborative filtering. There are numerous modifications to
this general theme that can be taken advantage of to alter the recommendations.
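
The two use cases above can be approximated in plain Python. This is a sketch under stated assumptions: the `FRIENDS` and `LIKES` dictionaries, the function names, and the data are invented for illustration. `missing_friends` also excludes person A from the result, which goes slightly beyond what `g:except($x)` does.

```python
from collections import Counter

# Hypothetical symmetric friendship data and like data.
FRIENDS = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D", "E"},
    "D": {"B", "C"},
    "E": {"C"},
}
LIKES = {
    "A": {"x", "y"},
    "B": {"x", "z"},
    "C": {"y", "z", "w"},
    "D": {"q"},
}


def missing_friends(friends, person):
    """Friends of friends who are not already friends, counted per path."""
    direct = friends[person]  # corresponds to g:assign('$x')
    counts = Counter()
    for friend in direct:
        for candidate in friends.get(friend, ()):
            if candidate not in direct and candidate != person:
                counts[candidate] += 1
    return counts


def collaborative_filter(likes, person):
    """Items liked by people who share a liked item with `person`."""
    liked = likes[person]  # corresponds to g:assign('$x')
    counts = Counter()
    for item in liked:
        for other, items in likes.items():
            if item in items:
                # Keep only items the person has not already liked.
                for candidate in items - liked:
                    counts[candidate] += 1
    return counts


suggested_friends = missing_friends(FRIENDS, "A")
suggested_items = collaborative_filter(LIKES, "A")
```

Counting paths rather than returning a plain set gives a natural ranking: a candidate reached through more friends or more shared items scores higher.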
Assortment of Other Specific Graph Traversal Use Cases

• Question expert identification: Find all the tags associated with
  question A. For each of those tags, find all answers (for any question) that
  are tagged by those tags. For those answers, find who created those
  answers.[21]
       ./inE[@label='tag']/outV[@type='answer']/inE[@label='created']/outV


• Similar tags: Find all the things that tag A has been used as a tag for.
  For all those things, determine what else they have been tagged with.[22]
       ./inE[@label='tag']/outV/outE[@label='tag']/inV[g:except($_)]

  [21] If two resources share a “bundle” of resources in common, then they are similar.
  [22] This is the notion of “co-association” and can be generalized to find the similarity of two resources
based upon their co-association through a third resource (e.g. co-author, co-usage, co-download, etc.). The
third resource and the edge labels traversed determine the meaning of the association.
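
The expert-identification description can be followed step by step in Python. The sketch below mirrors the prose (question, then tags, then answers, then creators) rather than the exact Gremlin expression. The data model (`TAGGED`, `TYPES`, `CREATED`) and the function name `experts_for` are assumptions made for the example.

```python
from collections import Counter

# Hypothetical data: which tag is applied to which resource,
# the type of each resource, and who created each answer.
TAGGED = [
    ("graphs", "q1"),
    ("traversal", "q1"),
    ("graphs", "a7"),
    ("graphs", "a9"),
    ("traversal", "a9"),
    ("python", "a8"),
]
TYPES = {"q1": "question", "a7": "answer", "a8": "answer", "a9": "answer"}
CREATED = [("marko", "a7"), ("alex", "a8"), ("peter", "a9")]


def experts_for(question):
    """Rank users by how many tag-sharing answers they have created."""
    tags = {tag for (tag, resource) in TAGGED if resource == question}
    counts = Counter()
    for tag, resource in TAGGED:
        # Keep only answers that carry one of the question's tags.
        if tag in tags and TYPES.get(resource) == "answer":
            for user, answer in CREATED:
                if answer == resource:
                    counts[user] += 1
    return counts


experts = experts_for("q1")
```

An answer that shares two tags with the question contributes two paths to its creator, so the count acts as a crude relevance weight.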
Some Tips on Graph Traversals

• Ranking, scoring, recommendation, searching, etc. are all
  variations on the basic theme of defining abstract paths
  through a graph and realizing instances of those paths
  through traversal.

• The type of path taken determines the meaning
  (i.e. semantics) of the rank, score, recommendation, search,
  etc.

• Given the data locality aspect of graph databases, many of
  these traversals run in real-time (< 100ms).
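
One way to picture "abstract paths" and their "instances" is a small Python sketch in which a path is a list of (direction, label) steps and a traversal counts the concrete paths that match it. The triple format, the `step` and `traverse` helpers, and the data are hypothetical and are not part of Gremlin or of any library.

```python
from collections import Counter

# Hypothetical (tail, label, head) triples.
EDGES = [
    ("emil", "follows", "johan"),
    ("johan", "follows", "peter"),
    ("johan", "created", "A"),
    ("peter", "created", "B"),
    ("peter", "created", "C"),
]


def step(edges, frontier, direction, label):
    """Move every traverser in `frontier` across one edge with `label`."""
    moved = Counter()
    for tail, edge_label, head in edges:
        if edge_label != label:
            continue
        source, target = (tail, head) if direction == "out" else (head, tail)
        if source in frontier:
            moved[target] += frontier[source]
    return moved


def traverse(edges, start, path):
    """Realize an abstract path (list of steps) from `start`, counting instances."""
    frontier = Counter({start: 1})
    for direction, label in path:
        frontier = step(edges, frontier, direction, label)
    return frontier


# Same start vertex, different abstract paths, different meanings.
own_pages = traverse(EDGES, "johan", [("out", "created")])
followed_pages = traverse(EDGES, "johan", [("out", "follows"), ("out", "created")])
who_follows_creator = traverse(EDGES, "B", [("in", "created"), ("in", "follows")])
```

Only the path definition changes between the three calls, which is the sense in which the path type determines the semantics of the result.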
Property Graph Algorithms in General
• There is a general framework for mapping all the standard single-relational
  graph analysis algorithms over to the property graph domain.[23]
       Geodesics: shortest path, eccentricity, radius, diameter, closeness,
       betweenness, etc.[24]
       Spectral: random walks, page rank, spreading activation, priors, etc.[25]
       Assortativity: scalar or categorical.
       ... any graph algorithm in general.

• All of these can be represented in Gremlin.

  [23] Rodriguez, M.A., Shinavier, J., “Exposing Multi-Relational Networks to Single-Relational Network Analysis
Algorithms,” Journal of Informetrics, 4(1), pp. 29–41, 2009. [http://arxiv.org/abs/0806.2274]
  [24] Rodriguez, M.A., Watkins, J.H., “Grammar-Based Geodesics in Semantic Networks,” Knowledge-Based
Systems, in press, 2010.
  [25] Rodriguez, M.A., “Grammar-Based Random Walkers in Semantic Networks,” Knowledge-Based Systems,
21(7), pp. 727–739, 2008. [http://arxiv.org/abs/0803.4355]
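
As a very small illustration of reusing a single-relational algorithm on multi-relational data, the Python sketch below keeps only the edges with one label and then runs an ordinary breadth-first shortest-path search. This is only the simplest possible case and is not the framework from the cited papers. The data and the names `project` and `shortest_path_length` are assumptions.

```python
from collections import deque

# Hypothetical (tail, label, head) triples.
EDGES = [
    ("emil", "follows", "johan"),
    ("johan", "follows", "peter"),
    ("peter", "follows", "tobias"),
    ("emil", "created", "A"),
    ("tobias", "created", "A"),
]


def project(edges, label):
    """Derive a single-relational adjacency map from edges with one label."""
    adjacency = {}
    for tail, edge_label, head in edges:
        if edge_label == label:
            adjacency.setdefault(tail, []).append(head)
    return adjacency


def shortest_path_length(adjacency, source, target):
    """Breadth-first search; returns the hop count or None if unreachable."""
    queue = deque([(source, 0)])
    visited = {source}
    while queue:
        vertex, distance = queue.popleft()
        if vertex == target:
            return distance
        for neighbor in adjacency.get(vertex, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


follows_only = project(EDGES, "follows")
hops = shortest_path_length(follows_only, "emil", "tobias")
```

Choosing a different label, or a combination of labels, in the projection step changes which relationship the geodesic measures.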
Conclusion

• Graph databases are efficient with respect to local data
  analysis.

• Locality is defined by direct referent structures.

• Frame all solutions to problems as a traversal over local
  regions of the graph.
    This is the Graph Traversal Pattern.
Acknowledgements

• Pavel Yaskevich for advancing Gremlin. Pavel is currently writing a
  new compiler that will make Gremlin faster and more memory efficient.

• Peter Neubauer for his collaboration on many of the ideas discussed in
  this presentation.

• The rest of the Neo4j team (Emil, Johan, Mattias, Alex, Tobias, David,
  Anders (1 and 2)) for their comments.

• WindyCityDB organizers for their support.

• AT&T Interactive (Aaron, Rand, Charlie, and the rest of the Buzz
  team) for their support.
References to Related Work
• Rodriguez, M.A., Neubauer, P., “Constructions from Dots and Lines,” Bulletin
  of the American Society of Information Science and Technology, June 2010.
  [http://arxiv.org/abs/1006.2361]

• Rodriguez, M.A., Neubauer, P., “The Graph Traversal Pattern,” AT&Ti and
  NeoTechnology Technical Report, April 2010. [http://arxiv.org/abs/1004.1001]

• Neo4j: A Graph Database [http://neo4j.org]

• TinkerPop [http://tinkerpop.com]
     Blueprints: Data Models and their Implementations [http://blueprints.tinkerpop.com]
     Pipes: A Data Flow Framework using Process Graphs [http://pipes.tinkerpop.com]
     Gremlin: A Graph-Based Programming Language [http://gremlin.tinkerpop.com]
     Rexster: A Graph-Based Ranking Engine [http://rexster.tinkerpop.com]
     ∗ Wreckster: A Ruby API for Rexster [http://github.com/tenderlove/wreckster]

The Graph Traversal Programming Pattern

  • 1. The Graph Traversal Programming Pattern Marko A. Rodriguez Graph Systems Architect http://markorodriguez.com http://twitter.com/twarko WindyCityDB - Chicago, Illinois ā€“ June 26, 2010 June 25, 2010
  • 2. Abstract A graph is a structure composed of a set of vertices (i.e. nodes, dots) connected to one another by a set of edges (i.e. links, lines). The concept of a graph has been around since the late 19th century, however, only in recent decades has there been a strong resurgence in the development of both graph theories and applications. In applied computing, since the late 1960s, the interlinked table structure of the relational database has been the predominant information storage and retrieval paradigm. With the growth of graph/network-based data and the need to eļ¬ƒciently process such data, new data management systems have been developed. In contrast to the index-intensive, set-theoretic operations of relational databases, graph databases make use of index-free traversals. This presentation will discuss the graph traversal programming pattern and its application to problem-solving with graph databases.
  • 3. Outline ā€¢ Graph Structures ā€¢ Graph Databases ā€¢ Graph Traversals Artiļ¬cial Example Real-World Examples
  • 4. Outline ā€¢ Graph Structures ā€¢ Graph Databases ā€¢ Graph Traversals Artiļ¬cial Example Real-World Examples
  • 5. Dots and Lines There are dots and there are lines. Lets call them vertices and edges, respectively.
  • 6. Constructions from Dots and Lines Its possible to arrange the dots and lines into various conļ¬gurations. Lets call such conļ¬gurations graphs.
  • 7. Dots and Lines Make a Graph
  • 8. The Undirected Graph 1. Vertices ā€¢ All vertices denote the same type of object. 2. Edges ā€¢ All edges denote the same type of relationship. ā€¢ All edges denote a symmetric relationship.
  • 9. Denoting an Undirected Structure in the Real World Collaborator graph is an undirected graph. Road graph is an undirected graph.
  • 10. A Line with a Triangle Dots and lines are boring. Lets add a triangle to one side of each line. However, lets call a triangle-tipped line a directed edge.
  • 11. The Directed Graph 1. Vertices ā€¢ All vertices denote the same type of object. 2. Edges ā€¢ All edges denote the same type of relationship. ā€¢ All edges denote an asymmetric relationship.
  • 12. Denoting a Directed Structure in the Real World Twitter follow graph is a directed graph. Web href-citation graph is a directed graph.
  • 13. Single Relational Structures ā€¢ Without a way to demarcate edges, all edges have the same meaning/type. Such structures are called single-relational graphs. ā€¢ Single-relational graphs are perhaps the most common graph type in graph theory and network science.
  • 14. How Do You Model a World with Multiple Structures? [Figure: a single graph combining several kinds of relationships. Labels visible in the figure: I-25, I-40, lives_in, is, follows, created, cites.]
  • 15. The Limitations of the Single-Relational Graph • A single-relational graph is only able to express a single type of vertex (e.g. person, city, user, webpage).[1] • A single-relational graph is only able to express a single type of edge (e.g. collaborator, road, follows, citation).[2] • For modelers, these are very limiting graph types.
    [1] This is not completely true. All n-partite single-relational graphs allow for the division of the vertex set into n subsets, where $V = \bigcup_{i=1}^{n} A_i$ with $A_i \cap A_j = \emptyset$ for $i \neq j$. Thus, it is possible to implicitly type the vertices.
    [2] This is not completely true. There exists an injective, information-preserving function that maps any multi-relational graph to a single-relational graph, where edge types are denoted by topological structures. Rodriguez, M.A., "Mapping Semantic Networks to Undirected Networks," International Journal of Applied Mathematics and Computer Sciences, 5(1), pp. 39–42, 2009. [http://arxiv.org/abs/0804.0277]
  • 16. The Gains of the Multi-Relational Graph • A multi-relational graph allows for the explicit typing of edges (e.g. "follows," "cites," etc.). • By labeling edges, edges can have different meanings and vertices can have different types. follows : user → user; created : user → webpage; cites : webpage → webpage; ...
  • 17. Increasing Expressivity with Multi-Relational Graphs [Figure: a multi-relational graph whose edges are labeled cites, created, and follows.]
  • 18. The Flexibility of the Property Graph • A property graph extends a multi-relational graph by allowing for both vertices and edges to maintain a key/value property map. • These properties are useful for expressing non-relational data (i.e. data not needing to be graphed). • This allows for the further refinement of the meaning of an edge. Peter Neubauer created the Neo4j webpage on 2007/10. [Figure: a vertex with name=peterneubauer, a created edge with date=2007/10, and a vertex with name=neo4j and views=56781.]
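As a rough illustration of the property graph idea, the sketch below models vertices and edges as Python objects that each carry a key/value property map. The class names `Vertex` and `Edge`, the field names, and the helper `describe` are assumptions made for this example; they are not the API of Neo4j, Blueprints, or Gremlin.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)  # key/value property map


@dataclass
class Edge:
    out_v: Vertex  # tail of the directed edge
    label: str     # edge type, as in a multi-relational graph
    in_v: Vertex   # head of the directed edge
    properties: dict = field(default_factory=dict)


# "Peter Neubauer created the Neo4j webpage on 2007/10."
peter = Vertex(1, {"name": "peterneubauer"})
neo4j = Vertex(2, {"name": "neo4j", "views": 56781})
created = Edge(peter, "created", neo4j, {"date": "2007/10"})


def describe(edge: Edge) -> str:
    """Render an edge and its date property as a sentence-like string."""
    return (f"{edge.out_v.properties['name']} {edge.label} "
            f"{edge.in_v.properties['name']} on {edge.properties['date']}")


print(describe(created))
```

The edge property `date` refines the meaning of the `created` relationship without adding any new vertices or edges.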
  • 19. Increasing Expressivity with Property Graphs [Figure: the multi-relational graph (edges labeled cites, created, follows) annotated with properties. Properties visible in the figure: name=neo4j, views=56781, page_rank=0.023, name=tenderlove, gender=male, date=2007/10, name=peterneubauer, name=graph_blog, views=1000, name=ahzf, name=twarko, age=30.]
  • 20. Property Graph Instance Schema/Ontology
      user vertex: name=<string>, age=<integer>, gender=<string>
      webpage vertex: name=<string>, views=<integer>, page_rank=<double>
      created edge (user → webpage): date=<string>
      follows edge (user → user)
      cites edge (webpage → webpage)
    No standard convention, but in general, specify the types of vertices, edges, and the properties they may contain. Look into the world of RDFS and OWL for more rigorous, expressive specifications of graph-based schemas.
  • 21. Property Graphs Can Model Other Graph Types [Figure: a diagram relating the property graph to the weighted graph, labeled graph, semantic graph, directed graph, RDF graph, multi-graph, simple graph, and undirected graph. Arrows are labeled with the operation needed to move between types: add weight attribute, remove attributes, remove edge labels, make labels URIs, remove directionality, remove loops, directionality, and multiple edges, or no op.] NOTE: Given that a property graph is a binary edge graph, it is difficult to model an n-ary edge graph (i.e. a hypergraph).
  • 22. Outline • Graph Structures • Graph Databases • Graph Traversals: Artificial Example, Real-World Examples
  • 23. Persisting a Graph Data Structure • A graph is a relatively simple data structure. It can be seen as the most fundamental data structure: something is related to something else. • Almost every database can model a graph.[3]
    [3] For the sake of simplicity, the following examples are with respect to a directed, single-relational graph. However, note that property graphs can be easily modeled by such databases as well.
  • 24. Representing a Graph in a Relational Database
      outV | inV
      -----+----
      A    | B
      A    | C
      C    | D
      D    | A
    [Figure: the corresponding directed graph over vertices A, B, C, and D.]
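The edge table above can be tried out with Python's built-in `sqlite3` module. This is only a sketch: SQLite stands in for a relational database in general, and the table name `edges` and the helper names `neighbors` and `two_steps` are assumptions for this example. One step is a filtered SELECT; two steps already require a self-JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (outV TEXT, inV TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("A", "B"), ("A", "C"), ("C", "D"), ("D", "A")],
)


def neighbors(v):
    """Vertices one step away from v (one lookup in the edge table)."""
    rows = conn.execute(
        "SELECT inV FROM edges WHERE outV = ? ORDER BY inV", (v,)
    )
    return [r[0] for r in rows]


def two_steps(v):
    """Vertices two steps away from v (the edge table joined with itself)."""
    rows = conn.execute(
        "SELECT e2.inV FROM edges AS e1 "
        "JOIN edges AS e2 ON e1.inV = e2.outV "
        "WHERE e1.outV = ? ORDER BY e2.inV",
        (v,),
    )
    return [r[0] for r in rows]


print(neighbors("A"))
print(two_steps("A"))
```

Each additional step adds another JOIN against the whole edge table, which is the cost the later experiment measures.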
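  • 25. Representing a Graph in a JSON Database
      {
        "A" : { "out" : ["B", "C"], "in" : ["D"] },
        "B" : { "in" : ["A"] },
        "C" : { "out" : ["D"], "in" : ["A"] },
        "D" : { "out" : ["A"], "in" : ["C"] }
      }
    [Figure: the corresponding directed graph over vertices A, B, C, and D.]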
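A short sketch of how the JSON document could be read and navigated, using Python's standard `json` module. The document is written here as strict JSON (quoted keys, commas between entries), and the helper name `out_neighbors` is an assumption for this example.

```python
import json

# The slide's graph as a strict JSON document: one entry per vertex,
# listing outgoing ("out") and incoming ("in") adjacent vertices.
DOCUMENT = """
{
  "A": {"out": ["B", "C"], "in": ["D"]},
  "B": {"in": ["A"]},
  "C": {"out": ["D"], "in": ["A"]},
  "D": {"out": ["A"], "in": ["C"]}
}
"""

graph = json.loads(DOCUMENT)


def out_neighbors(graph, v):
    """Vertices reachable from v in one step; empty if v has no out edges."""
    return graph[v].get("out", [])


print(out_neighbors(graph, "A"))
```

Here every step still goes through a lookup of the vertex key in the document, so the graph is implicit in the key/value structure.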
  • 26. Representing a Graph in an XML Database
      <graphml>
        <graph>
          <node id="A" />
          <node id="B" />
          <node id="C" />
          <node id="D" />
          <edge source="A" target="B" />
          <edge source="A" target="C" />
          <edge source="C" target="D" />
          <edge source="D" target="A" />
        </graph>
      </graphml>
    [Figure: the corresponding directed graph over vertices A, B, C, and D.]
  • 27. Defining a Graph Database "If any database can represent a graph, then what is a graph database?"
  • 28. Defining a Graph Database A graph database is any storage system that provides index-free adjacency.[4][5]
    [4] There is no "official" definition of what makes a database a graph database. The one provided is my definition. However, hopefully the following argument will convince you that this is a necessary definition.
    [5] There is adjacency between the elements of an index, but if the index is not the primary data structure of concern (to the developer), then there is indirect/implicit adjacency, not direct/explicit adjacency. A graph database exposes the graph as an explicit data structure (not an implicit data structure).
  • 29. Defining a Graph Database • Every element (i.e. vertex or edge) has a direct pointer to its adjacent elements. • No O(log2(n)) index lookup is required to determine which vertex is adjacent to which other vertex. • If the graph is connected, the graph as a whole is a single atomic data structure.
  • 30. Defining a Graph Database by Example [Figure: a toy graph with vertices A, B, C, D, and E, and a gremlin (the stuntman) that will traverse it.]
  • 31. Graph Databases and Index-Free Adjacency [Figure: the toy graph with the gremlin on vertex A.] • Our gremlin is at vertex A. • In a graph database, vertex A has direct references to its adjacent vertices. • Moving from A to B and C has a constant time cost with respect to the size of the graph: the cost depends only on the number of edges emanating from vertex A (local).
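A minimal in-memory sketch of index-free adjacency, assuming a hypothetical `Vertex` class that holds direct object references to its neighbors. Only the edges from A to B and C are taken from the slide; nothing here reflects how Neo4j lays out data on disk.

```python
class Vertex:
    def __init__(self, name):
        self.name = name
        self.out = []  # direct references to adjacent Vertex objects


a, b, c = Vertex("A"), Vertex("B"), Vertex("C")
a.out.extend([b, c])


def step(vertex):
    """Move to the adjacent vertices by following references held by `vertex`.

    No global lookup structure is consulted, so the work done depends only
    on how many edges leave `vertex`, not on how large the graph is.
    """
    return [neighbor.name for neighbor in vertex.out]


print(step(a))
```

Adding a million unrelated vertices to the program would not change what `step(a)` has to touch.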
  • 32. Graph Databases and Index-Free Adjacency [Figure: the toy graph over vertices A, B, C, D, and E, labeled "The Graph (explicit)".]
  • 34. Non-Graph Databases and Index-Based Adjacency [Figure: the toy graph next to an index that maps each vertex to its adjacent vertices (for example, A to B,C).] • Our gremlin is at vertex A. • In a non-graph database, the gremlin needs to look at an index to determine what is adjacent to A. • Moving to B and C has a log2(n) time cost: it depends on the total number of vertices and edges in the database (global).
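For contrast, the sketch below looks up adjacency through a sorted global index using binary search from Python's `bisect` module. The edge list is hypothetical apart from A's edges to B and C, and a sorted list is only a stand-in for a database index such as a B-tree.

```python
from bisect import bisect_left, bisect_right

# A global index: every edge in the database as a sorted (source, target) pair.
INDEX = sorted([("A", "B"), ("A", "C"), ("C", "D"), ("C", "E"), ("D", "E")])
KEYS = [source for source, _ in INDEX]  # sorted search keys


def adjacent(v):
    """Binary-search the whole index for the run of entries whose source is v.

    The two searches cost O(log2(n)) in the total number of edges n,
    no matter how few edges actually leave v.
    """
    lo = bisect_left(KEYS, v)
    hi = bisect_right(KEYS, v)
    return [target for _, target in INDEX[lo:hi]]


print(adjacent("A"))
```

As more edges are added anywhere in the database, the search for A's neighbors gets slower, even though A's neighborhood has not changed.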
  • 35. Non-Graph Databases and Index-Based Adjacency [Figure: the index, labeled "The Index (explicit)", next to the toy graph, labeled "The Graph (implicit)".]
  • 37. Index-Free Adjacency • While any database can implicitly represent a graph, only a graph database makes the graph structure explicit. • In a graph database, each vertex serves as a "mini index" of its adjacent elements.[6] • Thus, as the graph grows in size, the cost of a local step remains the same.[7]
    [6] Each vertex can be interpreted as a "parent node" in an index with its children being its adjacent elements. In this sense, traversing a graph is analogous in many ways to traversing an index, albeit the graph is not an acyclic connected graph (tree).
    [7] A graph, in many ways, is like a distributed index.
  • 38. Graph Databases Make Use of Indices [Figure: an index of vertices (by id) pointing into the graph over vertices A, B, C, D, and E.] • There is more to the graph than the explicit graph structure. • Indices index the vertices by their properties (e.g. ids).
  • 39. Graph Databases and Endogenous Indices • Many indices are trees.[8] • A tree is a type of constrained graph.[9] • You can represent a tree with a graph.[10]
    [8] Even an "index" that is simply an O(n) container can be represented as a graph (e.g. linked list).
    [9] A tree is an acyclic connected graph with each vertex having at most one parent.
    [10] This follows as a consequence of a tree being a graph.
  • 40. Graph Databases and Endogenous Indices • Graph databases allow you to explicitly model indices endogenous to your domain model. Your indices and domain model are one atomic entity: a graph.[11] • This has benefits in designing special-purpose index structures for your data. Think about all the numerous types of indices in the geo-spatial community.[12] Think about all the indices that you have yet to think about.
    [11] Originally, Neo4j used itself as its own indexing system before moving to Lucene.
    [12] Craig Taverner explores the use of graph databases in GIS-based applications.
  • 41. Graph Databases and Endogenous Indices [Figure: the property graph from the earlier slides with three indices attached to it: a name property index, a views property index, and a gender property index.]
  • 42. Graph Databases and Endogenous Indices [Figure: the property graph together with its name, views, and gender property indices, collectively labeled "The Graph Dataset".]
  • 43. Outline • Graph Structures • Graph Databases • Graph Traversals: Artificial Example, Real-World Examples
  • 44. Graph Traversals as the Foundation • Question: Once I have my data represented as a graph, what can I do with it? • Answer: You can traverse over the graph to solve problems.
  • 45. Outline • Graph Structures • Graph Databases • Graph Traversals: Artificial Example, Real-World Examples
  • 46. Graph Database vs. Relational Database • While any database can represent a graph, it takes time to make what is implicit explicit. • The graph database represents an explicit graph. • The experiment that follows demonstrates the problem with using lots of table JOINs to accomplish the effect of a graph traversal.[13]
    [13] Though not presented in this lecture, similar results were seen with JSON document databases.
  • 47. Neo4j vs. MySQL – Generating a Large Graph • Generated a 1 million vertex/4 million edge graph with "natural statistics."[14] • Loaded the graph into both Neo4j and MySQL in order to empirically evaluate the effect of index-free adjacency.
    [14] What is diagrammed is a small subset of this 1 million vertex graph.
  • 48. Neo4j vs. MySQL – The Experiment • For each run of the experiment, a traverser (gremlin) is placed on a single vertex. • For each step, the traverser moves to its adjacent vertices. Neo4j (graph database): the adjacent vertices are provided by the current vertex.[15] MySQL (relational database): the adjacent vertices are provided by a table JOIN. • For the experiment, this process goes up to 5 steps.
    [15] You can think of a graph traversal, in a graph database, as a local neighborhood JOIN.
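The traversal procedure described above can be sketched in a few lines of Python. This is not the benchmark code from the experiment: the function name `traverse`, the dictionary-based adjacency, and the four-vertex toy graph are assumptions chosen to show the shape of the computation.

```python
def traverse(adjacency, root, steps):
    """Place a traverser on `root` and move it `steps` times to adjacent vertices.

    A vertex appears once per distinct path that reaches it, so the
    frontier can grow quickly on dense graphs.
    """
    frontier = [root]
    for _ in range(steps):
        frontier = [w for v in frontier for w in adjacency.get(v, [])]
    return frontier


# Toy directed graph: A -> B, A -> C, C -> D, D -> A.
adjacency = {"A": ["B", "C"], "C": ["D"], "D": ["A"]}

for n in range(1, 5):
    print(n, traverse(adjacency, "A", n))
```

In the relational setting, each pass of the loop corresponds to one more JOIN against the edge table; in the graph setting, each pass only reads the neighbors held by the current frontier.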
  • 49. Neo4j vs. MySQL – The Experiment (Zoom-In Subset)
  • 50. Neo4j vs. MySQL – The Experiment (Step 1)
  • 51. Neo4j vs. MySQL – The Experiment (Step 2)
  • 52. Neo4j vs. MySQL – The Experiment (Step 3)
  • 53. Neo4j vs. MySQL – The Experiment (Step 4)
  • 54. Neo4j vs. MySQL – The Experiment (Step 5)
  • 55. Neo4j vs. MySQL – The Results [Figure: total running time (ms) for traversals of length n, MySQL vs. Neo4j. x-axis: traversal length (steps 1 to 4); y-axis: time (ms), 0 to 100000. Averaged over the 250 most dense vertices as root of the traversal. Annotations on the plot read 2.3x, 2.6x, 4.5x, and 1.9x faster.] • At step 5, Neo4j completed the traversal in 14 minutes. • At step 5, MySQL was still running after 2 hours (process stopped).
  • 56. Neo4j vs. MySQL – More Information For more information on this experiment, please visit http://markorodriguez.com/Blarko/Entries/2010/3/29_MySQL_vs._Neo4j_on_a_Large-Scale_Graph_Traversal.html
  • 57. Why Use a Graph Database? – Data Locality • If the solution to your problem can be represented as a local process within a larger global data structure, then a graph database may be the optimal solution for your problem. • If the solution to your problem can be represented as being with respect to a set of root elements, then a graph database may be the optimal solution to your problem. • If the solution to your problem does not require a global analysis of your data, then a graph database may be the optimal solution to your problem.
  • 58. Why Use a Graph Database? – Data Locality
  • 59. Outline • Graph Structures • Graph Databases • Graph Traversals: Artificial Example, Real-World Examples
  • 60. Some Graph Traversal Use Cases • Local searches: "What is in the neighborhood around A?"[16] • Local recommendations: "Given A, what should A include in their neighborhood?"[17] • Local ranks: "Given A, how would you rank B relative to A?"[18]
    [16] A can be an individual vertex or a set of vertices. This set is known as the root vertex set.
    [17] Recommendation can be seen as trying to increase the connectivity of the graph by recommending vertices (e.g. items) for another vertex (e.g. person) to extend an edge to (e.g. purchased).
    [18] In this presentation, there will be no examples provided for this use case. However, note that searching, ranking, and recommendation are presented in the WindyCityDB OpenLab Graph Database Tutorial. Other terms for local rank are "rank with priors" or "relative rank."
  • 61. Graph Traversals with Gremlin Programming Language [Logo: Gremlin, G = (V, E)] http://gremlin.tinkerpop.com The examples to follow are as basic as possible to get the idea across. Note that numerous variations on the themes presented can be created. Such variations are driven by the richness of the underlying graph data set and the desired speed of evaluation.
  • 62. Graph Traversals with Gremlin Programming Language [Figure: a small property graph with vertices 1, 3, and 4, edges labeled created and knows, and the properties name = peter, age = 37 on vertex 4.]
  • 63. Graph Traversals with Gremlin Programming Language [Figure: the same small graph annotated with the terms used in traversals: vertex 3 in edges, vertex 1 out edges, edge label, edge out vertex, edge in vertex, vertex 4 properties, and vertex 4 id.]
  • 64. Graph Traversal in Property Graphs [Figure: a property graph of people (name=tobias, name=alex, name=emil, name=johan, name=peter) and webpages (name=A, name=B, name=C, name=D, name=E) connected by follows and created edges.] Red vertices are people and blue vertices are webpages.
  • 65. Local Search: "Who are the followers of Emil Eifrem?" ./outE[@label='follows']/inV [Figure: the traversal starts at name=emil (step 1) and moves along follows edges to name=alex and name=johan (step 2).]
  • 66. Local Search: "What webpages did Emil's followers create?" ./outE[@label='follows']/inV/outE[@label='created']/inV [Figure: the traversal continues from the followers (step 2) along created edges to the webpages name=B and name=A (step 3).]
  • 67. Local Search: "What webpages did Emil's followers' followers create?" ./outE[@label='follows']/inV/outE[@label='follows']/inV/outE[@label='created']/inV [Figure: the traversal takes two follows steps (steps 2 and 3) and then a created step; the highlighted step-4 results are webpages, including name=C, name=D, and name=E.]
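The three local searches can be mimicked in plain Python by chaining one helper that plays the role of the `outE[@label=...]/inV` step. The edge list below is hypothetical (the full graph from the figure is not recoverable from the text), and the helper scans a flat list for brevity, so it illustrates the path expressions rather than index-free adjacency.

```python
# Hypothetical edges as (out vertex, label, in vertex) triples.
EDGES = [
    ("emil", "follows", "alex"),
    ("emil", "follows", "johan"),
    ("alex", "follows", "tobias"),
    ("johan", "follows", "peter"),
    ("alex", "created", "B"),
    ("johan", "created", "A"),
    ("tobias", "created", "C"),
    ("peter", "created", "D"),
]


def out(vertices, label):
    """Emulate outE[@label=label]/inV for every vertex in the current set."""
    return [head for v in vertices
            for (tail, lab, head) in EDGES
            if tail == v and lab == label]


# ./outE[@label='follows']/inV
step2 = out(["emil"], "follows")
# ./outE[@label='follows']/inV/outE[@label='created']/inV
pages = out(step2, "created")
# ./outE[@label='follows']/inV/outE[@label='follows']/inV/outE[@label='created']/inV
pages_two_hops = out(out(step2, "follows"), "created")

print(step2, pages, pages_two_hops)
```

Each additional path segment in the Gremlin expression corresponds to one more call to `out`, applied to the result of the previous step.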
  • 68. Local Recommendation: "If you like webpage E, you may also like..." ./inE[@label='created']/outV/outE[@label='created']/inV[g:except($_)] [Figure: the traversal starts at webpage name=E (step 1), moves back along created edges to its creators (step 2), and then forward along created edges to the webpages name=C and name=D (step 3).] Assumption: if you like a webpage by X, you will like others that they have created.
  • 69. Local Recommendation: "If you like Johan, you may also like..." ./inE[@label='follows']/outV/outE[@label='follows']/inV [Figure: the traversal starts at name=johan (step 1), moves back along follows edges (step 2), and then forward along follows edges to name=alex (step 3).] Assumption: if many people follow the same two people, then those two may be similar.
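A rough Python rendering of the "if you like webpage E" recommendation: step backward over `created` edges to the creators, step forward over `created` edges to everything they made, and drop the starting page. The edge data and the helper names `in_`, `out`, and `also_like` are assumptions for this sketch.

```python
# Hypothetical created edges as (out vertex, label, in vertex) triples.
EDGES = [
    ("peter", "created", "D"),
    ("peter", "created", "E"),
    ("johan", "created", "E"),
    ("johan", "created", "A"),
]


def out(vertices, label):
    """Emulate outE[@label=label]/inV."""
    return [head for v in vertices
            for (tail, lab, head) in EDGES
            if tail == v and lab == label]


def in_(vertices, label):
    """Emulate inE[@label=label]/outV."""
    return [tail for v in vertices
            for (tail, lab, head) in EDGES
            if head == v and lab == label]


def also_like(page):
    """Other pages made by the creators of `page`, excluding `page` itself."""
    creators = in_([page], "created")
    return [p for p in out(creators, "created") if p != page]


print(also_like("E"))
```

The final filter plays the role of `g:except($_)`, which keeps the starting vertex out of its own recommendations.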
  • 70. Assortment of Other Specific Graph Traversal Use Cases • Missing friends: Find all the friends of person A. Then find all the friends of the friends of person A that are not also person A's friends.[19] ./outE[@label='friend']/inV[g:assign('$x')]/outE[@label='friend']/inV[g:except($x)] • Collaborative filtering: Find all the items that person A likes. Then find all the people that like those same items. Then find which items those people like that are not already liked by person A.[20] ./outE[@label='likes']/inV[g:assign('$x')]/inE[@label='likes']/outV/outE[@label='likes']/inV[g:except($x)]
    [19] This algorithm is based on the notion of trying to close "open triangles" in the friendship graph. If many of person A's friends are friends with person B, then it's likely that A and B know each other.
    [20] This is the most concise representation of collaborative filtering. There are numerous modifications to this general theme that can be taken advantage of to alter the recommendations.
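The collaborative filtering traversal can be sketched in Python by counting how many paths reach each candidate item. The `LIKES` data and the function name `collaborative_filter` are hypothetical, and scoring by path count is one possible way to rank the results, not something the Gremlin expression itself specifies.

```python
from collections import Counter

# Hypothetical "likes" edges: person -> set of liked items.
LIKES = {
    "A": {"item1", "item2"},
    "B": {"item1", "item2", "item3"},
    "C": {"item2", "item3", "item4"},
}


def collaborative_filter(person):
    """Score items not yet liked by `person` by the number of
    person -> shared item -> other person -> new item paths reaching them."""
    mine = LIKES[person]
    scores = Counter()
    for other, items in LIKES.items():
        if other == person:
            continue
        shared = len(mine & items)  # ways to reach `other` via a shared item
        for item in items - mine:   # the g:except($x) filter
            scores[item] += shared
    return scores


print(collaborative_filter("A").most_common())
```

The same pattern covers the missing-friends traversal by replacing the likes relation with a friend relation and excluding existing friends instead of already-liked items.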
  • 71. Assortment of Other Specific Graph Traversal Use Cases • Question expert identification: Find all the tags associated with question A. For all those tags, find all answers (for any questions) that are tagged by those tags. For those answers, find who created those answers.[21] ./inE[@label='tag']/outV[@type='answer']/inE[@label='created']/outV • Similar tags: Find all the things that tag A has been used as a tag for. For all those things, determine what else they have been tagged with.[22] ./inE[@label='tag']/outV/outE[@label='tag']/inV[g:except($_)]
    [21] If two resources share a "bundle" of resources in common, then they are similar.
    [22] This is the notion of "co-association" and can be generalized to find the similarity of two resources based upon their co-association through a third resource (e.g. co-author, co-usage, co-download, etc.). The third resource and the edge labels traversed determine the meaning of the association.
  • 72. Some Tips on Graph Traversals • Ranking, scoring, recommendation, searching, etc. are all variations on the basic theme of defining abstract paths through a graph and realizing instances of those paths through traversal. • The type of path taken determines the meaning (i.e. semantics) of the rank, score, recommendation, search, etc. • Given the data locality aspect of graph databases, many of these traversals run in real-time (< 100ms).
  • 73. Property Graph Algorithms in General • There is a general framework for mapping all the standard single-relational graph analysis algorithms over to the property graph domain.[23] Geodesics: shortest path, eccentricity, radius, diameter, closeness, betweenness, etc.[24] Spectral: random walks, page rank, spreading activation, priors, etc.[25] Assortativity: scalar or categorical. ... any graph algorithm in general. • All able to be represented in Gremlin.
    [23] Rodriguez, M.A., Shinavier, J., "Exposing Multi-Relational Networks to Single-Relational Network Analysis Algorithms," Journal of Informetrics, 4(1), pp. 29–41, 2009. [http://arxiv.org/abs/0806.2274]
    [24] Rodriguez, M.A., Watkins, J.H., "Grammar-Based Geodesics in Semantic Networks," Knowledge-Based Systems, in press, 2010.
    [25] Rodriguez, M.A., "Grammar-Based Random Walkers in Semantic Networks," Knowledge-Based Systems, 21(7), pp. 727–739, 2008. [http://arxiv.org/abs/0803.4355]
  • 74. Conclusion • Graph databases are efficient with respect to local data analysis. • Locality is defined by direct referent structures. • Frame all solutions to problems as a traversal over local regions of the graph. This is the Graph Traversal Pattern.
  • 75. Acknowledgements • Pavel Yaskevich for advancing Gremlin. Pavel is currently writing a new compiler that will make Gremlin faster and more memory efficient. • Peter Neubauer for his collaboration on many of the ideas discussed in this presentation. • The rest of the Neo4j team (Emil, Johan, Mattias, Alex, Tobias, David, Anders (1 and 2)) for their comments. • WindyCityDB organizers for their support. • AT&T Interactive (Aaron, Rand, Charlie, and the rest of the Buzz team) for their support.
  • 76. References to Related Work • Rodriguez, M.A., Neubauer, P., "Constructions from Dots and Lines," Bulletin of the American Society of Information Science and Technology, June 2010. [http://arxiv.org/abs/1006.2361] • Rodriguez, M.A., Neubauer, P., "The Graph Traversal Pattern," AT&Ti and NeoTechnology Technical Report, April 2010. [http://arxiv.org/abs/1004.1001] • Neo4j: A Graph Database [http://neo4j.org] • TinkerPop [http://tinkerpop.com] Blueprints: Data Models and their Implementations [http://blueprints.tinkerpop.com] Pipes: A Data Flow Framework using Process Graphs [http://pipes.tinkerpop.com] Gremlin: A Graph-Based Programming Language [http://gremlin.tinkerpop.com] Rexster: A Graph-Based Ranking Engine [http://rexster.tinkerpop.com] ∗ Wreckster: A Ruby API for Rexster [http://github.com/tenderlove/wreckster]