No SQL?




          Image credit: http://browsertoolkit.com/fault-tolerance.png
No SQL?




          Image credit: http://browsertoolkit.com/fault-tolerance.png
No SQL?




          Image credit: http://browsertoolkit.com/fault-tolerance.png
Neo4j       and a
                 NOSQL overview



                                  #neo4j
Emil Eifrem                       @emileifrem
CEO, Neo Technology               emil@neotechnology.com
What's the plan?

  Why now? – Four trends

  NOSQL overview

  Graph databases && Neo4j

  Conclusions
Trend 1:
data set size




40
2007            Source: IDC 2007
988

Trend 1:
data set size




40
2007            2010   Source: IDC 2007
Trend 2: connectedness
                                                                                           Giant
                                                                                          Global
                                                                                          Graph
                                                                                          (GGG)
 Information connectivity




                                                                             Ontologies


                                                                      RDF

                                                                                  Folksonomies
                                                                  Tagging


                                                      Wikis             User-
                                                                      generated
                                                                       content
                                                              Blogs


                                                     RSS


                                         Hypertext


                               Text
                            documents     web 1.0             web 2.0             “web 3.0”
                                        1990         2000                   2010                   2020
Trend 3: semi-structure
  Individualization of content!
     In the salary lists of the 1970s, all elements had exactly
     one job
     In the salary lists of the 2000s, we need 5 job columns!
     Or 8? Or 15?

  Trend accelerated by the decentralization of content
  generation that is the hallmark of the age of participation
  (“web 2.0”)
Aside: RDBMS performance
                                                            Relational database
               Salary List
 Performance




                             Majority of
                             Webapps



                                           Social network

                                                                 Semantic Trading




                                                  }           custom


                                                Data complexity
Trend 4: architecture


         1990s: Database as integration hub


          Application   Application   Application




                           DB
Trend 4: architecture


        2000s:   (Slowly towards...)   Decoupled services
                    with their own backend

            Service             Service        Service




             DB                   DB            DB
Why NOSQL 2009?

 Trend 1: Size.

 Trend 2: Connectivity.

 Trend 3: Semi-structure.

 Trend 4: Architecture.
NOSQL
overview
First off: the name

  NoSQL is NOT “Never SQL”

  NoSQL is NOT “No To SQL”
NOSQL
      is simply


N ot O nly S Q L !
Four (emerging) NOSQL categories
 Key-value stores
   Based on Amazon's Dynamo paper
   Data model: (global) collection of K-V pairs
   Example: Dynomite, Voldemort, Tokyo*

 ColumnFamily / BigTable clones
   Based on Google's BigTable paper
   Data model: big table, column families
   Example: HBase, Hypertable, Cassandra
Four (emerging) NOSQL categories
 Document databases
   Inspired by Lotus Notes
   Data model: collections of K-V collections
   Example: CouchDB, MongoDB, Riak

 Graph databases
   Inspired by Euler & graph theory
   Data model: nodes, rels, K-V on both
   Example: AllegroGraph, Sones, Neo4j
NOSQL data models
            Key-value stores
Data size




                          Bigtable clones


                                            Document
                                            databases


                                                           Graph databases




                                                        Data complexity
NOSQL data models
               Key-value stores
   Data size




                             Bigtable clones


                                               Document
                                               databases


                                                              Graph databases

                                                                          (This is still billio ns of
 90%                                                                      nodes & relationships)
  of
 use
cases




                                                           Data complexity
Graph DBs
& Neo4j intro
The Graph DB model: representation
 Core abstractions:
                                              name = “Emil”

   Nodes
                                              age = 29
                                              sex = “yes”


   Relationships between nodes
   Properties on both                     1                         2



                         type = KNOWS
                         time = 4 years                       3

                                                                  type = car
                                                                  vendor = “SAAB”
                                                                  model = “95 Aero”
Example: The Matrix
                                                                           name = “The Architect”
                             name = “Morpheus”
                             rank = “Captain”
name = “Thomas Anderson”
                             occupation = “Total badass”                                            42
age = 29
                                                 disclosure = public


                 KNOWS                            KNOWS                                              CODED_BY
                                                                                KNO
     1                                  7                              3           W   S

                 KN
                                                                                                13
                                             S

                    OW                                   name = “Cypher”
                                        KNOW



                         S                               last name = “Reagan”
                                                                                             name = “Agent Smith”
                                                                       disclosure = secret   version = 1.0b
         age = 3 days                                                  age = 6 months        language = C++
                                    2

                             name = “Trinity”
Code (1): Building a node space
GraphDatabaseService graphDb = ... // Get factory


// Create Thomas 'Neo' Anderson
Node mrAnderson = graphDb.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );

// Create Morpheus
Node morpheus = graphDb.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );

// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly
Code (1): Building a node space
GraphDatabaseService graphDb = ... // Get factory
Transaction tx = graphDb.beginTx();

// Create Thomas 'Neo' Anderson
Node mrAnderson = graphDb.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );

// Create Morpheus
Node morpheus = graphDb.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );

// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly

tx.commit();
Code (1b): Def ning RelationshipTypes
             i
// In package org.neo4j.graphdb
public interface RelationshipType
{
   String name();
}

// In package org.yourdomain.yourapp
// Example on how to roll dynamic RelationshipTypes
class MyDynamicRelType implements RelationshipType
{
   private final String name;
   MyDynamicRelType( String name ){ this.name = name; }
   public String name() { return this.name; }
}

// Example on how to kick it, static-RelationshipType-like
enum MyStaticRelTypes implements RelationshipType
{
   KNOWS,
   WORKS_FOR,
}
Whiteboard friendly



                                 owns
                      Björn                  Big Car
                        build             drives


                                DayCare
The Graph DB model: traversal
 Traverser framework for
 high-performance traversing                    name = “Emil”
                                                age = 31

 across the node space                          sex = “yes”




                                            1                         2



                           type = KNOWS
                           time = 4 years                       3

                                                                    type = car
                                                                    vendor = “SAAB”
                                                                    model = “95 Aero”
Example: Mr Anderson’s friends
                                                                           name = “The Architect”
                             name = “Morpheus”
                             rank = “Captain”
name = “Thomas Anderson”
                             occupation = “Total badass”                                            42
age = 29
                                                 disclosure = public


                 KNOWS                            KNOWS                                              CODED_BY
                                                                                KNO
     1                                  7                              3           W   S

                 KN
                                                                                                13
                                             S

                    OW                                   name = “Cypher”
                                        KNOW



                         S                               last name = “Reagan”
                                                                                             name = “Agent Smith”
                                                                       disclosure = secret   version = 1.0b
         age = 3 days                                                  age = 6 months        language = C++
                                    2

                             name = “Trinity”
Code (2): Traversing a node space
// Instantiate a traverser that returns Mr Anderson's friends
Traverser friendsTraverser = mrAnderson.traverse(
   Traverser.Order.BREADTH_FIRST,
   StopEvaluator.END_OF_GRAPH,
   ReturnableEvaluator.ALL_BUT_START_NODE,
   RelTypes.KNOWS,
   Direction.OUTGOING );

// Traverse the node space and print out the result
System.out.println( "Mr Anderson's friends:" );
for ( Node friend : friendsTraverser )
{
   System.out.printf( "At depth %d => %s%n",
       friendsTraverser.currentPosition().getDepth(),
       friend.getProperty( "name" ) );
}
name = “The Architect”
                              name = “Morpheus”
                              rank = “Captain”
 name = “Thomas Anderson”
                              occupation = “Total badass”                                            42
 age = 29
                                                  disclosure = public


                  KNOWS                            KNOWS                                              CODED_BY
                                                                                 KNO
      1                                  7                              3           W     S

                  KN                                                                              13

                                              S
                     OW                                   name = “Cypher”

                                         KNOW
                          S                               last name = “Reagan”
                                                                                               name = “Agent Smith”
                                                                        disclosure = secret    version = 1.0b
          age = 3 days                                                  age = 6 months         language = C++
                                     2

                              name = “Trinity”
                                                                    $ bin/start-neo-example
                                                                    Mr Anderson's friends:

                                                                    At      depth     1   =>   Morpheus
friendsTraverser = mrAnderson.traverse(
  Traverser.Order.BREADTH_FIRST,                                    At      depth     1   =>   Trinity
  StopEvaluator.END_OF_GRAPH,                                       At      depth     2   =>   Cypher
  ReturnableEvaluator.ALL_BUT_START_NODE,
  RelTypes.KNOWS,
                                                                    At      depth     3   =>   Agent Smith
  Direction.OUTGOING );                                             $
Example: Friends in love?
                                                                            name = “The Architect”
                               name = “Morpheus”
                               rank = “Captain”
name = “Thomas Anderson”
                               occupation = “Total badass”                                           42
age = 29
                                                  disclosure = public


                   KNOWS                           KNOWS                                              CODED_BY
                                                                                 KNO
     1                                    7                             3           W   S

                   KN                                                                            13
                                             S
                                        KNOW


                      OW                                  name = “Cypher”
                           S                              last name = “Reagan”
                                                                                              name = “Agent Smith”
         LO                                                             disclosure = secret   version = 1.0b
            VE                                                          age = 6 months        language = C++
               S
                                      2

                               name = “Trinity”
Code (3a): Custom traverser
// Create a traverser that returns all “friends in love”
Traverser loveTraverser = mrAnderson.traverse(
   Traverser.Order.BREADTH_FIRST,
   StopEvaluator.END_OF_GRAPH,
   new ReturnableEvaluator()
   {
       public boolean isReturnableNode( TraversalPosition pos )
       {
          return pos.currentNode().hasRelationship(
              RelTypes.LOVES, Direction.OUTGOING );
       }
   },
   RelTypes.KNOWS,
   Direction.OUTGOING );
Code (3a): Custom traverser
// Traverse the node space and print out the result
System.out.println( "Who’s a lover?" );
for ( Node person : loveTraverser )
{
   System.out.printf( "At depth %d => %s%n",
       loveTraverser.currentPosition().getDepth(),
       person.getProperty( "name" ) );
}
name = “The Architect”
                                name = “Morpheus”
                                rank = “Captain”
 name = “Thomas Anderson”
                                occupation = “Total badass”                                            42
 age = 29
                                                    disclosure = public


                    KNOWS                            KNOWS                         KNO                  CODED_BY
      1                                    7                              3           W   S

                    KN                                                                             13

                                                S
                                                            name = “Cypher”

                                           KNOW
                       OW
                            S                               last name = “Reagan”
                                                                                                name = “Agent Smith”
          LO                                                              disclosure = secret   version = 1.0b
             VE                                                           age = 6 months        language = C++
                S
                                       2

                                name = “Trinity”
                                                                     $ bin/start-neo-example
new ReturnableEvaluator()
                                                                     Who’s a lover?
{
  public boolean isReturnableNode(
    TraversalPosition pos)
                                                                     At depth 1 => Trinity
  {                                                                  $
    return pos.currentNode().
      hasRelationship( RelTypes.LOVES,
         Direction.OUTGOING );
  }
},
Bonus code: domain model
    How do you implement your domain model?
    Use the delegator pattern, i.e. every domain entity wraps a
    Neo4j primitive:
// In package org.yourdomain.yourapp
class PersonImpl implements Person
{
   private final Node underlyingNode;
   PersonImpl( Node node ){ this.underlyingNode = node; }

    public String getName()
    {
       return (String) this.underlyingNode.getProperty( "name" );
    }
    public void setName( String name )
    {
       this.underlyingNode.setProperty( "name", name );
    }
}
Domain layer frameworks
 Qi4j (www.qi4j.org)
   Framework for doing DDD in pure Java5
   Def nes Entities / Associations / Properties
      i
      Sound familiar? Nodes / Rel’s / Properties!
   Neo4j is an “EntityStore” backend

 Jo4neo (http://code.google.com/p/jo4neo)
   Annotation driven
   Weaves Neo4j-backed persistence into domain objects
   at runtime
Neo4j system characteristics
  Disk-based
     Native graph storage engine with custom binary on-disk
     format
  Transactional
    JTA/JTS, XA, 2PC, Tx recovery, deadlock detection,
    MVCC, etc
  Scales up
    Many billions of nodes/rels/props on single JVM
  Robust
    6+ years in 24/7 production
Social network pathE xists()
          12
                               ~1k persons
7         1
                           3   Avg 50 friends per
                               person
                               pathExists(a, b) limit
                               depth 4
    36
         41
                 5
                      77
                               Two backends
                               Eliminate disk IO so
                               warm up caches
Social network pathE xists()

                 2
                Emil
         1                                    5
                                      7
        Mike                                Kevin
                           3        John
                         Marcus
                   9                4
                 Bruce            Leigh

                                  # persons query time
Relational database                   1 000 2 000 ms
Graph database (Neo4j)                1 000      2 ms
Graph database (Neo4j)            1 000 000      2 ms
Pros & Cons compared to RDBMS
+ No O/R impedance mismatch (whiteboard friendly)
+ Can easily evolve schemas
+ Can represent semi-structured info
+ Can represent graphs/networks (with performance)

-   Lacks in tool and framework support
-   Few other implementations => potential lock in
-   No support for ad-hoc queries
Query languages
 SPARQL – “SQL for linked data”
    Ex: ”SELECT ?person WHERE {
          ?person neo4j:KNOWS ?friend .
          ?friend neo4j:KNOWS ?foe .
          ?foe neo4j:name “Larry Ellison” .
     }”


 Gremlin – “perl for graphs”
    Ex: ”./outE[@label='KNOWS']/inV[@age   > 30]/@name”
    Chec it out at http://try.neo4j.org
The Neo4j ecosystem
 Neo4j is an embedded database
   Tiny teeny lil jar f le
                      i
 Component ecosystem
   index
   meta-model
   graph-matching
   remote-graphdb
   sparql-engine
   ...
 See http://components.neo4j.org
Language bindings
 Neo4j.py – bindings for Jython and CPython
   http://components.neo4j.org/neo4j.py
 Neo4jrb – bindings for JRuby (incl RESTful API)
   http://wiki.neo4j.org/content/Ruby
 Neo4jrb-simple
   http://github.com/mdeiters/neo4jr-simple
 Clojure
   http://wiki.neo4j.org/content/Clojure
 Scala (incl RESTful API)
   http://wiki.neo4j.org/content/Scala
Grails Neoclipse screendump
Scale out – replication
  Neo4j HA in internal beta testing with cust's now
  Master-slave replication, 1st configuration
     MySQL style... ish
     Except all instances can write, synchronously between
     writing slave & master (strong consistency)
     Updates are asynchronously propagated to the other
     slaves (eventual consistency)
  This can handle billions of entities...
  … but not 100B
Scale out – partitioning
  “Sharding possible today”
     … but you have to do manual work
     … just as with MySQL
  Transparent partitioning? Neo4j 2.0
    100B? Easy to say. Sliiiiightly harder to do.
    Fundamentals: BASE & eventual consistency
    Generic clustering algorithm as base case, but give lots of
    knobs for developers
While we're on future stuff
  Neo4j 1.1 [July 2010]
     Full JMX / SNMP support
     Event framework (feedback plz!)
     Traverser framework iteration
     …
  Neo4j HA FCS this summer, GA this fall
  REST-ful API (already out in 0.8, please give feedback for 1.0!)
  Framework integration: Roo, Grails
  Database integration: CouchDB, MySQL
  Neo4j Spatial (uDig integr, {R,Quad}-tree etc, OSM imports)
  Client tools: Webling, Neoclipse, Monitoring console
How ego are you? (aka other impls?)
  Franz’ A lleg roG ra ph      (http://agraph.franz.com)

     Proprietary, Lisp, RDF-oriented but real graphdb
  Sones g ra phD B      (http://sones.com)

     Proprietary, .NET, in beta
  Twitter's Floc k D B    (http://github.com/twitter/flockdb)

     Twitter's graph database for large and shallow graphs
  Google P reg el   (http://bit.ly/dP9IP)

    We are oh-so-secret
  Some academic papers from ~10 years ago
     G = {V, E} #FAIL
Conclusion
 Graphs && Neo4j => teh awesome!
 Available NOW under AGPLv3 / commercial license
   AGPLv3: “if you’re open source, we’re open source”
   If you have proprietary software? Must buy a commercial
   license
   But the first one is free!
 Download
   http://neo4j.org
 Feedback
   http://lists.neo4j.org
Questions?




             Image credit: lost again! Sorry :(
http://neotechnology.com

NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)

  • 1.
    No SQL? Image credit: http://browsertoolkit.com/fault-tolerance.png
  • 2.
    No SQL? Image credit: http://browsertoolkit.com/fault-tolerance.png
  • 3.
    No SQL? Image credit: http://browsertoolkit.com/fault-tolerance.png
  • 4.
    Neo4j and a NOSQL overview #neo4j Emil Eifrem @emileifrem CEO, Neo Technology emil@neotechnology.com
  • 5.
    What's the plan? Why now? – Four trends NOSQL overview Graph databases && Neo4j Conclusions
  • 6.
    Trend 1: data setsize 40 2007 Source: IDC 2007
  • 7.
    988 Trend 1: data setsize 40 2007 2010 Source: IDC 2007
  • 8.
    Trend 2: connectedness Giant Global Graph (GGG) Information connectivity Ontologies RDF Folksonomies Tagging Wikis User- generated content Blogs RSS Hypertext Text documents web 1.0 web 2.0 “web 3.0” 1990 2000 2010 2020
  • 9.
    Trend 3: semi-structure Individualization of content! In the salary lists of the 1970s, all elements had exactly one job In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15? Trend accelerated by the decentralization of content generation that is the hallmark of the age of participation (“web 2.0”)
  • 10.
    Aside: RDBMS performance Relational database Salary List Performance Majority of Webapps Social network Semantic Trading } custom Data complexity
  • 11.
    Trend 4: architecture 1990s: Database as integration hub Application Application Application DB
  • 12.
    Trend 4: architecture 2000s: (Slowly towards...) Decoupled services with their own backend Service Service Service DB DB DB
  • 13.
    Why NOSQL 2009? Trend 1: Size. Trend 2: Connectivity. Trend 3: Semi-structure. Trend 4: Architecture.
  • 14.
  • 15.
    First off: thename NoSQL is NOT “Never SQL” NoSQL is NOT “No To SQL”
  • 16.
    NOSQL is simply N ot O nly S Q L !
  • 17.
    Four (emerging) NOSQLcategories Key-value stores Based on Amazon's Dynamo paper Data model: (global) collection of K-V pairs Example: Dynomite, Voldemort, Tokyo* ColumnFamily / BigTable clones Based on Google's BigTable paper Data model: big table, column families Example: HBase, Hypertable, Cassandra
  • 18.
    Four (emerging) NOSQLcategories Document databases Inspired by Lotus Notes Data model: collections of K-V collections Example: CouchDB, MongoDB, Riak Graph databases Inspired by Euler & graph theory Data model: nodes, rels, K-V on both Example: AllegroGraph, Sones, Neo4j
  • 19.
    NOSQL data models Key-value stores Data size Bigtable clones Document databases Graph databases Data complexity
  • 20.
    NOSQL data models Key-value stores Data size Bigtable clones Document databases Graph databases (This is still billio ns of 90% nodes & relationships) of use cases Data complexity
  • 21.
  • 22.
    The Graph DBmodel: representation Core abstractions: name = “Emil” Nodes age = 29 sex = “yes” Relationships between nodes Properties on both 1 2 type = KNOWS time = 4 years 3 type = car vendor = “SAAB” model = “95 Aero”
  • 23.
    Example: The Matrix name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S OW name = “Cypher” KNOW S last name = “Reagan” name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity”
  • 24.
    Code (1): Buildinga node space GraphDatabaseService graphDb = ... // Get factory // Create Thomas 'Neo' Anderson Node mrAnderson = graphDb.createNode(); mrAnderson.setProperty( "name", "Thomas Anderson" ); mrAnderson.setProperty( "age", 29 ); // Create Morpheus Node morpheus = graphDb.createNode(); morpheus.setProperty( "name", "Morpheus" ); morpheus.setProperty( "rank", "Captain" ); morpheus.setProperty( "occupation", "Total bad ass" ); // Create a relationship representing that they know each other mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS ); // ...create Trinity, Cypher, Agent Smith, Architect similarly
  • 25.
    Code (1): Buildinga node space GraphDatabaseService graphDb = ... // Get factory Transaction tx = graphDb.beginTx(); // Create Thomas 'Neo' Anderson Node mrAnderson = graphDb.createNode(); mrAnderson.setProperty( "name", "Thomas Anderson" ); mrAnderson.setProperty( "age", 29 ); // Create Morpheus Node morpheus = graphDb.createNode(); morpheus.setProperty( "name", "Morpheus" ); morpheus.setProperty( "rank", "Captain" ); morpheus.setProperty( "occupation", "Total bad ass" ); // Create a relationship representing that they know each other mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS ); // ...create Trinity, Cypher, Agent Smith, Architect similarly tx.commit();
  • 26.
    Code (1b): Defning RelationshipTypes i // In package org.neo4j.graphdb public interface RelationshipType { String name(); } // In package org.yourdomain.yourapp // Example on how to roll dynamic RelationshipTypes class MyDynamicRelType implements RelationshipType { private final String name; MyDynamicRelType( String name ){ this.name = name; } public String name() { return this.name; } } // Example on how to kick it, static-RelationshipType-like enum MyStaticRelTypes implements RelationshipType { KNOWS, WORKS_FOR, }
  • 27.
    Whiteboard friendly owns Björn Big Car build drives DayCare
  • 29.
    The Graph DBmodel: traversal Traverser framework for high-performance traversing name = “Emil” age = 31 across the node space sex = “yes” 1 2 type = KNOWS time = 4 years 3 type = car vendor = “SAAB” model = “95 Aero”
  • 30.
    Example: Mr Anderson’sfriends name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S OW name = “Cypher” KNOW S last name = “Reagan” name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity”
  • 31.
    Code (2): Traversinga node space // Instantiate a traverser that returns Mr Anderson's friends Traverser friendsTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, StopEvaluator.END_OF_GRAPH, ReturnableEvaluator.ALL_BUT_START_NODE, RelTypes.KNOWS, Direction.OUTGOING ); // Traverse the node space and print out the result System.out.println( "Mr Anderson's friends:" ); for ( Node friend : friendsTraverser ) { System.out.printf( "At depth %d => %s%n", friendsTraverser.currentPosition().getDepth(), friend.getProperty( "name" ) ); }
  • 32.
    name = “TheArchitect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S OW name = “Cypher” KNOW S last name = “Reagan” name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity” $ bin/start-neo-example Mr Anderson's friends: At depth 1 => Morpheus friendsTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, At depth 1 => Trinity StopEvaluator.END_OF_GRAPH, At depth 2 => Cypher ReturnableEvaluator.ALL_BUT_START_NODE, RelTypes.KNOWS, At depth 3 => Agent Smith Direction.OUTGOING ); $
  • 33.
    Example: Friends inlove? name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S KNOW OW name = “Cypher” S last name = “Reagan” name = “Agent Smith” LO disclosure = secret version = 1.0b VE age = 6 months language = C++ S 2 name = “Trinity”
  • 34.
    Code (3a): Customtraverser // Create a traverser that returns all “friends in love” Traverser loveTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, StopEvaluator.END_OF_GRAPH, new ReturnableEvaluator() { public boolean isReturnableNode( TraversalPosition pos ) { return pos.currentNode().hasRelationship( RelTypes.LOVES, Direction.OUTGOING ); } }, RelTypes.KNOWS, Direction.OUTGOING );
  • 35.
    Code (3a): Customtraverser // Traverse the node space and print out the result System.out.println( "Who’s a lover?" ); for ( Node person : loveTraverser ) { System.out.printf( "At depth %d => %s%n", loveTraverser.currentPosition().getDepth(), person.getProperty( "name" ) ); }
  • 36.
    name = “TheArchitect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS KNO CODED_BY 1 7 3 W S KN 13 S name = “Cypher” KNOW OW S last name = “Reagan” name = “Agent Smith” LO disclosure = secret version = 1.0b VE age = 6 months language = C++ S 2 name = “Trinity” $ bin/start-neo-example new ReturnableEvaluator() Who’s a lover? { public boolean isReturnableNode( TraversalPosition pos) At depth 1 => Trinity { $ return pos.currentNode(). hasRelationship( RelTypes.LOVES, Direction.OUTGOING ); } },
  • 37.
    Bonus code: domainmodel How do you implement your domain model? Use the delegator pattern, i.e. every domain entity wraps a Neo4j primitive: // In package org.yourdomain.yourapp class PersonImpl implements Person { private final Node underlyingNode; PersonImpl( Node node ){ this.underlyingNode = node; } public String getName() { return (String) this.underlyingNode.getProperty( "name" ); } public void setName( String name ) { this.underlyingNode.setProperty( "name", name ); } }
  • 38.
    Domain layer frameworks Qi4j (www.qi4j.org) Framework for doing DDD in pure Java5 Def nes Entities / Associations / Properties i Sound familiar? Nodes / Rel’s / Properties! Neo4j is an “EntityStore” backend Jo4neo (http://code.google.com/p/jo4neo) Annotation driven Weaves Neo4j-backed persistence into domain objects at runtime
  • 39.
    Neo4j system characteristics Disk-based Native graph storage engine with custom binary on-disk format Transactional JTA/JTS, XA, 2PC, Tx recovery, deadlock detection, MVCC, etc Scales up Many billions of nodes/rels/props on single JVM Robust 6+ years in 24/7 production
  • 40.
    Social network pathExists() 12 ~1k persons 7 1 3 Avg 50 friends per person pathExists(a, b) limit depth 4 36 41 5 77 Two backends Eliminate disk IO so warm up caches
  • 41.
    Social network pathExists() 2 Emil 1 5 7 Mike Kevin 3 John Marcus 9 4 Bruce Leigh # persons query time Relational database 1 000 2 000 ms Graph database (Neo4j) 1 000 2 ms Graph database (Neo4j) 1 000 000 2 ms
  • 44.
    Pros & Conscompared to RDBMS + No O/R impedance mismatch (whiteboard friendly) + Can easily evolve schemas + Can represent semi-structured info + Can represent graphs/networks (with performance) - Lacks in tool and framework support - Few other implementations => potential lock in - No support for ad-hoc queries
  • 45.
    Query languages SPARQL– “SQL for linked data” Ex: ”SELECT ?person WHERE { ?person neo4j:KNOWS ?friend . ?friend neo4j:KNOWS ?foe . ?foe neo4j:name “Larry Ellison” . }” Gremlin – “perl for graphs” Ex: ”./outE[@label='KNOWS']/inV[@age > 30]/@name” Chec it out at http://try.neo4j.org
  • 46.
    The Neo4j ecosystem Neo4j is an embedded database Tiny teeny lil jar f le i Component ecosystem index meta-model graph-matching remote-graphdb sparql-engine ... See http://components.neo4j.org
  • 47.
    Language bindings Neo4j.py– bindings for Jython and CPython http://components.neo4j.org/neo4j.py Neo4jrb – bindings for JRuby (incl RESTful API) http://wiki.neo4j.org/content/Ruby Neo4jrb-simple http://github.com/mdeiters/neo4jr-simple Clojure http://wiki.neo4j.org/content/Clojure Scala (incl RESTful API) http://wiki.neo4j.org/content/Scala
  • 50.
  • 51.
    Scale out –replication Neo4j HA in internal beta testing with cust's now Master-slave replication, 1st configuration MySQL style... ish Except all instances can write, synchronously between writing slave & master (strong consistency) Updates are asynchronously propagated to the other slaves (eventual consistency) This can handle billions of entities... … but not 100B
  • 52.
    Scale out –partitioning “Sharding possible today” … but you have to do manual work … just as with MySQL Transparent partitioning? Neo4j 2.0 100B? Easy to say. Sliiiiightly harder to do. Fundamentals: BASE & eventual consistency Generic clustering algorithm as base case, but give lots of knobs for developers
  • 53.
    While we're onfuture stuff Neo4j 1.1 [July 2010] Full JMX / SNMP support Event framework (feedback plz!) Traverser framework iteration … Neo4j HA FCS this summer, GA this fall REST-ful API (already out in 0.8, please give feedback for 1.0!) Framework integration: Roo, Grails Database integration: CouchDB, MySQL Neo4j Spatial (uDig integr, {R,Quad}-tree etc, OSM imports) Client tools: Webling, Neoclipse, Monitoring console
  • 54.
    How ego areyou? (aka other impls?) Franz’ A lleg roG ra ph (http://agraph.franz.com) Proprietary, Lisp, RDF-oriented but real graphdb Sones g ra phD B (http://sones.com) Proprietary, .NET, in beta Twitter's Floc k D B (http://github.com/twitter/flockdb) Twitter's graph database for large and shallow graphs Google P reg el (http://bit.ly/dP9IP) We are oh-so-secret Some academic papers from ~10 years ago G = {V, E} #FAIL
  • 55.
    Conclusion Graphs &&Neo4j => teh awesome! Available NOW under AGPLv3 / commercial license AGPLv3: “if you’re open source, we’re open source” If you have proprietary software? Must buy a commercial license But the first one is free! Download http://neo4j.org Feedback http://lists.neo4j.org
  • 57.
    Questions? Image credit: lost again! Sorry :(
  • 58.