NOSQLEU - Graph Databases and Neo4j
Presentation about Neo4j from NOSQLEU 2010 ( http://nosqleu.com/ ).

  • And of course, all the integrations with frameworks like Rails, Grails, Griffon, Django etc are doing this, too. See for example
    http://github.com/andreasronge/neo4j for Rails.

    /peter
  • I received a question after the presentation asking me if there are any libraries for simplifying the pattern of delegating the state of a domain object to a node or relationship. If you are interested in such a thing, check out jo4neo: http://code.google.com/p/jo4neo/
NOSQLEU - Graph Databases and Neo4j Presentation Transcript

  • 1. Graph Databases and Neo4j
    Tobias Ivarsson, Hacker @ Neo Technology
    twitter: @thobe / #neo4j
    email: tobias@neotechnology.com
    web: http://www.neo4j.org/
    web: http://www.thobe.org/
  • 2. NOSQL - Why now? Four trends
  • 3. Trend 1: Data size
    Each year more and more digital data is created. Over two years we create more digital data than all the data created in history before that.
    Chart: ExaBytes (10¹⁸) of data stored per year
      2006   161
      2007   253
      2008   397
      2009   623
      2010   988
    Data source: IDC 2007
  • 4. Trend 2: Connectedness
    Over time data has evolved to be more and more interlinked and connected. Hypertext has links, blogs have pingback, tagging groups all related data.
    Chart: information connectivity over time (1990 to 2020; web 1.0, web 2.0, “web 3.0”), rising through text documents, hypertext, RSS, blogs, user-generated content, wikis, tagging, folksonomies, RDF, ontologies and the Giant Global Graph (GGG).
  • 5. Trend 3: Semi-structure
    ๏ Individualization of content
      • In the salary lists of the 1970s, all elements had exactly one job
      • In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15?
    ๏ All encompassing “entire world views”
      • Store more data about each entity
    ๏ Trend accelerated by the decentralization of content generation that is the hallmark of the age of participation (“web 2.0”)
  • 6. Trend 4: Architecture
    1980s: Mainframe applications
    Diagram: one Application on one DB
  • 7. Trend 4: Architecture
    1990s: Database as integration hub
    Diagram: three Applications sharing one DB
  • 8. Trend 4: Architecture
    2000s: (moving towards) Decoupled services with their own backend
    Diagram: three Applications, each with its own DB
  • 9. Why NOSQL Now?
    ๏ Trend 1: Size
    ๏ Trend 2: Connectedness
    ๏ Trend 3: Semi-structure
    ๏ Trend 4: Architecture
  • 10. RDBMS performance
    We are building applications today that have complexity requirements that a Relational Database cannot handle with sufficient performance.
    Chart: performance against data complexity, with one curve for the relational database and one for the requirement of the application. Use cases along the complexity axis: Salary List, Majority of Webapps, Social network, Semantic Trading, custom.
  • 11. Scaling to size vs. Scaling to complexity
    Chart axes: Size and Complexity. Plotted, in order: Key/Value stores, Bigtable clones, Document databases, Graph databases.
    Annotations: “Billions of nodes and relationships”, “> 90% of use cases”
  • 12. Graph Databases focus on structure of data
    Graph databases focus on the structure of the data, scaling to the complexity of the data and of the application.
  • 13. What is Neo4j?
    ๏ Neo4j is a Graph Database
      • Non-relational (“#nosql”), transactional (ACID), embedded
      • Data is stored as a Graph / Network
        ‣ Nodes and relationships with properties
        ‣ “Property Graph” or “edge-labeled multidigraph”
      • Schema free, bottom-up data model design
    ๏ Neo4j is Open Source / Free (as in speech) Software
      • AGPLv3
      • Commercial (“dual license”) license available
        ‣ First server is free (as in beer), next is inexpensive
    Prices are available at http://neotechnology.com/. Contact us if you have questions and/or special license needs (e.g. if you want an evaluation license).
  • 14. More about Neo4j
    ๏ Neo4j is stable
      • In 24/7 operation since 2003
    ๏ Neo4j is in active development
      • Neo Technology received VC funding October 2009
    ๏ Neo4j delivers high performance graph operations
      • Traverses 1’000’000+ relationships / second on commodity hardware
  • 15. The Neo4j Graph data model
    • Nodes
    • Relationships between Nodes
    • Relationships have Labels
    • Relationships are directed, but traversed at equal speed in both directions
    • The semantics of the direction is up to the application (LIVES WITH is symmetric, LOVES is not)
    • Nodes have key-value properties
    • Relationships have key-value properties
  • 20. The Neo4j Graph data model (example graph)
    Nodes:
      • name: “Mary”, age: 35
      • name: “James”, age: 32, twitter: “@spam”
      • brand: “Volvo”, model: “V70”
    Relationships: LOVES (drawn twice), LIVES WITH, OWNS (item type: “car”), DRIVES
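
The data model on these slides can be sketched in a few lines of plain Python. This is only an in-memory illustration, not the Neo4j API: the `Node`, `Relationship` and `PropertyGraph` names are hypothetical, and the endpoints chosen for OWNS and DRIVES are assumptions, since the transcript does not say who owns or drives the car.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A node is just a bag of key-value properties."""
    properties: dict = field(default_factory=dict)


@dataclass(eq=False)
class Relationship:
    """A directed, labeled relationship with its own properties."""
    start: Node
    label: str
    end: Node
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.nodes = []
        self.relationships = []

    def create_node(self, properties=None):
        node = Node(dict(properties or {}))
        self.nodes.append(node)
        return node

    def relate(self, start, label, end, properties=None):
        rel = Relationship(start, label, end, dict(properties or {}))
        self.relationships.append(rel)
        return rel

    def neighbours(self, node, label):
        """Follow relationships with the given label in either direction."""
        for rel in self.relationships:
            if rel.label != label:
                continue
            if rel.start is node:
                yield rel.end
            elif rel.end is node:
                yield rel.start


graph = PropertyGraph()
mary = graph.create_node({"name": "Mary", "age": 35})
james = graph.create_node({"name": "James", "age": 32, "twitter": "@spam"})
car = graph.create_node({"brand": "Volvo", "model": "V70"})

graph.relate(mary, "LOVES", james)
graph.relate(james, "LOVES", mary)
graph.relate(mary, "LIVES WITH", james)
# Assumption: the slide does not say who owns or who drives the car.
graph.relate(mary, "OWNS", car, {"item type": "car"})
graph.relate(james, "DRIVES", car)

# LIVES WITH is stored once but can be read from either end.
print([n.properties["name"] for n in graph.neighbours(james, "LIVES WITH")])
```

Storing LIVES WITH once and reading it from both ends mirrors the point that direction only carries meaning when the application says so.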
  • 21. Graphs are all around us
           A     B      C     D                ...
      1    17    3.14   3     17.79333333333
      2    42    10.11  14    30.33
      3    316   6.66   1     2104.56
      4    32    9.11   592   0.492432432432
      5                       2153.175765766
      ...
    Even if this spreadsheet looks like it could be a fit for a RDBMS, it isn’t:
    • RDBMSes have problems with extending indefinitely on both rows and columns
    • Formulas and data dependencies would quickly lead to heavy join operations
  • 22. Graphs are all around us
           A     B      C     D                ...
      1    17    3.14   3     = A1 * B1 / C1
      2    42    10.11  14    = A2 * B2 / C2
      3    316   6.66   1     = A3 * B3 / C3
      4    32    9.11   592   = A4 * B4 / C4
      5                       = SUM(D1:D4)
      ...
    With data dependencies the spreadsheet turns out to be a graph.
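
To make the "spreadsheet is a graph" point concrete, here is a minimal Python sketch that treats each formula cell as a node whose outgoing edges point at the cells it depends on. The `values`, `formulas` and `evaluate` names are illustrative only, and the total is taken as the sum of D1 to D4, which is what the value 2153.175765766 on the earlier slide corresponds to.

```python
# Plain input cells from the slide.
values = {
    "A1": 17, "B1": 3.14, "C1": 3,
    "A2": 42, "B2": 10.11, "C2": 14,
    "A3": 316, "B3": 6.66, "C3": 1,
    "A4": 32, "B4": 9.11, "C4": 592,
}


def ratio(a, b, c):
    return a * b / c


# Formula cells: (function, list of cells it depends on).
# The dependency lists are the edges of the graph.
formulas = {
    "D1": (ratio, ["A1", "B1", "C1"]),
    "D2": (ratio, ["A2", "B2", "C2"]),
    "D3": (ratio, ["A3", "B3", "C3"]),
    "D4": (ratio, ["A4", "B4", "C4"]),
    "D5": (lambda *cells: sum(cells), ["D1", "D2", "D3", "D4"]),
}


def evaluate(cell, cache=None):
    """Evaluate a cell by walking its dependency edges depth first."""
    if cache is None:
        cache = {}
    if cell in values:
        return values[cell]
    if cell not in cache:
        func, dependencies = formulas[cell]
        cache[cell] = func(*(evaluate(dep, cache) for dep in dependencies))
    return cache[cell]


total = evaluate("D5")
print(round(total, 6))
```

Evaluating D5 is a traversal: it follows edges to D1 through D4, and from each of those on to the A, B and C cells.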
  • 24. Graphs are all around us
    If we add external data sources the problem becomes even more interesting...
      17    3.14   3     = A1 * B1 / C1
      42    10.11  14    = A2 * B2 / C2
      316   6.66   1     = A3 * B3 / C3
      32    9.11   592   = A4 * B4 / C4
                         = SUM(D1:D4)
  • 26. Graphs are whiteboard friendly
    An application domain model outlined on a whiteboard or piece of paper would be translated to an ER-diagram, then normalized to fit a Relational Database. With a Graph Database the model from the whiteboard is implemented directly.
    Image credits: Tobias Ivarsson
  • 28. Graphs are whiteboard friendly (example)
    The whiteboard model populated with data. Labels in the diagram: thobe, Joe, project, blog, Wardrobe Strength, Hello Joe, Modularizing Jython, Neo4j performance analysis.
    Image credits: Tobias Ivarsson
  • 29. Query Languages
    ๏ Traversal APIs
      • Neo4j core traversers
      • Blueprint pipes
    ๏ SPARQL - “SQL for linked data” - query by graph pattern matching
        SELECT ?person WHERE {
          ?person neo4j:KNOWS ?friend .
          ?friend neo4j:KNOWS ?foe .
          ?foe neo4j:name "Larry Ellison" .
        }
      Find all persons that KNOWS a friend that KNOWS someone named “Larry Ellison”.
    ๏ Gremlin - “perl for graphs” - query by traversal
        ./outE[@label='KNOWS']/inV[@age > 30]/@name
      Give me the names of all the people I know that are older than 30.
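
The two query styles on this slide can be mimicked on a tiny in-memory graph. The sketch below is plain Python rather than SPARQL or Gremlin, and everything in it (the `people` and `knows` dictionaries, the sample persons and ages, the function names) is made up for illustration.

```python
# Hypothetical sample data: node properties and outgoing KNOWS edges.
people = {
    "alice": {"name": "Alice", "age": 28},
    "bob": {"name": "Bob", "age": 41},
    "carol": {"name": "Carol", "age": 35},
    "larry": {"name": "Larry Ellison"},
}
knows = {
    "alice": ["bob", "carol"],
    "bob": ["larry"],
    "carol": [],
    "larry": [],
}


def persons_who_know_a_friend_of(target_name):
    """Pattern matching: ?person KNOWS ?friend KNOWS ?foe, ?foe named target."""
    matches = {
        person
        for person, friends in knows.items()
        for friend in friends
        for foe in knows[friend]
        if people[foe]["name"] == target_name
    }
    return sorted(matches)


def names_of_known_older_than(start, min_age):
    """Traversal: follow outgoing KNOWS edges from start, filter on age."""
    return [
        people[friend]["name"]
        for friend in knows[start]
        if people[friend].get("age", 0) > min_age
    ]


print(persons_who_know_a_friend_of("Larry Ellison"))
print(names_of_known_older_than("alice", 30))
```

The first function scans the whole graph for a shape, while the second starts from one known node and only looks at its neighbours, which is the difference between the two approaches.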
  • 31. Data manipulation API
        GraphDatabaseService graphDb = getGraphDbInstanceSomehow();
        Transaction tx = graphDb.beginTx();
        try {
            // Create Thomas 'Neo' Anderson
            Node mrAnderson = graphDb.createNode();
            mrAnderson.setProperty( "name", "Thomas Anderson" );
            mrAnderson.setProperty( "age", 29 );

            // Create Morpheus
            Node morpheus = graphDb.createNode();
            morpheus.setProperty( "name", "Morpheus" );
            morpheus.setProperty( "rank", "Captain" );
            morpheus.setProperty( "occupation", "Total bad ass" );

            // Create relationship representing they know each other
            mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );

            // ... similarly for Trinity, Cypher, Agent Smith, Architect
            tx.success();
        } finally {
            tx.finish();
        }
  • 32. Graph traversals
    Example graph. Nodes:
      • name: “Thomas Anderson”, age: 29
      • name: “Morpheus”, rank: “Captain”, occupation: “Total badass”
      • name: “Trinity”
      • name: “Cypher”, last name: “Reagan”
      • name: “Agent Smith”, version: “1.0b”, language: “C++”
      • name: “The Architect”
    Relationship labels: KNOWS, LOVES, CODED BY
    Relationship properties shown in the diagram: disclosure: “public”, disclosure: “secret”, since: “meeting the oracle”, since: “a year before the movie”, cooperates on: “The Nebuchadnezzar”
  • 39. Graph traversals
    Traversal over the example graph, starting from Thomas Anderson:
        import neo4j

        class Friends(neo4j.Traversal): # Traversals ≈ queries in Neo4j
            types = [ neo4j.Outgoing.KNOWS ]
            order = neo4j.BREADTH_FIRST
            stop = neo4j.STOP_AT_END_OF_GRAPH
            returnable = neo4j.RETURN_ALL_BUT_START_NODE

        for friend_node in Friends(mr_anderson):
            print("%s (@ depth=%s)" % ( friend_node["name"],
                                        friend_node.depth ))
    Output:
        Morpheus (@ depth=1)
        Trinity (@ depth=1)
        Cypher (@ depth=2)
        Agent Smith (@ depth=3)
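
For readers without the Neo4j Python bindings at hand, the same breadth-first traversal can be sketched with the standard library alone. The `KNOWS` adjacency below is an assumption: the transcript lost the arrows of the diagram, so the edges are chosen to be consistent with the output printed on the slide.

```python
from collections import deque

# Assumed outgoing KNOWS relationships (not recoverable from the transcript).
KNOWS = {
    "Thomas Anderson": ["Morpheus", "Trinity"],
    "Morpheus": ["Trinity", "Cypher"],
    "Trinity": [],
    "Cypher": ["Agent Smith"],
    "Agent Smith": [],
}


def friends(start):
    """Yield (name, depth) breadth first, returning all but the start node."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        name, depth = queue.popleft()
        if depth > 0:
            yield name, depth
        for friend in KNOWS.get(name, []):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, depth + 1))


result = list(friends("Thomas Anderson"))
for name, depth in result:
    print("%s (@ depth=%s)" % (name, depth))
```

The traversal stops on its own when the queue runs dry, which corresponds to stopping at the end of the graph.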
  • 40. Finding a place to start
    ๏ Traversals need a Node to start from
      • QUESTION: How do I find the start Node?
      • ANSWER: You use an Index
    ๏ Indexes in Neo4j are different from Indexes in Relational Databases
      • RDBMSes use them for Joining
      • Neo4j uses them for simple lookup
        IndexService index = getGraphDbIndexServiceSomehow();
        Node mrAnderson = index.getSingleNode( "name", "Thomas Anderson" );
        performTraversalFrom( mrAnderson );
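
The role of the index here is only to turn a property value into a start node. A minimal Python sketch of that idea follows; `SimpleIndex` and its methods are hypothetical stand-ins and not the Neo4j `IndexService`, and nodes are represented as plain dictionaries.

```python
class SimpleIndex:
    """Maps (key, value) pairs to nodes, for lookup only."""

    def __init__(self):
        self._entries = {}

    def index(self, node, key, value):
        self._entries.setdefault((key, value), []).append(node)

    def get_single_node(self, key, value):
        matches = self._entries.get((key, value), [])
        if len(matches) > 1:
            raise ValueError("more than one node for %s=%r" % (key, value))
        return matches[0] if matches else None


mr_anderson = {"name": "Thomas Anderson", "age": 29}
morpheus = {"name": "Morpheus", "rank": "Captain"}

index = SimpleIndex()
for node in (mr_anderson, morpheus):
    index.index(node, "name", node["name"])

# Look up the start node once; everything after that follows relationships.
start = index.get_single_node("name", "Thomas Anderson")
print(start["age"])
```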
  • 41. Indexes in Neo4j
    ๏ The Graph *is* the main index
      • Use relationship labels for navigation
      • Build index structures *in the graph*
        ‣ Search trees, tag clouds, geospatial indexes, etc.
        ‣ Linked/skip lists or other data structures in the graph
        ‣ We have utility libraries for this
    ๏ External indexes used *for lookup*
      • Finding one or more points to start traversals from
      • Major difference from RDBMSes, which use indexes for everything
  • 42. A domain object implemented in Neo4j
        public interface Person {
            String getName();
            void setName( String firstName, String lastName );
        }

        public final class PersonImpl implements Person {
            // All state lives in the node; the object is only a wrapper
            private final Node underlyingNode;

            public PersonImpl( Node underlyingNode ) {
                this.underlyingNode = underlyingNode;
            }

            public String getName() {
                return String.format( "%s %s",
                    underlyingNode.getProperty( "first name" ),
                    underlyingNode.getProperty( "last name" ) );
            }

            public void setName( String firstName, String lastName ) {
                underlyingNode.setProperty( "first name", firstName );
                underlyingNode.setProperty( "last name", lastName );
            }
        }
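
The same delegation pattern can be sketched in Python. Here a plain dictionary stands in for the underlying node, which is an assumption made so the example runs without a database; the `Person` class and its method names are illustrative.

```python
class Person:
    """Domain object that keeps no state of its own."""

    def __init__(self, underlying_node):
        # In a real application this would be a graph database node.
        self._node = underlying_node

    @property
    def name(self):
        return "%s %s" % (self._node["first name"], self._node["last name"])

    def set_name(self, first_name, last_name):
        self._node["first name"] = first_name
        self._node["last name"] = last_name


node = {}
person = Person(node)
person.set_name("Thomas", "Anderson")
print(person.name)

# A second wrapper around the same node sees the same state.
same_person = Person(node)
print(same_person.name)
```

Because every read goes to the node, two wrappers around one node can never disagree about the name.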
  • 43. Neo4j as Software Transactional Memory
    ๏ Implement objects as wrappers around Nodes and Relationships
      • Neo4j is fast enough to allow you to read all state from the Node/Relationship
    ๏ Mutating operations require transactions
      • The changes are isolated from all other threads until committed
      • Multiple mutations can be committed atomically
    ๏ Nested transactions are flattened
      • Makes it possible to have methods open their own transaction
    ๏ Fits nicely with the OO paradigm
      • More focus on data than on objects (compared to Object DBs)
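
What "nested transactions are flattened" can mean in practice is sketched below in Python. This is a single-threaded toy, so it shows atomic commit and flattening but not isolation between threads; the `Store` class and its methods are hypothetical and unrelated to the Neo4j transaction API.

```python
from contextlib import contextmanager


class Store:
    def __init__(self):
        self.committed = {}
        self._pending = None   # buffered writes of the open transaction
        self._depth = 0        # how many nested transaction blocks are open
        self._failed = False

    @contextmanager
    def transaction(self):
        if self._depth == 0:
            self._pending = {}
            self._failed = False
        self._depth += 1
        try:
            yield self
        except Exception:
            self._failed = True
            raise
        finally:
            self._depth -= 1
            # Only the outermost block commits, and only if nothing failed.
            if self._depth == 0:
                if not self._failed:
                    self.committed.update(self._pending)
                self._pending = None

    def set(self, key, value):
        if self._pending is None:
            raise RuntimeError("mutating operations require a transaction")
        self._pending[key] = value


def rename(store, first, last):
    # A method may open its own transaction; it joins the outer one if any.
    with store.transaction():
        store.set("first name", first)
        store.set("last name", last)


store = Store()
with store.transaction():
    rename(store, "Thomas", "Anderson")
    visible_before_commit = dict(store.committed)
    store.set("age", 29)

print(visible_before_commit)
print(store.committed)
```

The inner `rename` call does not commit on its own: its writes become visible together with the outer block's write, when the outermost transaction finishes.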
  • 44. Why not use an O/R mapper?
    ๏ Model evolution in ORMs is a hard problem
      • Virtually unsupported in most ORM systems
    ๏ SQL is “compatible” across many RDBMSs
      • Data is still locked in
    ๏ Each ORM maps object models differently
      • Moving to another ORM == legacy schema support
        ‣ Except your legacy schema is a strange auto-generated one
    ๏ Object/Graph Mapping is always done the same way
      • Allows you to keep your data through application changes
      • Or share data between multiple implementations
  • 45. What an ORM doesn’t do
    ๏ Deep traversals
    ๏ Graph algorithms
    ๏ Shortest path(s)
    ๏ Routing
    ๏ etc.
  • 46. Path exists in social network
    ๏ Each person has on average 50 friends
    The performance impact in Neo4j depends only on the degree of each node. In an RDBMS it depends on the number of entries in the tables involved in the join(s).
    Diagram: a social network with the persons Tobias, Emil, Johan and Peter
      Database               # persons    query time
      Relational database    1 000        2 000 ms
      Neo4j Graph Database   1 000        2 ms
      Neo4j Graph Database   1 000 000    2 ms
      Relational database    1 000 000    way too long...
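
A "does a path exist" check can be sketched as a depth-limited breadth-first search. The code below is plain Python, not the benchmark behind the table; the friendships between Tobias, Emil, Johan and Peter are assumptions (the transcript only preserves the names), and "Stranger" is a made-up unconnected person.

```python
from collections import deque


def path_exists(friends, start, goal, max_depth=4):
    """Breadth-first search that gives up after max_depth hops."""
    if start == goal:
        return True
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        person, depth = frontier.popleft()
        if depth == max_depth:
            continue
        # Work per step is proportional to this person's number of friends,
        # not to the total number of persons stored.
        for friend in friends.get(person, ()):
            if friend == goal:
                return True
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, depth + 1))
    return False


# Assumed friendships, stored in both directions.
friends = {
    "Tobias": ["Emil", "Johan"],
    "Emil": ["Tobias", "Peter"],
    "Johan": ["Tobias"],
    "Peter": ["Emil"],
    "Stranger": [],
}

print(path_exists(friends, "Tobias", "Peter"))
print(path_exists(friends, "Tobias", "Stranger"))
```

Adding a million unrelated persons to `friends` would not change the work done for the first query, since the search only ever touches neighbours of nodes it has reached.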
  • 51. On-line real time routing with Neo4j
    ๏ 20 million Nodes - represent places
    ๏ 62 million Edges - represent direct roads between places
      • These edges have a length property, for the length of the road
    ๏ Average optimal route, 100 separate roads, found in 100ms
    ๏ Worst case route we could find:
      • Optimal route is 5500 separate roads
      • Total length ~770km
      • Found in less than 3 seconds
    ๏ Uses A* “best first” search
    There’s a difference between least number of hops and least cost.
  • 52. Routing with Neo4j - using Neo4j Graph-Algos
        # The cost evaluator - for choosing the best next node
        class GeoCostEvaluator
          include EstimateEvaluator
          def getCost(node, goal)
            straight_path_distance(
              node.getProperty("lat"), node.getProperty("lon"),
              goal.getProperty("lat"), goal.getProperty("lon") )
          end
        end

        # Instantiate the A* search function
        path_finder = AStar.new( Neo4j::instance,
          RelationshipExpander.forTypes(
            DynamicRelationshipType.withName("road"), Direction::BOTH ),
          DoubleEvaluator.new("length"),
          GeoCostEvaluator.new )

        # Find the best path between New York City and San Francisco
        best_path = path_finder.findSinglePath( NYC, SF )
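
To show what the A* search does with the road lengths and the straight-line estimate, here is a self-contained Python sketch. It is not the Neo4j Graph-Algos implementation; the `a_star` function, the four places and their coordinates are invented for illustration, and the example is built so that the route with the fewest hops is not the cheapest one.

```python
import heapq
import math


def a_star(roads, coords, start, goal):
    """Return (cost, path) of the cheapest route, or None if unreachable.

    roads:  {place: [(neighbour, length), ...]}
    coords: {place: (x, y)}, used for the straight-line estimate
    """
    def estimate(place):
        (x1, y1), (x2, y2) = coords[place], coords[goal]
        return math.hypot(x1 - x2, y1 - y2)

    best_cost = {start: 0.0}
    came_from = {}
    queue = [(estimate(start), start)]
    while queue:
        _, place = heapq.heappop(queue)
        if place == goal:
            path = [goal]
            while path[-1] != start:
                path.append(came_from[path[-1]])
            return best_cost[goal], path[::-1]
        for neighbour, length in roads.get(place, []):
            cost = best_cost[place] + length
            if cost < best_cost.get(neighbour, math.inf):
                best_cost[neighbour] = cost
                came_from[neighbour] = place
                # Priority = cost so far + optimistic estimate of the rest.
                heapq.heappush(queue, (cost + estimate(neighbour), neighbour))
    return None


coords = {"A": (0, 0), "B": (1, 0), "C": (3, 0), "D": (4, 0)}
one_way = [("A", "B", 1), ("B", "C", 2), ("C", "D", 1), ("A", "D", 10)]

# Roads can be driven in both directions.
roads = {place: [] for place in coords}
for a, b, length in one_way:
    roads[a].append((b, length))
    roads[b].append((a, length))

cost, path = a_star(roads, coords, "A", "D")
print(cost, path)
```

The direct road from A to D is a single hop but has length 10, so the search prefers the three-hop route through B and C. The straight-line estimate never overestimates here because every road is at least as long as the distance between its endpoints.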
  • 53. Newest addition: Neo4j lets you REST
    ๏ Hello Neo4j REST server - Neo4j no longer needs to be embedded
    ๏ Opens up Neo4j to your favorite platform (even if that isn’t Java)
      • PHP, .NET, etc. - libraries already exist!
      • http://wiki.neo4j.org/content/Getting_Started_REST
    ๏ Uses JSON for state transfer + browsable HTML for introspection
    ๏ Atomic modification operations
    ๏ Brand new declarative traversal framework
      • Extensible using your favorite scripting language
        ‣ JavaScript is included. Jython, JRuby, etc. supported
  • 54. Other cool Graph Databases
    ๏ Sones GraphDB
      • Graph Query Language - a SQL-like query language for graphs
    ๏ Franz Inc. AllegroGraph
    ๏ HypergraphDB
    ๏ InfoGrid
    ๏ Twitter’s FlockDB
      • Optimized for the Twitter use case - one level relationships
    ๏ Interestingly we all have different approaches
  • 55. Up until recently there was only one Database, the RDBMS. The days of a single database that rules all are over.
    Image caption: One database to rule them all
    Image credits: The Lord of the Rings, New Line Cinema
  • 56. Use best suited storage for each kind of data
    The era of using RDBMSes for all problems is over. Instead we should use the database most suited for the problem at hand.
    Image credits: Unknown :’(
  • 57. Polyglot persistence
    ... we could even use multiple databases in conjunction, and let each database handle the things it does best.
    Diagram label: Document {...} {...} {...}
  • 58. Polyglot persistence
    SQL && NOSQL
    All databases are welcome! SQL and NOSQL - it is Not Only SQL!
    Diagram label: Document {...} {...} {...}
  • 59. Finding out more
    ๏ http://neo4j.org/ - project website
      ‣ http://api.neo4j.org/ and http://components.neo4j.org/
      ‣ http://wiki.neo4j.org/ - HowTos, Tutorials, Examples, FAQ, etc.
      ‣ http://planet.neo4j.org/ - aggregation of blogs about Neo4j
    ๏ http://neotechnology.com/ - commercial licensing
    ๏ http://twitter.com/neo4j/team - follow the Neo4j team
    ๏ http://nosql.mypopescu.com/ - good source for news on NOSQL, monitors Neo4j and other NOSQL solutions
    ๏ http://highscalability.com/ - has published a few articles about Neo4j
  • 60. Buzzword summary
    http://neo4j.org/
    Semi structured, SPARQL, AGPLv3, ACID transactions, Open Source, Object mapping, Gremlin, Shortest path, In-Graph indexes, NOSQL, A* routing, whiteboard friendly, RESTful, Traversal, Query language, Embedded, Beer, Schema free, Software Transactional Memory, Right tool for the right job, Scaling to complexity, Free Software, Polyglot persistence
  • 61. http://neotechnology.com