Published on

Introduction to Neo4j.rb - the Ruby wrapper of Neo4j. Also a quick introduction to Ruby.

Published in: Technology
1 Comment
No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide
  • I'm going to talk about one of the most popular graph databases - neo4j, and of course the ruby binding - neo4j.rb which I have written over the last few years.
  • Here is a photo of me playing the Brahms d-minor concerto with the swedish radio symphony. As you can see, it's an old photo. I don't do that much playing any longer but I still passionate about piano playing Another passion I have is Ruby. Why I'm I standing here talking about Neo4j.rb ? I fell in love with Ruby a few years ago and really wanted to learn it. The only way to learn is by reading/writing code. As it turns out I happened to know the people behind the Neo4j database and they suggested I would write a Ruby binding for it.
  • The left picture is actually a picture generated by the NeoEclipse – a tool for visualize the graph
  • Often you don't need to create your own search tree or index since the domain it self contains a search tree. For an recommendation algorithm it is enough to traverse only a limited number of node of a given distance from one node.
  • As you can see Neo4j.rb is like an Object Database except that it does not have a query language The two main benefits of the Graph Database are that it's natural to express the requirements in code (just like an Object Database) since often the real world is best expressed as a graph. You don't need to ruin model that the customer described by putting it in tables. The other benefit is the traversal pattern. As you saw in the previous slide it is easy to express algorithms for things like an recommendation engine with traversals. It is still possible to do this in SQL (since you only need two joins) but I think it is more natural expressed as traversals. Also the traversal speed will remain the same regardless of the depth of traversal or the number of nodes.
  • Neo4jrb

    1. 2. Neo4j.rb The Benefits of Graph Database Andreas Ronge
    2. 3. Who am I ?
    3. 4. NoSQL The problem with SQL: not designed for <ul><ul><ul><li>Accelerating growth of data
    4. 5. Huge clustered environments
    5. 6. Complex and evolving data models </li></ul></ul></ul>Choose the right tools - N ot O nly SQL
    6. 7. The CAP Theorem Can't achieve all three, pick two <ul><ul><ul><li>C onsistency – all readers will see the same write
    7. 8. A vailability – tolerant of node failures
    8. 9. P artition tolerant – if lost interconnect between nodes </li></ul></ul></ul>Many NoSQL databases choose <ul><ul><ul><li>Sacrifice consistency over availability
    9. 10. Eventual consistency instead of ACID (but not Neo4j) </li></ul></ul></ul>
    10. 12. What is Neo4j.rb ? Neo4j <ul><ul><li>A Graph Database written in Java
    11. 13. Accessible from a Java API, but also </li><ul><li>HTTP/REST interface
    12. 14. Language binding - Python, Ruby, Scala, Groovy
    13. 15. Gremlin ./outE[@label='created']/inV/inE[@label='created']/outV[g:except($_)] </li></ul></ul></ul>Neo4j.rb <ul><ul><li>A language binding for (J) Ruby </li></ul></ul>
    14. 16. A Ruby Crash Course
    15. 18. What is a Graph Database ?
    16. 19. Graph DB vs. SQL id name 1 andreas 2 peter id p1_id p2_id since 1234 1 2 2002 People Friends friend since: 2002 name: andreas name: peter
    17. 20. Graph DB vs SQL
    18. 21. Benefit 1 Domain Modeling
    19. 22. Benefit 2 No O/R mismatch
    20. 23. Benefit 3 Semi-structured information
    21. 25. Benefit 4 No Schema
    22. 26. DEMO TIME Does Thomas Andersson know someone [who knows]* called Agent Smith ? Benefit 5 Deep traversals
    23. 27. Neo4j(.rb) Features Neo4j <ul><ul><ul><li>Embedded
    24. 28. ACID Transactions
    25. 29. Lucene Integration
    26. 30. Graph Algorithms </li></ul></ul></ul>Neo4j.rb <ul><ul><ul><li>Object Oriented Mapping
    27. 31. “Drop in” replacement for Rails Active Model
    28. 32. Improved/extended API (lucene, rules, migrations,...) </li></ul></ul></ul>
    29. 33. Embedded Easier to install, deploy & test Is running in same thread as your application No network connection to DB needed No Database Tier
    30. 34. Another Ruby Crash Course Ruby Java
    31. 35. Transactions ACID <ul><ul><ul><li>a tomicity, c onsistency, i solation, d urability
    32. 36. only write locks, no read locks </li></ul></ul></ul>
    33. 37. Object Oriented Mapping Neo4j and a Dynamic Language – A Perfect Match ! Ruby Class A Neo4j Node _class: Person name: andreas
    34. 38. How do I find things ? <ul><li>Start from Reference Node
    35. 39. Graph as an Index
    36. 40. Use Lucene </li></ul>
    37. 41. Reference Node Find Thomas Andersson Neo4j.ref_node.outgoing(:root).first REF_NODE Thomas Andersson
    38. 42. Use the Graph as an Index A A B Neo4j.ref_node.outgoing('A').each {...} RefNode name andreas name anders name bertil
    39. 43. Lucene Full-featured text search engine Features <ul><ul><ul><li>Phrase queries, wildcard queries, proximity queries, range queries and more
    40. 44. Ranked searching
    41. 45. Sorting
    42. 46. Date-range
    43. 47. Sorting by any field </li></ul></ul></ul>
    44. 48. Lucene in Neo4j.rb
    45. 49. NodeMixin Lucene Integration Declared properties DSL for relationship traversal Migrations Lists Cascade Delete Works with inheritance
    46. 50. Ruby Class Mapping: Relationships friend friend name andreas name peter name david
    47. 51. Incoming Relationship Same relationship different direction Same _class = Actor name = 'keanu' _class = Movie name = 'matrix'
    48. 52. Ruby on Rails/Active Record “drop in” replacement
    49. 55. A Common Problem I have a <ul><ul><ul><li>System already in production
    50. 56. Huge database </li></ul></ul></ul>I need to <ul><ul><ul><li>Change the structure of the database </li></ul></ul></ul>Solution: <ul><ul><ul><li>Migrations </li></ul></ul></ul>
    51. 57. Migrations: Direct
    52. 58. Migrations: Lazy (Neo4j::LazyMigrationMixin)
    53. 59. Inheritance
    54. 60. Recommendation Engine Likes Likes Likes Recommend This 1 2 4 3 5 Bertil Caesar Adam
    55. 61. Example, Recommendation likes other_people 4 2 5 3 1
    56. 62. Included Graph Algorithms Shortest paths, Simple paths, Graph measures ... node1 node2
    57. 63. High Availability Online Backup - hot spare Read-slave replication Write master election
    58. 64. Neo4j – An Object DB ? Neo4j has very fast traversals <ul><ul><ul><li>Avoids loading properties </li></ul></ul></ul>No need to declare two way relationships <ul><ul><ul><li>A relationship has a start and end node </li></ul></ul></ul>Does have two ways of finding objects <ul><ul><ul><li>Traversals
    59. 65. Lucene </li></ul></ul></ul>Optimized for Graph Algorithms
    60. 66. Conclusions: Benefits Express your domain as a Graph <ul><ul><ul><li>Domain Modeling
    61. 67. No O/R mismatch
    62. 68. Efficient storage of Semi Structured Information
    63. 69. Schema Less </li></ul></ul></ul>Express Queries as Traversals <ul><ul><ul><li>Fast deep traversal instead of slow SQL queries that span many table joins </li></ul></ul></ul>
    64. 70. When NOT use Graph DB Don't have a graph related problem ? Not too much changing requirements ? Easy to organized data into: <ul><ul><ul><li>Tables, Documents or Key-Value models ? </li></ul></ul></ul>Few & well defined relationships in the domain ? Don't have SQL queries that span many table joins ? Many YES => maybe Graph DB not a good choice
    65. 71. When should I use a Graph DB ? Need to solve a graph related problem ? <ul><ul><ul><li>Recommendations, Shortest path, Social Networks </li></ul></ul></ul>Have a complex and evolving data model ? Few mandatory and many optional attributes ? Big part of domain is expressed as relationships ? Have SQL queries that span many table joins ? Many YES => maybe a Graph DB is a good choice