Application Modeling with Graph Databases


Published in: Technology, Business
  • * Goal here is to inspire further investigation * Not going to go into nuts & bolts * Docs are amazing!
  • * graph db usage poll
  • * Six degrees game * Relational databases can't easily answer certain types of questions * arbitrary path query * the basic unit of social networking
  • * Each degree adds a join * Increases complexity * Decreases performance * Stop when the actor you're looking for is in the list
  • * this problem highlights the ugly truth about RDBs * they weren't designed to handle these types of problems. * RDB relationships join data, but are not data in themselves * arbitrary path query * RDB does "query", not "path" * certainly not "arbitrary"
  • * Gather everything in the set that matches these criteria, then tell me if this thing is in the set * 1 set, no problem * 2nd set no problem * 3rd set not related to 1st * 4th not related to 2nd * 5th related to 1st and 4th * etc. * Relationships are only available between overlapping sets
  • * avoid schema lock-in * intuitive * ditch digger's dilemma
  • * Neo4j * AGPL for community * ACID compliant * High Availability mode * Embedded and REST
  • * Neo4j * AGPL for community * ACID compliant * High Availability mode * Embedded and REST * Bindings for every language
  • * graph theory * edges can be ordered or unordered pairs * vocab: - vertex -> node - edge -> relationship
  • * Tree data-structures * Networks * Maps * vehicles on streets == packets through network * social networking * manufacturing * fraud detection * supply chain
  • * Make each record a node * Make every foreign key a relationship * RDB indexes are usually stored in a tree structure * Trees are graphs * Why not use RDBs? * The trouble with RDBs is how they are stored in memory and queried * They require a translation step from memory blocks to graph structure * ORMs hide the problem, but do not solve it * Relationships are not first-class citizens * Many problem domains map poorly to rows/tables
  • The zen of graph databases
  • * Social networking - friends of friends of friends of friends * Assembly/Manufacturing - 1 widget contains 3 gadgets each contain 2 gizmos * Map directions - starting at my house find a route to the office that goes past the pub * Multi-tenancy - root node per tenant * all queries start at root * No overlap between graphs = no accidental data spillage * Fraud: track transactions back to origination * Pretty much anything that can be drawn on a whiteboard
  • * Example: retail system * Customer makes Order * Store sells Order * Order contains Items * Supplier supplied Items * Customer rates Items * Did this customer rank supplier X highly? * Which suppliers sell the highest rated items? * Does item A get rated higher when ordered with Item B? * All can be answered with RDBs as well * Not as elegant * Not as performant
  • * Actors are nodes * Movies are nodes * Relationship: Actor is IN a movie * Compare to degree selection join queries
  • * all groups user 3 is a member of directly or inherited
  • * does user 2 have permission to read the home page?
  • * does user 3 have permission to read user 1's blog?
  • * Find all locations in company
  • * For a given brand, find all locations not under that brand
  • * RDBs are really good at data aggregation * Set math, duh * Have to traverse the whole graph in order to do aggregation * Truly tabular means not a lot of relationships between the data types * Neo4j guys say: rdb will tell you the salary of everyone in the room; graph db will tell you who will buy you a beer
  • * Emil Eifrem (Neo Tech CEO) webinar * Check out 54 minute mark
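
The six degrees notes above (actors and movies as nodes, an actor being IN a movie as the relationship) describe a path search rather than a chain of joins. A minimal Python sketch of that idea follows, using breadth-first search over an in-memory graph. The cast data and the names `build_graph` and `shortest_path` are hypothetical, invented for illustration; this is only a sketch of the technique, not how Neo4j implements `shortestPath`.

```python
from collections import deque

# Hypothetical sample data: which actors appear in which movies.
# Titles and names are placeholders, not real filmographies.
CAST = {
    "Movie A": ["Actor 1", "Actor 2"],
    "Movie B": ["Actor 2", "Actor 3"],
    "Movie C": ["Actor 3", "Actor 4"],
}


def build_graph(cast):
    """Build an undirected graph in which actors and movies are both nodes."""
    graph = {}
    for movie, actors in cast.items():
        for actor in actors:
            graph.setdefault(movie, set()).add(actor)
            graph.setdefault(actor, set()).add(movie)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search: return the node list from start to goal, or None."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}  # node -> node it was reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the 'previous' links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


graph = build_graph(CAST)
path = shortest_path(graph, "Actor 1", "Actor 4")
```

Each degree of separation is two hops in this model (actor to movie to actor), so the path found for the sample data spans three degrees. The search stops as soon as the target actor is reached instead of materialising every intermediate set.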
  • Application Modeling with Graph Databases

    1. Application Modeling with Graph Databases
    2. @josh_adell
       • Software developer: PHP, JavaScript, SQL
    3. The Problem
    4. The Solution?
       > -- First degree
       > SELECT actor_name FROM cast WHERE movie_title IN
       >   (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon')
       > -- Second degree
       > SELECT actor_name FROM cast WHERE movie_title IN
       >   (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
       >     (SELECT actor_name FROM cast WHERE movie_title IN
       >       (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon')))
       > -- Third degree
       > SELECT actor_name FROM cast WHERE movie_title IN
       >   (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
       >     (SELECT actor_name FROM cast WHERE movie_title IN
       >       (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
       >         (SELECT actor_name FROM cast WHERE movie_title IN
       >           (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon')))))
    5. The Truth: Relational databases aren't very good with relationships (diagram labels: Data, RDBMs)
    6. RDBs Use Set Math
    7. Try again?
    8. Right Tool for the Job =
    9. Warning: Computer Science Ahead
       A graph is an ordered pair G = (V, E), where V is a set of vertices and E is a set of edges, which are pairs of vertices in V.
    10. Graphs are Everywhere
    11. Relational Databases are Graphs!
    12. Everything is connected
    13. Some Graph Use Cases
        • Social networking
        • Manufacturing
        • Map directions
        • Geo-spatial algorithms
        • Fraud detection
        • Multi-tenancy
        • Dependency mapping
        • Bioinformatics
        • Natural language processing
    14. Graphs are "Whiteboard-Friendly": Nouns => nodes, Verbs => relationships
    15. Back to Bacon
        START s=node:actors(name="Keanu Reeves"), e=node:actors(name="Kevin Bacon")
        MATCH p = shortestPath( s-[*]-e )
        RETURN p, length(p)
    16. ACL
        • Users can belong to groups
        • Groups can belong to groups
        • Groups and users have permissions on objects
          o read
          o write
          o denied
    17. START u=node:users(name="User 3")
        MATCH u-[:belongs_to*]->g
        RETURN g
    18. START u=node:users(name="User 2"), o=node:objects(name="Home")
        MATCH u-[:belongs_to*0..]->g, g-[:can_read]->o
        RETURN g
    19. START u=node:users(name="User 3"), o=node:objects(name="User 1's Blog")
        MATCH u-[:belongs_to*0..]->g, g-[:can_read]->o, u-[d?:denied*]->o
        WHERE d IS NULL
        RETURN g
    20. Real Life Example
        • Companies have brands, locations, location groups
        • Brands have locations, location groups
        • Location groups have locations
    21. START c=node:companies(name="Company 1")
        MATCH c-[:HAS*]->l
        WHERE l.type = 'location'
        RETURN l ORDER BY
    22. START b=node:brands(name="Brand 1")
        MATCH b<-[:HAS*]-c-[:HAS*]->l<-[h?:HAS*]-b
        WHERE h IS NULL AND l.type = 'location'
        RETURN l ORDER BY
    23. Tweet @chicken_tech we should be using graph dbs!
    24. But Wait... There's More!
        • Mutating Cypher (insert, update)
        • Indexing (auto, full-text, spatial)
        • Batches and Transactions
        • Embedded (for JVM) or REST
    25. Wherefore art thou, RDB?
        • Aggregation
        • Ordered data
        • Truly tabular data
        • Few or clearly defined relationships
    26. Questions?
    27. Resources
        • @josh_adell
        • Google+, Facebook, LinkedIn
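
The ACL slides (16 to 19) rely on variable-length `belongs_to` traversals to collect inherited groups and then check `can_read` and `denied` relationships. The Python sketch below mirrors that logic on plain dictionaries. All memberships and permissions here are invented for illustration, and `groups_of` and `can_read` are hypothetical helper names; the deck does not show the underlying data, so treat this as one possible reading of the queries rather than the author's implementation.

```python
# Hypothetical ACL data. The names echo the slides, but the memberships
# and permissions are assumptions made up for this sketch.
BELONGS_TO = {
    "User 1": ["Group A"],
    "User 2": ["Group B"],
    "User 3": ["Group B"],
    "Group B": ["Group A"],  # groups can belong to groups
}
CAN_READ = {
    "Group A": {"Home"},
    "Group B": {"User 1's Blog"},
    "User 1": {"User 1's Blog"},
}
DENIED = {
    "User 3": {"User 1's Blog"},
}


def groups_of(member):
    """All groups reachable via belongs_to, directly or inherited."""
    seen = []
    stack = list(BELONGS_TO.get(member, []))
    while stack:
        group = stack.pop()
        if group not in seen:  # also guards against membership cycles
            seen.append(group)
            stack.extend(BELONGS_TO.get(group, []))
    return seen


def can_read(user, obj):
    """True if the user, or any inherited group, can read obj and the user is not denied."""
    if obj in DENIED.get(user, set()):
        return False
    # The user itself counts, like the zero-length case of belongs_to*0..
    principals = [user] + groups_of(user)
    return any(obj in CAN_READ.get(p, set()) for p in principals)


user3_groups = groups_of("User 3")
user2_home = can_read("User 2", "Home")
user3_blog = can_read("User 3", "User 1's Blog")
```

With this sample data, User 2 reaches Home through Group B's membership in Group A, while User 3 is refused User 1's Blog by the explicit denial even though Group B grants read access. Checking the denial first is a design choice of the sketch; it treats `denied` as overriding any inherited grant.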