Application Modeling with Graph Databases

  • * Goal here is to inspire further investigation * Not going to go into nuts & bolts * Docs are amazing!
  • * graph db usage poll
  • * Six degrees game * Relational databases can't easily answer certain types of questions * arbitrary path query * the basic unit of social networking
  • * Each degree adds a join * Increases complexity * Decreases performance * Stop when the actor you're looking for is in the list
  • * this problem highlights the ugly truth about RDBs * they weren't designed to handle these types of problems. * RDB relationships join data, but are not data in themselves * arbitrary path query * RDB does "query", not "path" * certainly not "arbitrary"
  • * Gather everything in the set that matches these criteria, then tell me if this thing is in the set * 1 set, no problem * 2nd set no problem * 3rd set not related to 1st * 4th not related to 2nd * 5th related to 1st and 4th * etc. * Relationships are only available between overlapping sets
  • * avoid schema lock-in * intuitive * ditch digger's dilemma
  • * Neo4j * AGPL for community * ACID compliant * High Availability mode * Embedded and REST * Bindings for every language
  • * graph theory * edges can be ordered or unordered pairs * vocab: - vertex -> node - edge -> relationship
  • * Tree data-structures * Networks * Maps * vehicles on streets == packets through network * social networking * manufacturing * fraud detection * supply chain
  • * Make each record a node * Make every foreign key a relationship * RDB indexes are usually stored in a tree structure * Trees are graphs * Why not use RDBs? * The trouble with RDBs is how they are stored in memory and queried   * Require a translation step from memory blocks to graph structure * ORMs hide the problem, but do not solve it * Relationships not first-class citizens * Many problem domains map poorly to rows/tables
  • The zen of graph databases
  • * Social networking - friends of friends of friends of friends * Assembly/Manufacturing - 1 widget contains 3 gadgets each contain 2 gizmos * Map directions - starting at my house find a route to the office that goes past the pub * Multi-tenancy - root node per tenant * all queries start at root * No overlap between graphs = no accidental data spillage * Fraud: track transactions back to origination * Pretty much anything that can be drawn on a whiteboard
  • * Example: retail system * Customer makes Order * Store sells Order * Order contains Items * Supplier supplied Items * Customer rates Items * Did this customer rank supplier X highly? * Which suppliers sell the highest rated items? * Does item A get rated higher when ordered with Item B? * All can be answered with RDBs as well * Not as elegant * Not as performant
  • * Actors are nodes * Movies are nodes * Relationship: Actor is IN a movie * Compare to degree selection join queries
  • * all groups user 3 is a member of directly or inherited
  • * does user 2 have permission to read the home page?
  • * does user 3 have permission to read user 1's blog?
  • * Find all locations in company
  • * For a given brand, find all locations not under that brand
  • * RDBs are really good at data aggregation * Set math, duh * Have to traverse the whole graph in order to do aggregation * Truly tabular means not a lot of relationships between the data types * Neo4j guys say: rdb will tell you the salary of everyone in the room; graph db will tell you who will buy you a beer
  • * Emil Eifrem (Neo Tech CEO) webinar * Check out 54 minute mark

Application Modeling with Graph Databases Presentation Transcript

  • 1. Application Modeling with Graph Databases http://joind.in/6694
  • 2. @josh_adell
      • Software developer: PHP, JavaScript, SQL
      • http://www.dunnwell.com
      • http://blog.everymansoftware.com
      • http://github.com/jadell/neo4jphp
      • http://frostymug.herokuapp.com
  • 3. The Problem
  • 4. The Solution?
      -- First degree
      SELECT actor_name FROM cast WHERE movie_title IN
        (SELECT DISTINCT movie_title FROM cast WHERE actor_name = 'Kevin Bacon');
      -- Second degree
      SELECT actor_name FROM cast WHERE movie_title IN
        (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
          (SELECT actor_name FROM cast WHERE movie_title IN
            (SELECT DISTINCT movie_title FROM cast WHERE actor_name = 'Kevin Bacon')));
      -- Third degree
      SELECT actor_name FROM cast WHERE movie_title IN
        (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
          (SELECT actor_name FROM cast WHERE movie_title IN
            (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
              (SELECT actor_name FROM cast WHERE movie_title IN
                (SELECT DISTINCT movie_title FROM cast WHERE actor_name = 'Kevin Bacon')))));
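The pattern on this slide, where every extra degree wraps the previous query in two more subqueries, can be generated mechanically. The following is a minimal sketch of that idea using Python's built-in sqlite3 module. The sample rows, `make_db`, `degree_query`, and `within_degrees` are made up for illustration and are not part of the talk.

```python
import sqlite3

# Hypothetical sample data: (actor_name, movie_title) rows for a tiny cast table.
SAMPLE_CAST = [
    ("Actor A", "Movie 1"), ("Actor B", "Movie 1"),
    ("Actor B", "Movie 2"), ("Actor C", "Movie 2"),
    ("Actor C", "Movie 3"), ("Actor D", "Movie 3"),
]


def make_db(rows=SAMPLE_CAST):
    """Create an in-memory database holding the sample cast table."""
    conn = sqlite3.connect(":memory:")
    # The table name is quoted because CAST is also an SQL keyword.
    conn.execute('CREATE TABLE "cast" (actor_name TEXT, movie_title TEXT)')
    conn.executemany('INSERT INTO "cast" VALUES (?, ?)', rows)
    return conn


def degree_query(degree):
    """Build the nested-subquery SQL for actors within `degree` hops."""
    if degree < 1:
        raise ValueError("degree must be at least 1")
    actors = "SELECT ?"  # innermost set: the starting actor (bound parameter)
    for _ in range(degree):
        # Each extra degree wraps the previous set in two more subqueries.
        movies = ('SELECT DISTINCT movie_title FROM "cast" '
                  f"WHERE actor_name IN ({actors})")
        actors = ('SELECT DISTINCT actor_name FROM "cast" '
                  f"WHERE movie_title IN ({movies})")
    return actors


def within_degrees(conn, start, degree):
    """Return the set of actors reachable from `start` in at most `degree` hops."""
    rows = conn.execute(degree_query(degree), (start,)).fetchall()
    return {name for (name,) in rows}


if __name__ == "__main__":
    conn = make_db()
    for n in (1, 2, 3):
        print(n, sorted(within_degrees(conn, "Actor A", n)))
```

The query text grows with every degree, and the caller still has to decide in advance how many degrees to try, which is the limitation the next slides describe.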
  • 5. The Truth: Relational databases aren't very good with relationships [diagram labels: Data, RDBMs]
  • 6. RDBs Use Set Math
  • 7. Try again?
  • 8. Right Tool for the Job =
  • 9. Warning: Computer Science Ahead. A graph is an ordered pair G = (V, E) where V is a set of vertices and E is a set of edges, which are pairs of vertices in V.
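As a rough illustration of the definition G = (V, E), the sketch below stores a graph as a set of vertices plus a set of vertex pairs and derives an adjacency map from them. The vertex names and the `adjacency` helper are invented for this example. The `directed` flag switches between treating edges as ordered or unordered pairs.

```python
# V: a set of vertices. E: a set of edges, each a pair of vertices in V.
# Both are made-up sample values.
V = {"a", "b", "c", "d"}
E = {("a", "b"), ("b", "c"), ("c", "d")}


def adjacency(vertices, edges, directed=False):
    """Map each vertex to the set of vertices it is connected to."""
    neighbours = {v: set() for v in vertices}
    for source, target in edges:
        if source not in neighbours or target not in neighbours:
            raise ValueError(f"edge ({source!r}, {target!r}) leaves the vertex set")
        neighbours[source].add(target)
        if not directed:
            # An unordered pair connects both ends to each other.
            neighbours[target].add(source)
    return neighbours


if __name__ == "__main__":
    print(adjacency(V, E))
    print(adjacency(V, E, directed=True))
```

In graph database vocabulary, each vertex becomes a node and each edge becomes a relationship.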
  • 10. Graphs are Everywhere
  • 11. Relational Databases are Graphs!
  • 12. Everything is connected
  • 13. Some Graph Use Cases
      • Social networking
      • Manufacturing
      • Map directions
      • Geo-spatial algorithms
      • Fraud detection
      • Multi-tenancy
      • Dependency mapping
      • Bioinformatics
      • Natural language processing
  • 14. Graphs are "Whiteboard-Friendly" Nouns => nodes, Verbs => relationships
  • 15. Back to Bacon
      START s=node:actors(name="Keanu Reeves"),
            e=node:actors(name="Kevin Bacon")
      MATCH p = shortestPath( s-[*]-e )
      RETURN p, length(p)
      http://tinyurl.com/c65d99w
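The shortest-path query above asks for the fewest relationships linking two actor nodes, however many hops that takes. One common way to answer that kind of question is a breadth-first search. The sketch below is not how Neo4j implements `shortestPath`; it is a plain Python illustration over made-up actor and movie names, and `build_graph` and `shortest_path` are hypothetical helpers.

```python
from collections import deque

# Hypothetical "actor is IN a movie" relationships.
ACTED_IN = [
    ("Actor A", "Movie 1"), ("Actor B", "Movie 1"),
    ("Actor B", "Movie 2"), ("Actor C", "Movie 2"),
    ("Actor C", "Movie 3"), ("Actor D", "Movie 3"),
]


def build_graph(pairs):
    """Treat actors and movies as nodes, and each pair as an undirected edge."""
    graph = {}
    for actor, movie in pairs:
        graph.setdefault(actor, set()).add(movie)
        graph.setdefault(movie, set()).add(actor)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


if __name__ == "__main__":
    path = shortest_path(build_graph(ACTED_IN), "Actor A", "Actor D")
    # Like length(p) in the query, count relationships rather than nodes.
    print(path, len(path) - 1)
```

The search has no fixed number of degrees built in: it simply stops when the target is reached or the graph is exhausted.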
  • 16. ACL
      • Users can belong to groups
      • Groups can belong to groups
      • Groups and users have permissions on objects
          o read
          o write
          o denied
  • 17. START u=node:users(name="User 3")
      MATCH u-[:belongs_to*]->g
      RETURN g
      http://tinyurl.com/cyn3rkx
  • 18. START u=node:users(name="User 2"),
            o=node:objects(name="Home")
      MATCH u-[:belongs_to*0..]->g, g-[:can_read]->o
      RETURN g
      http://tinyurl.com/dx7onro
  • 19. START u=node:users(name="User 3"),
            o=node:objects(name="User 1's Blog")
      MATCH u-[:belongs_to*0..]->g, g-[:can_read]->o, u-[d?:denied*]->o
      WHERE d is null
      RETURN g
      http://tinyurl.com/bwtyhvt
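The three ACL queries share one idea: follow `belongs_to` relationships any number of hops, then look for a `can_read` relationship, and drop the result if the user is explicitly denied. A minimal Python sketch of that traversal follows. The users, groups, and objects are invented sample data rather than the graph from the talk, and, as in the last query, the denial is assumed to hang directly off the user.

```python
# Hypothetical ACL graph: member -> groups, principal -> readable/denied objects.
BELONGS_TO = {
    "alice": {"editors"},
    "carol": {"editors"},
    "bob": {"guests"},
    "editors": {"staff"},
    "staff": {"everyone"},
    "guests": {"everyone"},
}
CAN_READ = {"everyone": {"home"}, "staff": {"wiki"}}
DENIED = {"alice": {"wiki"}}


def groups_of(member, belongs_to=BELONGS_TO):
    """All groups reachable over one or more belongs_to hops (direct or inherited)."""
    seen, stack = set(), [member]
    while stack:
        current = stack.pop()
        for group in belongs_to.get(current, ()):
            if group not in seen:  # guards against membership cycles
                seen.add(group)
                stack.append(group)
    return seen


def granting_principals(user, obj, belongs_to=BELONGS_TO,
                        can_read=CAN_READ, denied=DENIED):
    """Principals that let `user` read `obj`; empty if none do or the user is denied."""
    if obj in denied.get(user, ()):
        return set()
    # Zero or more hops: the user counts as well as every inherited group.
    principals = {user} | groups_of(user, belongs_to)
    return {p for p in principals if obj in can_read.get(p, ())}


if __name__ == "__main__":
    print(sorted(groups_of("alice")))
    print(granting_principals("bob", "home"))
    print(granting_principals("alice", "wiki"))
```

An empty result means no read access; a non-empty result names the groups (or the user) that grant it, mirroring the `RETURN g` in the queries.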
  • 20. Real Life Example
      • Companies have brands, locations, location groups
      • Brands have locations, location groups
      • Location groups have locations
  • 21. START c=node:companies(name="Company 1")
      MATCH c-[:HAS*]->l
      WHERE l.type = 'location'
      RETURN l ORDER BY l.name
      http://tinyurl.com/cxm4heh
  • 22. START b=node:brands(name="Brand 1")
      MATCH b<-[:HAS*]-c-[:HAS*]->l<-[h?:HAS*]-b
      WHERE h IS NULL AND l.type = 'location'
      RETURN l ORDER BY l.name
      http://tinyurl.com/cl537w6
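The last query can be read as a set difference: take every location reachable over `HAS` relationships from whatever owns the brand, then remove the locations reachable from the brand itself. The sketch below illustrates that reading in Python. The hierarchy, the `TYPES` map, and the helper names are made up for the example and are not the data behind the talk.

```python
# Hypothetical HAS hierarchy: parent -> children, plus a type for each node.
HAS = {
    "Company 1": {"Brand 1", "Brand 2", "Group 2", "Location A"},
    "Brand 1": {"Location B", "Group 1"},
    "Group 1": {"Location C"},
    "Brand 2": {"Location D"},
    "Group 2": {"Location E"},
}
TYPES = {
    "Company 1": "company", "Brand 1": "brand", "Brand 2": "brand",
    "Group 1": "location group", "Group 2": "location group",
    "Location A": "location", "Location B": "location", "Location C": "location",
    "Location D": "location", "Location E": "location",
}


def reachable(edges, start):
    """Every node reachable from `start` over one or more edges."""
    seen, stack = set(), [start]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def invert(edges):
    """Flip parent -> children into child -> parents."""
    flipped = {}
    for parent, children in edges.items():
        for child in children:
            flipped.setdefault(child, set()).add(parent)
    return flipped


def locations_not_under(brand, has=HAS, types=TYPES):
    """Locations owned by the brand's ancestors but not reachable from the brand."""
    under_brand = reachable(has, brand)
    candidates = set()
    for owner in reachable(invert(has), brand):  # b<-[:HAS*]-c
        candidates |= reachable(has, owner)      # c-[:HAS*]->l
    return sorted(n for n in candidates - under_brand
                  if types.get(n) == "location")


if __name__ == "__main__":
    print(locations_not_under("Brand 1"))
```

Filtering on the node type at the end plays the role of the `l.type = 'location'` condition, and `sorted` stands in for `ORDER BY l.name`.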
  • 23. Tweet @chicken_tech: we should be using graph dbs!
  • 24. But Wait... There's More!
      • Mutating Cypher (insert, update)
      • Indexing (auto, full-text, spatial)
      • Batches and Transactions
      • Embedded (for JVM) or REST
  • 25. Wherefore art thou, RDB?
      • Aggregation
      • Ordered data
      • Truly tabular data
      • Few or clearly defined relationships
  • 26. Questions?
  • 27. Resources
      • http://joind.in/6694
      • http://neo4j.org
      • http://docs.neo4j.org
      • http://www.youtube.com/watch?v=UodTzseLh04
      • http://github.com/jadell/neo4jphp
      • http://joshadell.com
      • josh.adell@gmail.com
      • @josh_adell
      • Google+, Facebook, LinkedIn