# Application Modeling with Graph Databases

### Application Modeling with Graph Databases

1. Application Modeling with Graph Databases http://joind.in/6694
2. @josh_adell • Software developer: PHP, JavaScript, SQL • http://www.dunnwell.com • http://blog.everymansoftware.com • http://github.com/jadell/neo4jphp • http://frostymug.herokuapp.com
3. The Problem
4. The Solution? > -- First degree > SELECT actor_name FROM cast WHERE movie_title IN (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon') > -- Second degree > SELECT actor_name FROM cast WHERE movie_title IN (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN (SELECT actor_name FROM cast WHERE movie_title IN (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon'))) > -- Third degree > SELECT actor_name FROM cast WHERE movie_title IN (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN (SELECT actor_name FROM cast WHERE movie_title IN (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN (SELECT actor_name FROM cast WHERE movie_title IN (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon')))))
5. The Truth: Relational databases aren't very good with relationships [diagram labels: Data, RDBMs]
6. RDBs Use Set Math
7. Try again?
8. Right Tool for the Job =
9. Warning: Computer Science Ahead A graph is an ordered pair G = (V, E) where V is a set of vertices and E is a set of edges, which are pairs of vertices in V.
10. Graphs are Everywhere
11. Relational Databases are Graphs!
12. Everything is connected
13. Some Graph Use Cases • Social networking • Manufacturing • Map directions • Geo-spatial algorithms • Fraud detection • Multi-tenancy • Dependency mapping • Bioinformatics • Natural language processing
14. Graphs are "Whiteboard-Friendly" Nouns => nodes, Verbs => relationships
15. Back to Bacon START s=node:actors(name="Keanu Reeves"), e=node:actors(name="Kevin Bacon") MATCH p = shortestPath( s-[*]-e ) RETURN p, length(p) http://tinyurl.com/c65d99w
16. ACL • Users can belong to groups • Groups can belong to groups • Groups and users have permissions on objects o read o write o denied
17. START u=node:users(name="User 3") MATCH u-[:belongs_to*]->g RETURN g http://tinyurl.com/cyn3rkx
18. START u=node:users(name="User 2"), o=node:objects(name="Home") MATCH u-[:belongs_to*0..]->g, g-[:can_read]->o RETURN g http://tinyurl.com/dx7onro
19. START u=node:users(name="User 3"), o=node:objects(name="Users 1 Blog") MATCH u-[:belongs_to*0..]->g, g-[:can_read]->o, u-[d?:denied*]->o WHERE d is null RETURN g http://tinyurl.com/bwtyhvt
20. Real Life Example • Companies have brands, locations, location groups • Brands have locations, location groups • Location groups have locations
21. START c=node:companies(name="Company 1") MATCH c-[:HAS*]->l WHERE l.type = 'location' RETURN l ORDER BY l.name http://tinyurl.com/cxm4heh
22. START b=node:brands(name="Brand 1") MATCH b<-[:HAS*]-c-[:HAS*]->l<-[h?:HAS*]-b WHERE h IS NULL AND l.type='location' RETURN l ORDER BY l.name http://tinyurl.com/cl537w6
23. Tweet @chicken_tech we should be using graph dbs!
24. But Wait...There's More! • Mutating Cypher (insert, update) • Indexing (auto, full-text, spatial) • Batches and Transactions • Embedded (for JVM) or REST
25. Wherefore art thou, RDB? • Aggregation • Ordered data • Truly tabular data • Few or clearly defined relationships
26. Questions?
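
The graph definition on slide 9 and the `shortestPath` query on slide 15 can be sketched in plain Python. This is only an illustration of the idea, not how Neo4j evaluates the query: the `CAST` data, the `build_graph` and `shortest_path` names, and the "Actor A" / "Movie 1" placeholders are all assumptions made up for the example. Actors and movies are both nodes, each (actor, movie) pair is an undirected edge, and a breadth-first search finds one shortest path.

```python
from collections import deque

# Hypothetical sample data: (actor, movie) pairs, shaped like the slides' cast table.
CAST = [
    ("Actor A", "Movie 1"),
    ("Actor B", "Movie 1"),
    ("Actor B", "Movie 2"),
    ("Actor C", "Movie 2"),
    ("Actor D", "Movie 3"),
]


def build_graph(cast):
    """Build G = (V, E) as an adjacency map.

    Every actor and every movie is a vertex; every (actor, movie)
    pair is an undirected edge between the two.
    """
    adjacency = {}
    for actor, movie in cast:
        adjacency.setdefault(actor, set()).add(movie)
        adjacency.setdefault(movie, set()).add(actor)
    return adjacency


def shortest_path(adjacency, start, end):
    """Breadth-first search; returns the nodes on one shortest path, or None."""
    if start not in adjacency or end not in adjacency:
        return None
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(adjacency[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


graph = build_graph(CAST)
path = shortest_path(graph, "Actor A", "Actor C")
print(path, len(path) - 1)  # the path and its length in edges
```

Unlike the nested SQL subqueries, the search does not need to know in advance how many degrees separate the two actors; it stops as soon as the target is reached, or returns `None` when no path exists.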

### Editor's Notes

• * Goal here is to inspire further investigation * Not going to go into nuts & bolts * Docs are amazing!
• * graph db usage poll
• * Six degrees game * Relational databases can't easily answer certain types of questions * arbitrary path query * the basic unit of social networking
• * Each degree adds a join * Increases complexity * Decreases performance * Stop when the actor you're looking for is in the list
• * this problem highlights the ugly truth about RDBs * they weren't designed to handle these types of problems. * RDB relationships join data, but are not data in themselves * arbitrary path query * RDB does "query", not "path" * certainly not "arbitrary"
• * Gather everything in the set that matches these criteria, then tell me if this thing is in the set * 1 set, no problem * 2nd set no problem * 3rd set not related to 1st * 4th not related to 2nd * 5th related to 1st and 4th * etc. * Relationships are only available between overlapping sets
• * avoid schema lock-in * intuitive * ditch digger's dilemma
• * Neo4j * AGPL for community * ACID compliant * High Availability mode * Embedded and REST
• * Neo4j * AGPL for community * ACID compliant * High Availability mode * Embedded and REST * Bindings for every language
• * graph theory * edges can be ordered or unordered pairs * vocab: - vertex -> node - edge -> relationship
• * Tree data-structures * Networks * Maps * vehicles on streets == packets through network * social networking * manufacturing * fraud detection * supply chain
• * Make each record a node * Make every foreign key a relationship * RDB indexes are usually stored in a tree structure * Trees are graphs * Why not use RDBs? * The trouble with RDBs is how they are stored in memory and queried * Require a translation step from memory blocks to graph structure * ORMs hide the problem, but do not solve it * Relationships not first-class citizens * Many problem domains map poorly to rows/tables
• The zen of graph databases
• * Social networking - friends of friends of friends of friends * Assembly/Manufacturing - 1 widget contains 3 gadgets each contain 2 gizmos * Map directions - starting at my house find a route to the office that goes past the pub * Multi-tenancy - root node per tenant * all queries start at root * No overlap between graphs = no accidental data spillage * Fraud: track transactions back to origination * Pretty much anything that can be drawn on a whiteboard
• * Example: retail system * Customer makes Order * Store sells Order * Order contains Items * Supplier supplied Items * Customer rates Items * Did this customer rank supplier X highly? * Which suppliers sell the highest rated items? * Does item A get rated higher when ordered with Item B? * All can be answered with RDBs as well * Not as elegant * Not as performant
• * Actors are nodes * Movies are nodes * Relationship: Actor is IN a movie * Compare to degree selection join queries
• * all groups user 3 is a member of directly or inherited
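
The ACL traversals on slides 16 to 19 (direct or inherited group membership, then a read check that honours a denial) can be sketched in plain Python as follows. This is a rough illustration under stated assumptions, not the deck's implementation: the `BELONGS_TO`, `CAN_READ` and `DENIED` maps, the `groups_of` and `can_read` helpers, and all user, group and object names are hypothetical. The denial check covers only a denial attached directly to the user.

```python
# Hypothetical sample data; names are illustrative only.
BELONGS_TO = {
    "User 1": {"Group A"},
    "User 2": {"Group B"},
    "Group B": {"Group A"},  # groups can belong to groups
}
CAN_READ = {
    "Group A": {"Home"},
    "User 1": {"Blog"},
}
DENIED = {
    "User 2": {"Home"},
}


def groups_of(member):
    """All groups reached by following belongs_to one or more times
    (the idea behind `u-[:belongs_to*]->g`)."""
    found = set()
    stack = [member]
    while stack:
        current = stack.pop()
        for group in BELONGS_TO.get(current, ()):
            if group not in found:  # guards against membership cycles
                found.add(group)
                stack.append(group)
    return found


def can_read(user, obj):
    """True if the user, or any group it belongs to directly or through
    inheritance, can read obj and the user is not denied on obj."""
    if obj in DENIED.get(user, ()):
        return False
    # Including the user itself mirrors the zero-length case of `belongs_to*0..`.
    candidates = {user} | groups_of(user)
    return any(obj in CAN_READ.get(candidate, ()) for candidate in candidates)


print(sorted(groups_of("User 2")))
print(can_read("User 1", "Home"), can_read("User 2", "Home"))
```

The depth of the group hierarchy never appears in the code, which is the same property the variable-length `*` pattern gives the Cypher queries.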