• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
Neo4j - The Benefits of Graph Databases (OSCON 2009)
 

Neo4j - The Benefits of Graph Databases (OSCON 2009)

on

  • 36,742 views

Neo4j -- The Benefits of Graph Databases, presentation given at OSCON 2009.

Neo4j -- The Benefits of Graph Databases, presentation given at OSCON 2009.

Statistics

Views

Total Views
36,742
Views on SlideShare
31,060
Embed Views
5,682

Actions

Likes
75
Downloads
0
Comments
2

59 Embeds 5,682

http://www.oscon.com 3028
http://wiki.github.com 814
http://en.oreilly.com 556
https://github.com 369
http://muhammadhamed.blogspot.com 315
http://www.slideshare.net 154
http://muhammadhamed.blogspot.in 71
http://kms.sec.samsung.net 40
http://www.linkedin.com 36
http://muhammadhamed.blogspot.co.uk 28
http://muhammadhamed.blogspot.fr 27
http://muhammadhamed.blogspot.com.au 22
http://eventifier.co 18
http://muhammadhamed.blogspot.kr 18
http://muhammadhamed.blogspot.de 16
http://muhammadhamed.blogspot.sg 13
http://muhammadhamed.blogspot.nl 13
http://muhammadhamed.blogspot.com.es 11
http://muhammadhamed.blogspot.com.br 9
http://translate.googleusercontent.com 9
http://muhammadhamed.blogspot.gr 9
http://muhammadhamed.blogspot.se 8
http://muhammadhamed.blogspot.ch 7
http://muhammadhamed.blogspot.it 7
https://www.linkedin.com 7
http://muhammadhamed.blogspot.ca 7
http://blog.ceyloneye.com 7
http://muhammadhamed.blogspot.ie 5
http://www.feedspot.com 5
http://muhammadhamed.blogspot.com.tr 4
http://muhammadhamed.blogspot.co.il 4
http://www.e-presentations.us 4
https://twitter.com 3
http://muhammadhamed.blogspot.no 3
http://awakashasl.wordpress.com 3
http://blog.awakasha.com 2
https://si0.twimg.com 2
http://www.docshut.com 2
http://muhammadhamed.blogspot.mx 2
http://muhammadhamed.blogspot.hk 2
http://muhammadhamed.blogspot.cz 2
http://muhammadhamed.blogspot.be 2
http://muhammadhamed.blogspot.tw 2
http://muhammadhamed.blogspot.ae 1
http://muhammadhamed.blogspot.pt 1
http://66.249.89.132 1
http://muhammadhamed.blogspot.ro 1
http://muhammadhamed.blogspot.com.ar 1
http://www.docseek.net 1
http://muhammadhamed.blogspot.co.at 1
More...

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel

12 of 2 previous next

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
  • Seems that this would be useful in certain cases (social networking, expert systems). Trying to do the normal stuff (build a blog, make a game, create a photo organization system, looking things up by some key...) would be difficult. It seems more about what links to what rather than finding things that match properties.

    This is ok, but your website should start out with the pitch that this is useful for X, and not useful for Y. Right now, it is all over the place. You need to state why it would be useful to the visitor, and what other projects have used it.
    Are you sure you want to
    Your message goes here
    Processing…
  • For more info, feel free to check our site: http://neo4j.org

    There's also a video of an earlier version of this talk (95% same content) on InfoQ: http://www.infoq.com/presentations/emil-eifrem-neo4j

    --
    Emil Eifrem
    http://neo4j.org
    http://twitter.com/emileifrem
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Neo4j - The Benefits of Graph Databases (OSCON 2009) Neo4j - The Benefits of Graph Databases (OSCON 2009) Presentation Transcript

    • No SQL? Image credit: http://browsertoolkit.com/fault-tolerance.png
    • Neo4j the benefits of graph databases Emil Eifrem CEO, Neo Technology
    • Death? Community experimentation: CouchDB SimpleDB Hypertable Cassandra Scalaris ...
    • ?
    • Trend 1: data is getting more connected Giant Global Information connectivity Graph (GGG) Ontologies RDF Folksonomies Tagging User- Wikis generated content Blogs RSS Hypertext Text documents web 1.0 web 2.0 “web 3.0” 1990 2000 2010 2020
    • Trend 2: ... and more semi-structured Individualization of content! In the salary lists of the 1970s, all elements had exactly one job In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15? Trend accelerated by the decentralization of content generation that is the hallmark of the age of participation (“web 2.0”)
    • Relational database Salary List Performance Majority of Webapps Social network Semantic } Trading custom Information complexity
    • We = hackers! So that’s vCPU... what about vhackers?
    • Whiteboard friendly? ? owns Björn Big Car build transport , Kids & DayCare Veggies
    • Alternative? a graph database
    • The Graph DB model: representation Core abstractions: name = “Emil” age = 29 Nodes sex = “yes” Relationships between nodes 1 2 Properties on both type = KNOWS time = 4 years 3 type = car vendor = “SAAB” model = “95 Aero”
    • Example: The Matrix name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY 1 KN O 7 3 WS 13 S KN name = “Cypher” KNOW OW last name = “Reagan” S name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity”
    • Code (1): Building a node space NeoService neo = ... // Get factory // Create Thomas 'Neo' Anderson Node mrAnderson = neo.createNode(); mrAnderson.setProperty( "name", "Thomas Anderson" ); mrAnderson.setProperty( "age", 29 ); // Create Morpheus Node morpheus = neo.createNode(); morpheus.setProperty( "name", "Morpheus" ); morpheus.setProperty( "rank", "Captain" ); morpheus.setProperty( "occupation", "Total bad ass" ); // Create a relationship representing that they know each other mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS ); // ...create Trinity, Cypher, Agent Smith, Architect similarly
    • Code (1): Building a node space NeoService neo = ... // Get factory Transaction tx = neo.beginTx(); // Create Thomas 'Neo' Anderson Node mrAnderson = neo.createNode(); mrAnderson.setProperty( "name", "Thomas Anderson" ); mrAnderson.setProperty( "age", 29 ); // Create Morpheus Node morpheus = neo.createNode(); morpheus.setProperty( "name", "Morpheus" ); morpheus.setProperty( "rank", "Captain" ); morpheus.setProperty( "occupation", "Total bad ass" ); // Create a relationship representing that they know each other mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS ); // ...create Trinity, Cypher, Agent Smith, Architect similarly tx.commit();
    • Code (1b): Defining RelationshipTypes // In package org.neo4j.api.core public interface RelationshipType { String name(); } // In package org.yourdomain.yourapp // Example on how to roll dynamic RelationshipTypes class MyDynamicRelType implements RelationshipType { private final String name; MyDynamicRelType( String name ){ this.name = name; } public String name() { return this.name; } } // Example on how to kick it, static-RelationshipType-like enum MyStaticRelTypes implements RelationshipType { KNOWS, WORKS_FOR, }
    • The Graph DB model: traversal Traverser framework for name = “Emil” high-performance traversing age = 29 sex = “yes” across the node space 1 2 type = KNOWS time = 4 years 3 type = car vendor = “SAAB” model = “95 Aero”
    • Example: Mr Anderson’s friends name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY 1 KN O 7 3 WS 13 S KN name = “Cypher” KNOW OW last name = “Reagan” S name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity”
    • Code (2): Traversing a node space // Instantiate a traverser that returns Mr Anderson's friends Traverser friendsTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, StopEvaluator.END_OF_GRAPH, ReturnableEvaluator.ALL_BUT_START_NODE, RelTypes.KNOWS, Direction.OUTGOING ); // Traverse the node space and print out the result System.out.println( "Mr Anderson's friends:" ); for ( Node friend : friendsTraverser ) { System.out.printf( "At depth %d => %s%n", friendsTraverser.currentPosition().getDepth(), friend.getProperty( "name" ) ); }
    • name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY 1 KN O 7 3 WS 13 S KN name = “Cypher” KNOW OW last name = “Reagan” S name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity” $ bin/start-neo-example Mr Anderson's friends: At depth 1 => Morpheus friendsTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, At depth 1 => Trinity StopEvaluator.END_OF_GRAPH, At depth 2 => Cypher ReturnableEvaluator.ALL_BUT_START_NODE, RelTypes.KNOWS, At depth 3 => Agent Smith Direction.OUTGOING ); $
    • Example: Friends in love? name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY 1 7 3 KN O WS 13 S KN KNOW name = “Cypher” OW last name = “Reagan” S name = “Agent Smith” LO disclosure = secret version = 1.0b VE age = 6 months language = C++ S 2 name = “Trinity”
    • Code (3a): Custom traverser // Create a traverser that returns all “friends in love” Traverser loveTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, StopEvaluator.END_OF_GRAPH, new ReturnableEvaluator() { public boolean isReturnableNode( TraversalPosition pos ) { return pos.currentNode().hasRelationship( RelTypes.LOVES, Direction.OUTGOING ); } }, RelTypes.KNOWS, Direction.OUTGOING );
    • Code (3a): Custom traverser // Traverse the node space and print out the result System.out.println( "Who’s a lover?" ); for ( Node person : loveTraverser ) { System.out.printf( "At depth %d => %s%n", loveTraverser.currentPosition().getDepth(), person.getProperty( "name" ) ); }
    • name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS KN O CODED_BY 1 7 3 WS 13 S KN KNOW name = “Cypher” OW last name = “Reagan” S name = “Agent Smith” LO disclosure = secret version = 1.0b VE age = 6 months language = C++ S 2 name = “Trinity” $ bin/start-neo-example new ReturnableEvaluator() Who’s a lover? { public boolean isReturnableNode( TraversalPosition pos) At depth 1 => Trinity { $ return pos.currentNode(). hasRelationship( RelTypes.LOVES, Direction.OUTGOING ); } },
    • Bonus code: domain model How do you implement your domain model? Use the delegator pattern, i.e. every domain entity wraps a Neo4j primitive: // In package org.yourdomain.yourapp class PersonImpl implements Person { private final Node underlyingNode; PersonImpl( Node node ){ this.underlyingNode = node; } public String getName() { return this.underlyingNode.getProperty( "name" ); } public void setName( String name ) { this.underlyingNode.setProperty( "name", name ); } }
    • Domain layer frameworks Qi4j (www.qi4j.org) Framework for doing DDD in pure Java5 Defines Entities / Associations / Properties Sound familiar? Nodes / Rel’s / Properties! Neo4j is an “EntityStore” backend NeoWeaver (http://components.neo4j.org/neo-weaver) Weaves Neo4j-backed persistence into domain objects in runtime (dynamic proxy / cglib based) Veeeery alpha
    • Neo4j system characteristics Disk-based Native graph storage engine with custom (“SSD- ready”) binary on-disk format Transactional JTA/JTS, XA, 2PC, Tx recovery, deadlock detection, etc Scalable Several billions of nodes/rels/props on single JVM Robust 6+ years in 24/7 production
    • Social network pathExists() 12 ~1k persons 3 7 1 Avg 50 friends per person pathExists(a, b) limit 36 41 77 depth 4 5 Two backends Eliminate disk IO so warm up caches
    • Social network pathExists() 2 Emil 1 5 7 Mike Kevin 3 John Marcus 9 4 Bruce Leigh # persons query time Relational database 1 000 2 000 ms Graph database (Neo4j) 1 000 2 ms Graph database (Neo4j) 1 000 000 2 ms
    • Pros & Cons compared to RDBMS + No O/R impedance mismatch (whiteboard friendly) + Can easily evolve schemas + Can represent semi-structured info + Can represent graphs/networks (with performance) - Lacks in tool and framework support - No other implementations => potential lock in - No support for ad-hoc queries +
    • More consequences Ability to capture semi-structured information => allowing individualization of content No predefined schema => easier to evolve model => can capture ad-hoc relationships Can capture non-normative relations => easy to model specific links to specific sets All state is kept in transactional memory => improves application concurrency
    • The Neo4j ecosystem Neo4j is an embedded database Tiny teeny lil jar file Component ecosystem index-util neo-meta neo-utils owl2neo sparql-engine ... See http://components.neo4j.org
    • Example: NeoRDF NeoRDF triple/quad store OWL SPARQL RDF Metamodel Graph match Neo4j
    • Language bindings Neo4j.py – bindings for Jython and CPython http://components.neo4j.org/neo4j.py Neo4jrb – bindings for JRuby (incl RESTful API) http://wiki.neo4j.org/content/Ruby Clojure http://wiki.neo4j.org/content/Clojure Scala (incl RESTful API) http://wiki.neo4j.org/content/Scala … .NET? Erlang?
    • Scale out – replication Rolling out Neo4j HA before end-of-year Side note: ppl roll it today w/ REST frontends & onlinebackup Master-slave replication, 1st configuration MySQL style... ish Except all instances can write, synchronously between writing slave & master (strict consistency) Updates are asynchronously propagated to the other slaves (eventual consistency) This can handle billions of entities... … but not 100B
    • Scale out – partitioning Sharding possible today … but you have to do a lot of manual work … just as with MySQL Great option: shard on top of resilient, scalable OSS app server , see: www.codecauldron.org Transparent partitioning? Neo4j 2.0 100B? Easy to say. Sliiiiightly harder to do. Fundamentals: BASE & eventual consistency Generic clustering algorithm as base case, but give lots of knobs for developers
    • How ego are you? (aka other impls?) Franz’ AllegroGraph (http://agraph.franz.com) Proprietary, Lisp, RDF-oriented but real graphdb FreeBase graphd (http://bit.ly/13VITB) In-house at Metaweb Kloudshare (http://kloudshare.com) Graph database in the cloud, still stealth mode Google Pregel (http://bit.ly/dP9IP) We are oh-so-secret Some academic papers from ~10 years ago G = {V, E} #FAIL
    • Conclusion Graphs && Neo4j => teh awesome! Available NOW under AGPLv3 / commercial license AGPLv3: “if you’re open source, we’re open source” If you have proprietary software? Must buy a commercial license But up to 1M primitives it’s free for all uses! Download http://neo4j.org Feedback http://lists.neo4j.org
    • Questions? Image credit: lost again! Sorry :(
    • http://neotechnology.com