Graph db: time for serious stuff @ codemotion 23/03/2012

Graph databases are not widespread in development communities, although they are a Swiss Army knife for problems that the relational model simply cannot handle well. In this talk we will spend a few minutes on graph theory, see how to solve a few relational anti-patterns with graph databases, and look at how to integrate them into your next project. At the end we will take a practical look at OrientDB, the "next big thing" of the NoSQL ecosystem, through its PHP Data Mapper, "Orient".

Presentation Transcript

  • Graph databases: time for serious stuff. David Funaro, Alessandro Nadalin
  • Agenda: Theory; When to use a graph?; Why graphDB?; The graphDB community; OrientDB; Orient PHP library
  • Essential (Theory): G = (V, E), where G is the graph, V the set of vertices and E the set of edges
  • Binary Relation: Itchy (A) hates Scratchy (B)
  • Binary Relation: an edge joins vertex A and vertex B
  • Graph (figure: vertices A, B, D, E, F, G)
  • Undirected Graph (figure: vertices A, B, D, E, F). Example: Friendship
  • Directed Edge: an edge goes from vertex A to vertex B
  • Directed Graph (figure: vertices A, B, D, F). Example: Followee
  • Path (figure: vertices A, B, D, E, F, G)
  • Path: A, B, D, G, E, F
  • Graph -> GraphDB: a GraphDB is a database that uses the graph as its primary data structure
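The definitions above (G = (V, E), directed versus undirected edges, paths) can be sketched in a few lines of Python. The edge set is an assumption: the figures do not survive in the transcript, so the edges below are simply chosen to make A, B, D, G, E, F a path.

```python
# Vertices come from the slides; the edge set is assumed for illustration.
V = {"A", "B", "D", "E", "F", "G"}
E = {("A", "B"), ("B", "D"), ("D", "G"), ("G", "E"), ("E", "F")}


def adjacency(vertices, edges, directed=True):
    """Build an adjacency map; an undirected graph stores each edge both ways."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        if not directed:
            adj[b].add(a)
    return adj


def is_path(adj, vertices):
    """True if every consecutive pair of vertices is joined by an edge."""
    return all(b in adj[a] for a, b in zip(vertices, vertices[1:]))


followee = adjacency(V, E, directed=True)     # like "follows": one-way
friendship = adjacency(V, E, directed=False)  # like "friend of": two-way

print(is_path(followee, ["A", "B", "D", "G", "E", "F"]))  # True
print(is_path(followee, ["F", "E", "G"]))                 # False
print(is_path(friendship, ["F", "E", "G"]))               # True
```

The same edge set gives different answers depending on whether direction matters, which is the difference between the friendship and followee examples.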
  • ... when to use a graph?
  • Recommendations (figure: a graph linking John, the city Rome, the film types Fun and Thriller, the films Mr Bean and Se7en, and Cinemas A, B and C, with edges labelled "lives in", "likes", "type", "shows" and "location"; Milan is a second location)
  • Recommendations (same figure, with each edge marked ✓ or x while traversing from John towards a recommendation)
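A rough sketch of how such a recommendation could be computed by following edges. The vertex and edge labels come from the slide, but the diagram is flattened in the transcript, so which cinema shows which film (and where Cinemas A and B are) is an assumption made for this example.

```python
# (source, label, target) triples; the cinema/film pairing is assumed.
edges = [
    ("John", "lives in", "Rome"),
    ("John", "likes", "Fun"),
    ("Mr Bean", "type", "Fun"),
    ("Se7en", "type", "Thriller"),
    ("Cinema A", "shows", "Mr Bean"),
    ("Cinema B", "shows", "Mr Bean"),
    ("Cinema C", "shows", "Se7en"),
    ("Cinema A", "location", "Rome"),
    ("Cinema B", "location", "Rome"),
    ("Cinema C", "location", "Milan"),
]


def targets(source, label):
    """Follow edges with this label out of a vertex."""
    return {t for s, l, t in edges if s == source and l == label}


def sources(label, target):
    """Follow edges with this label into a vertex."""
    return {s for s, l, t in edges if l == label and t == target}


def recommend(person):
    """Cinemas in the person's city showing a film of a type the person likes."""
    cities = targets(person, "lives in")
    films = {f for kind in targets(person, "likes") for f in sources("type", kind)}
    result = []
    for film in films:
        for cinema in sources("shows", film):
            if targets(cinema, "location") & cities:
                result.append((cinema, film))
    return sorted(result)


print(recommend("John"))  # [('Cinema A', 'Mr Bean'), ('Cinema B', 'Mr Bean')]
```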
  • Your data is a graph
  • a tree is a graph
  • parent_id is a graph
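To make the "parent_id is a graph" point concrete, here is a minimal sketch that reads rows with a parent_id column as edges. The rows and names are hypothetical.

```python
# Hypothetical table rows: (id, parent_id, name).
rows = [
    (1, None, "root"),
    (2, 1, "docs"),
    (3, 1, "src"),
    (4, 3, "lib"),
]

parent = {}    # child id -> parent id (an edge pointing up the tree)
children = {}  # parent id -> list of child ids (the same edge, reversed)
for node_id, parent_id, _name in rows:
    parent[node_id] = parent_id
    children.setdefault(node_id, [])
    if parent_id is not None:
        children.setdefault(parent_id, []).append(node_id)


def ancestors(node_id):
    """Walk the parent edges up to the root."""
    path = []
    while parent[node_id] is not None:
        node_id = parent[node_id]
        path.append(node_id)
    return path


print(children)      # {1: [2, 3], 2: [], 3: [4], 4: []}
print(ancestors(4))  # [3, 1]
```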
  • Solve decision problems
  • Maximum flow
  • Maximum flow: given a dataset, calculate how best to organize it
  • Travelling salesman problem
  • The pizza guy needs to deliver to A, B and C.
  • Decisions are based on distance, traffic, time and so on.
  • Shortest path
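As an illustration of the shortest-path case, here is a small sketch of Dijkstra's algorithm. The road map and its costs are invented; the slides only name the delivery points A, B and C.

```python
import heapq

# Hypothetical road map: cost of each road from the pizzeria "P" onwards.
roads = {
    "P": {"A": 4, "B": 1},
    "A": {"C": 1},
    "B": {"A": 2, "C": 5},
    "C": {},
}


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns (cost, path), or (inf, []) if unreachable."""
    queue = [(0, start, [start])]
    done = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in done:
            continue
        done.add(node)
        for nxt, weight in graph[node].items():
            if nxt not in done:
                heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return float("inf"), []


print(shortest_path(roads, "P", "C"))  # (4, ['P', 'B', 'A', 'C'])
```

The weight of an edge can stand for distance, traffic or time, as long as it is non-negative.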
  • Identify "special" nodes of the graph
  • Given your dataset, organize some clusters. Are there some nodes which cannot belong to a cluster? They probably have some properties different from the average. ACHTUNG! TERRORISTEN!
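One simple way to picture this is to take connected components as the clusters and treat any vertex that ends up alone as "special". This is only a sketch under that assumption, with invented data; the talk does not say which clustering method it has in mind.

```python
# Hypothetical "knows" edges; "Z" is connected to nobody.
nodes = ["A", "B", "C", "D", "E", "Z"]
edges = [("A", "B"), ("B", "C"), ("D", "E")]


def clusters(nodes, edges):
    """Group vertices into connected components with a depth-first search."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, result = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = [], [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        result.append(sorted(component))
    return result


groups = clusters(nodes, edges)
outliers = [g[0] for g in groups if len(g) == 1]
print(groups)    # [['A', 'B', 'C'], ['D', 'E'], ['Z']]
print(outliers)  # ['Z']
```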
  • but ... why graphDB?
  • Representing a Graph in: ✓ Relational Database (MySQL, Oracle); ✓ Document Oriented DB (MongoDB, CouchDB); ✓ XML Database (MarkLogic, eXist-db). Source: http://www.slideshare.net/slidarko/problemsolving-using-graph-traversals-searching-scoring-ranking-and-recommendation
  • where is the difference?
  • GraphDB: a graph database is any storage system that provides index-free adjacency. Source: http://www.slideshare.net/slidarko/problemsolving-using-graph-traversals-searching-scoring-ranking-and-recommendation
  • Step by step example: given a list of people, find their homepages
  • Tree-based DB way: (1) take the name "David Funaro", (2) put it in the search engine, (3) find http://davidfunaro.com. The cost of finding a single friend's homepage grows as the table of friends' homepages grows.
  • GraphDB way: (1) get the embedded information (index), www.odino.org. It is as if the GraphDB carried an additional piece of information (the anchor <a>).
  • GraphDB way: the anchor works as a local index to reach the document = index-free adjacency: <a href="http://odino.org">Alessandro Nadalin</a>
  • Local cost: the local cost is O(k) = constant
  • Local cost: thus, as the graph grows in size, the cost of a local step remains the same
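The contrast between a global index lookup and index-free adjacency can be sketched as follows. The `Vertex` class and the function names are hypothetical; the point is only that the first lookup searches a structure that grows with the data, while the second follows a reference stored on the vertex itself.

```python
from bisect import bisect_left

# Index style: a sorted table of (person, homepage) rows is searched by name.
table = sorted([
    ("Alessandro Nadalin", "http://odino.org"),
    ("David Funaro", "http://davidfunaro.com"),
])
keys = [name for name, _ in table]


def homepage_via_index(name):
    """Binary search: the number of steps grows with the size of the table."""
    i = bisect_left(keys, name)
    if i < len(keys) and keys[i] == name:
        return table[i][1]
    return None


# Graph style: each vertex holds direct references to its neighbours.
class Vertex:
    def __init__(self, name):
        self.name = name
        self.out = {}  # edge label -> neighbouring Vertex


def homepage_via_adjacency(person):
    """One local hop, independent of how many vertices exist."""
    return person.out["homepage"].name


david = Vertex("David Funaro")
david.out["homepage"] = Vertex("http://davidfunaro.com")

print(homepage_via_index("David Funaro"))  # http://davidfunaro.com
print(homepage_via_adjacency(david))       # http://davidfunaro.com
```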
  • Any database can implicitly represent a graph, BUT only a graph database makes the graph structure explicit
  • Benchmark: MySQL vs Neo4j; 1 million vertices; 4 million edges; scale-free topology; both Hash and BTree. Source: http://markorodriguez.com/2011/02/18/mysql-vs-neo4j-on-a-large-scale-graph-traversal/
    Depth   RDBMS       Graph
    1       100ms       30ms
    2       1000ms      500ms
    3       10000ms     3000ms
    4       100000ms    50000ms
    5       N/A         100000ms
  • How?
    PREFIX geospecies: <http://rdf.geospecies.org/ont/geospecies#>
    PREFIX lycopodiophyta: <http://lod.geospecies.org/phyla/Pc2>
    PREFIX door_county: <http://sws.geonames.org/5250768/>
    PREFIX dcterms: <http://purl.org/dc/terms/>
    SELECT DISTINCT ?family_name ?canonicalName ?commonName ?identifier ?wikipedia_url
    WHERE {
      ?x geospecies:hasFamilyName ?family_name;
         geospecies:hasCanonicalName ?canonicalName;
         geospecies:hasCommonName ?commonName;
         dcterms:identifier ?identifier;
         geospecies:inPhylum lycopodiophyta:;
         geospecies:isUSDA_ExpectedIn door_county:.
      OPTIONAL {
        ?x geospecies:hasCommonName ?commonName;
           geospecies:hasWikipediaArticle ?wikipedia_url
      }
    }
    ORDER BY ?family_name ?canonicalName
  • NoSPARQL http://blog.acaro.org/entry/somebody-is-going-to-hate-me-nosparql
  • A community that is building and feeding the GraphDB ecosystem: NoSPARQL, TinkerPop stack, databases
  • Blueprints: data model and its implementation. Blueprints is a collection of interfaces, implementations, ouplementations, and test suites for the property graph data model. Blueprints is analogous to JDBC, but for graph databases. https://github.com/tinkerpop/blueprints/wiki/
  • Pipes: a data flow framework using process graphs. It provides a collection of "pipes" that are connected together to form processing pipelines
  • Gremlin: a graph-based programming language. A Turing-complete, graph-based programming language that compiles Gremlin syntax down to Pipes
  • Rexster: a RESTful graph shell. It allows Blueprints graphs to be exposed through a RESTful API (HTTP)
  • What's hot
  • OrientDB
  • Glossary: RID <10:05> (cluster 10, position 05); CLASS
  • Main features
  • Inheritance
  • class Vehicle, class Car, class Bike (figure: Car and Bike inherit from Vehicle)
  • SELECT FROM Vehicle WHERE owner = 1:1
  • can return records of class Bike or Car
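The behaviour described here, a query on the parent class returning records of its subclasses, can be mimicked in plain Python. This is an analogy only, with a hypothetical `select_from` helper and invented records; it says nothing about how OrientDB implements it.

```python
class Vehicle:
    def __init__(self, owner):
        self.owner = owner


class Car(Vehicle):
    pass


class Bike(Vehicle):
    pass


# Hypothetical records; owners are written as record ids.
records = [Car("1:1"), Bike("1:1"), Bike("1:2")]


def select_from(cls, **where):
    """Return records of cls or any subclass whose fields match the filters."""
    return [
        r for r in records
        if isinstance(r, cls)
        and all(getattr(r, key) == value for key, value in where.items())
    ]


found = select_from(Vehicle, owner="1:1")
print([type(r).__name__ for r in found])  # ['Car', 'Bike']
```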
  • Traversal
  • SELECT FROM fellas WHERE any() traverse(0,-1) ( @rid = [Michelle @rid] )
  • SELECT FROM fellas WHERE any() traverse(0,2) ( @rid = [Michelle @rid] )
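A possible reading of the two queries above is "which fellas can reach Michelle by following links, at any depth or within two hops". The sketch below implements that reading as a depth-bounded breadth-first search. Treating the two arguments as minimum and maximum depth, with -1 meaning "no limit", is an assumption about the query, and the link data is invented.

```python
from collections import deque

# Hypothetical links between records of the "fellas" class.
links = {
    "Anna": ["Bob"],
    "Bob": ["Carl"],
    "Carl": ["Michelle"],
    "Michelle": [],
    "Zed": [],
}


def reaches(start, target, min_depth=0, max_depth=-1):
    """True if target is found at a depth between min_depth and max_depth."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, depth = queue.popleft()
        if node == target and depth >= min_depth:
            return True
        if max_depth != -1 and depth >= max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in links[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return False


def select_fellas(target, min_depth, max_depth):
    return [f for f in links if reaches(f, target, min_depth, max_depth)]


print(select_fellas("Michelle", 0, -1))  # ['Anna', 'Bob', 'Carl', 'Michelle']
print(select_fellas("Michelle", 0, 2))   # ['Bob', 'Carl', 'Michelle']
```

Anna is three hops from Michelle, so she drops out once the depth is capped at two.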
  • SQL syntax
  • beyond SQL
  • SELECT FROM authors WHERE book.title = ...
  • ACID
  • speaks JSON
  • { "schema": { "name": "Address" }, "result": [{ "@type": "d", "@rid": "#13:0", "@version": 6, "@class": "Address", "type": "Residence", "street": "Piazza Navona, 1", "city": "#14:0", "nick": "Luca2" }, { ... ...
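To show how a client might consume a response of this shape, here is a small sketch that parses the first record from the slide and separates the "@" metadata from the document fields. The elided records are left out, and the split into `meta` and `fields` is just one convenient way to read it.

```python
import json

# First record of the response shown on the slide; elided records are omitted.
body = """
{
  "schema": {"name": "Address"},
  "result": [{
    "@type": "d", "@rid": "#13:0", "@version": 6, "@class": "Address",
    "type": "Residence", "street": "Piazza Navona, 1",
    "city": "#14:0", "nick": "Luca2"
  }]
}
"""

response = json.loads(body)
for record in response["result"]:
    # Keys starting with "@" carry record metadata; the rest are fields.
    meta = {k: v for k, v in record.items() if k.startswith("@")}
    fields = {k: v for k, v in record.items() if not k.startswith("@")}
    print(meta["@rid"], fields["street"])  # #13:0 Piazza Navona, 1
```

Note that the "city" field holds "#14:0", a record id, so a link to another record travels as a plain string in the JSON.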
  • Double Protocol
  • HTTP: universal
  • HTTP: easy to interact with
  • binary: blazing fast
  • on-record SELECTs
  • SELECT FROM cats
  • SELECT FROM 11:0
  • SELECT FROM [11:0,11:1]
  • SELECT FROM [11:0,12:0]
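The record ids used as SELECT targets above have the form cluster:position. A minimal sketch of handling them on the client side, with hypothetical helper names:

```python
def parse_rid(rid):
    """Split a record id such as '11:0' or '#13:0' into (cluster, position)."""
    cluster, position = rid.lstrip("#").split(":")
    return int(cluster), int(position)


def select_target(rids):
    """Format one or more record ids as the target of a SELECT."""
    if len(rids) == 1:
        return "SELECT FROM " + rids[0]
    return "SELECT FROM [" + ",".join(rids) + "]"


print(parse_rid("11:0"))                # (11, 0)
print(parse_rid("#13:0"))               # (13, 0)
print(select_target(["11:0"]))          # SELECT FROM 11:0
print(select_target(["11:0", "12:0"]))  # SELECT FROM [11:0,12:0]
```

The last call mixes records from clusters 11 and 12, as in the final slide of this group.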
  • stress-free setup
  • 2 Mb
  • ./orient/bin/server.sh
  • in-memory DB
  • or disk-persisted
  • Supports standards
  • OrientDB: Inheritance; Traversal; SQL-like syntax; ACID; Speaks JSON; Double protocol; on-record SELECT; TinkerPop compliant
  • Language Bindings http://code.google.com/p/orient/wiki/ProgrammingLanguageBindings
  • Orient = PHP Library to work with OrientDB https://github.com/congow/Orient
  • Data Mapper, Query Builder, HTTP Binding
  • HTTP Binding
  • use Congow\Orient;
    use Congow\Orient\Foundation\Binding;

    $driver   = new Orient\Http\Client\Curl();
    $orient   = new Binding($driver, '127.0.0.1', 2480, 'admin', 'admin', 'demo');
    $response = $orient->query("SELECT FROM Address");
    $output   = json_decode($response->getBody());

    foreach ($output->result as $address) {
        var_dump($address->street);
    }
  • apart from ->query($SQL)
  • ->get|delete|postClass($class)
  • ->post|delete|put|getDocument($rid)
  • ...and much more! (connect, disconnect, ...)
  • Query Builder
  • use Congow\Orient\Query;

    $query = new Query();
    $query->from(array('users'))->where('username = ?', "admin");
    echo $query->getRaw(); // SELECT FROM users WHERE username = "admin"
  • $query->select(array('name', 'username', 'email'), false)
          ->from(array('12:0', '12:1'), false)
          ->where('any() traverse ( any() like "%danger%" )')
          ->orWhere("1 = ?", 1)
          ->andWhere("links = ?", 1)
          ->limit(20)
          ->orderBy('username')
          ->orderBy('name', true, true)
          ->range("12:0", "12:1");

    Resulting query:
    SELECT name, username, email
    FROM [12:0, 12:1]
    WHERE any() traverse ( any() like "%danger%" ) OR 1 = "1" AND links = "1"
    ORDER BY name, username
    LIMIT 20
    RANGE 12:0 12:1
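The idea behind a query builder, chained calls that accumulate clauses and a final method that renders the string, can be sketched in a few lines. The class below is hypothetical and is not the Orient library's API; it only covers projections, targets and WHERE conditions with `?` placeholders.

```python
class Query:
    """Hypothetical fluent builder; a sketch, not the Orient library's API."""

    def __init__(self):
        self._projections = []
        self._targets = []
        self._conditions = []  # list of (connector, condition)

    def select(self, fields):
        self._projections = list(fields)
        return self

    def from_(self, targets):
        self._targets = list(targets)
        return self

    def where(self, condition, value=None, connector="AND"):
        if value is not None:
            # Quote the value and escape embedded double quotes.
            escaped = str(value).replace('"', '\\"')
            condition = condition.replace("?", '"' + escaped + '"')
        self._conditions.append((connector, condition))
        return self

    def or_where(self, condition, value=None):
        return self.where(condition, value, connector="OR")

    def get_raw(self):
        parts = ["SELECT"]
        if self._projections:
            parts.append(", ".join(self._projections))
        if len(self._targets) == 1:
            parts.append("FROM " + self._targets[0])
        else:
            parts.append("FROM [" + ", ".join(self._targets) + "]")
        for i, (connector, condition) in enumerate(self._conditions):
            parts.append(("WHERE " if i == 0 else connector + " ") + condition)
        return " ".join(parts)


query = Query().from_(["users"]).where("username = ?", "admin")
print(query.get_raw())  # SELECT FROM users WHERE username = "admin"
```

Returning `self` from every method is what makes the chained style possible.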
  • Data Mapper
  • namespace PolandPHPCon\Entity;

    use Congow\Orient\ODM\Mapper\Annotations as ODM;

    /**
     * @ODM\Document(class="Person")
     */
    class Speaker
    {
        /**
         * @ODM\Property(type="string")
         */
        protected $name;

        public function setName($name)
        {
            $this->name = $name;
        }
    }
  • Domain Driven Design
  • { "schema": { "name": "Speaker" }, "result": [{ "@type": "d", "@rid": "#1:0", "@version": 6, "@class": "Speaker", "name": "David Coallier" }, { ... ...
    $david = $mapper->hydrate(json_decode($speaker));
  • { "schema": { "name": "Speaker" }, "result": [{ "@type": "d", "@rid": "#1:0", "@version": 6, "@class": "Speaker", "name": "Martin Fowler" }, { ... ...
    $david instanceof PolandPHPCon\Entity\Speaker
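Hydration, turning a JSON record into an object of the mapped class, can be sketched as below. The `CLASS_MAP` lookup and the `set_<field>` convention are assumptions made for the example; they stand in for the annotation-driven mapping shown on the slides.

```python
import json


class Speaker:
    def __init__(self):
        self.name = None
        self.rid = None

    def set_name(self, name):
        self.name = name


# Maps the "@class" of a record to the Python class that represents it.
CLASS_MAP = {"Speaker": Speaker}


def hydrate(record):
    """Build an entity from one record, calling set_<field> where it exists."""
    entity = CLASS_MAP[record["@class"]]()
    for key, value in record.items():
        setter = getattr(entity, "set_" + key, None)
        if not key.startswith("@") and callable(setter):
            setter(value)
    entity.rid = record["@rid"]
    return entity


record = json.loads(
    '{"@type": "d", "@rid": "#1:0", "@version": 6, '
    '"@class": "Speaker", "name": "David Coallier"}'
)
david = hydrate(record)
print(isinstance(david, Speaker), david.name)  # True David Coallier
```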
  • Repository Pattern: $repo = $manager->getRepository('Speaker')
  • $speakers = $repo->findAll();
  • $speaker = $repo->find($rid);
  • $criteria = array('Name' => 'Martin');
    $lornas = $repo->findBy($criteria);
  • $criteria = array('Name' => 'Martin', 'last_name' => 'Fowler');
    $lornaJ = $repo->findOneBy($criteria);
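The four repository calls above can be illustrated with an in-memory stand-in. The class is hypothetical and the records are invented; it shows the contract of each method rather than how Orient talks to the database.

```python
class Repository:
    """Hypothetical in-memory repository; records are dicts keyed by RID."""

    def __init__(self, records):
        self._records = dict(records)

    def find_all(self):
        return list(self._records.values())

    def find(self, rid):
        return self._records.get(rid)

    def find_by(self, criteria):
        """All records whose fields equal every key/value in criteria."""
        return [
            r for r in self._records.values()
            if all(r.get(key) == value for key, value in criteria.items())
        ]

    def find_one_by(self, criteria):
        matches = self.find_by(criteria)
        return matches[0] if matches else None


repo = Repository({
    "#1:0": {"name": "Martin", "last_name": "Fowler"},
    "#1:1": {"name": "Martin", "last_name": "Example"},
    "#1:2": {"name": "David", "last_name": "Coallier"},
})

print(len(repo.find_all()))                   # 3
print(len(repo.find_by({"name": "Martin"})))  # 2
print(repo.find_one_by({"name": "Martin", "last_name": "Fowler"}))
```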
  • https://github.com/doctrine/common/tree/master/lib/Doctrine/Common/Persistence
  • That's all, folks! David Funaro (@ingdavidino, http://davidfunaro.com) and Alessandro Nadalin (@_odino_, http://odino.org)