# GraphDB in PHP @ Codemotion 03/23/2012

Presentation given about graph databases and OrientDB.

### GraphDB in PHP @ Codemotion 03/23/2012

1. Graph databases: time for serious stuff. David Funaro, Alessandro Nadalin
2. Agenda: Theory; When to use a graph?; Why graphDB?; The graphDB community; OrientDB; Orient PHP library
3. Essential (Theory): a graph G = (V, E) is a set of vertices V and a set of edges E
4. Binary Relation: Itchy (A) hates Scratchy (B)
5. Binary Relation: vertex A, edge, vertex B
6. Graph: vertices A, B, D, E, F, G
7. Undirected Graph: vertices A, B, D, E, F. Example: Friendship
8. Directed Edge: an edge from vertex A to vertex B
9. Directed Graph: vertices A, B, D, F. Example: Followee
10. Path: vertices A, B, D, E, F, G
11. Path: A, B, D, G, E, F
12. Graph -> GraphDB: a GraphDB is a database that uses the graph as its primary data structure
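The definitions above (a graph as vertices plus edges, directed versus undirected edges, and a path) can be sketched in a few lines of Python. This is an illustration only, not code from the talk: the `Graph` class, its method names and the sample edges are all assumptions made for the example.

```python
class Graph:
    """A graph G = (V, E) stored as adjacency sets (hypothetical helper)."""

    def __init__(self, directed=False):
        self.directed = directed
        self.adjacent = {}  # vertex -> set of vertices reachable in one step

    def add_edge(self, a, b):
        self.adjacent.setdefault(a, set()).add(b)
        self.adjacent.setdefault(b, set())  # register b as a vertex too
        if not self.directed:
            # An undirected edge can be walked in both directions.
            self.adjacent[b].add(a)

    def is_path(self, sequence):
        """True if every consecutive pair of vertices is joined by an edge."""
        return all(
            b in self.adjacent.get(a, set())
            for a, b in zip(sequence, sequence[1:])
        )


# Undirected example (friendship): A - B - D, made-up edges.
friendship = Graph(directed=False)
friendship.add_edge("A", "B")
friendship.add_edge("B", "D")

# Directed example (followee): A -> B -> D, made-up edges.
follows = Graph(directed=True)
follows.add_edge("A", "B")
follows.add_edge("B", "D")
```

With these sample edges, `D, B, A` is a path in the friendship graph but not in the followee graph, because directed edges can only be walked one way.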
13. ... when to use a graph?
14. Recommendations (figure): a graph whose nodes are John, Rome, Milan, Cinema A, Cinema B, Cinema C, Mr Bean, Se7en, Fun and Thriller, connected by edges labelled "lives in", "likes", "type", "shows" and "location"
15. Recommendations (figure): the same graph, with each candidate marked ✓ or x
16. Your data is a graph
17. A tree is a graph
18. parent_id is a graph
19. Solve decision problems
20. Maximum flow
21. Maximum flow: given a dataset, calculate how best to organize it
22. Travelling salesman problem
23. The pizza guy needs to deliver to A, B, C.
24. The decision is based on distance, traffic, time and so on.
25. Shortest path
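As a rough illustration of the shortest-path problem named above, the sketch below runs a breadth-first search over an unweighted graph. It is not taken from the talk: the `shortest_path` function and the `streets` map are invented for the example, and a weighted variant (distance, traffic, time) would need an algorithm such as Dijkstra's instead.

```python
from collections import deque


def shortest_path(adjacent, start, goal):
    """Return the path with the fewest edges from start to goal, or None."""
    previous = {start: None}  # vertex -> vertex we arrived from
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if vertex == goal:
            # Walk the "previous" links back to the start, then reverse.
            path = []
            while vertex is not None:
                path.append(vertex)
                vertex = previous[vertex]
            return path[::-1]
        for neighbour in adjacent.get(vertex, ()):
            if neighbour not in previous:
                previous[neighbour] = vertex
                queue.append(neighbour)
    return None


# Hypothetical one-way street map for the pizza guy.
streets = {
    "Pizzeria": ["A", "B"],
    "A": ["C"],
    "B": ["C"],
    "C": [],
}
```

Calling `shortest_path(streets, "Pizzeria", "C")` returns a two-hop route, while asking for a route from `"C"` back to `"Pizzeria"` returns `None` because no street leads back.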
26. Identify "special" nodes of the graph
27. Given your dataset, organize some clusters. Are there some nodes which cannot belong to a cluster? They probably have some properties different from the average.
28. Given your dataset, organize some clusters. Are there some nodes which cannot belong to a cluster? They probably have some properties different from the average. ACHTUNG! TERRORISTEN!
29. but ... why graphDB?
30. Representing a graph in: a relational database (MySQL, Oracle); a document-oriented DB (MongoDB, CouchDB); an XML database (MarkLogic, eXist-db). http://www.slideshare.net/slidarko/problemsolving-using-graph-traversals-searching-scoring-ranking-and-recommendation
31. Where is the difference?
32. GraphDB: a graph database is any storage system that provides index-free adjacency. http://www.slideshare.net/slidarko/problemsolving-using-graph-traversals-searching-scoring-ranking-and-recommendation
33. Step-by-step example: given a list of people, find their homepages
34. Tree-based DB way: (1) take the name David Funaro, (2) put it in the search engine, (3) find http://davidfunaro.com
35. Tree-based DB way: the cost to find a single friend's homepage (HP) grows as the friends' HP table grows
36. GraphDB way: (1) get the embedded information (index), www.odino.org. It is as if the GraphDB held an additional piece of information (the anchor <a>)
37. GraphDB way: the anchor works as a local index to reach the document = index-free adjacency. <a href="http://odino.org">Alessandro Nadalin</a>
38. Local cost: the local cost is O(k) = constant
39. Local cost: the local cost is O(k) = constant
40. Local cost: thus, as the graph grows in size, the cost of a local step remains the same
41. Any database can implicitly represent a graph, BUT only a graph database makes the graph structure explicit
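The contrast drawn above, between looking neighbours up through a global index and following references stored on the vertex itself, can be sketched as follows. This is a toy model under stated assumptions, not how any particular database is implemented: the sorted `EDGE_TABLE` stands in for a B-tree index, and the `Vertex` class, the helper functions and the vertex `luca` are invented for the example.

```python
import bisect

# Way 1: every edge lives in one global, sorted table (a stand-in for a
# B-tree index). Finding the neighbours of a vertex means searching that
# table, so one step costs O(log n) in the total number of edges n.
EDGE_TABLE = sorted([
    ("alessandro", "david"),
    ("david", "alessandro"),
    ("david", "luca"),  # "luca" is a made-up vertex for the example
])


def neighbours_via_index(vertex):
    position = bisect.bisect_left(EDGE_TABLE, (vertex, ""))
    found = []
    for source, target in EDGE_TABLE[position:]:
        if source != vertex:
            break
        found.append(target)
    return found


# Way 2: index-free adjacency. Each vertex holds direct references to its
# neighbours, so a step costs O(k) in the k edges of that vertex only,
# however large the rest of the graph becomes.
class Vertex:
    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct references to other Vertex objects


alessandro, david, luca = Vertex("alessandro"), Vertex("david"), Vertex("luca")
alessandro.neighbours.append(david)
david.neighbours.extend([alessandro, luca])


def neighbours_via_references(vertex):
    return [other.name for other in vertex.neighbours]
```

Both functions give the same answer for `david`; the difference is only in what the lookup cost depends on.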
42. Benchmark: MySQL vs Neo4j; 1 million vertices; 4 million edges; scale-free topology; both Hash and BTree. http://markorodriguez.com/2011/02/18/mysql-vs-neo4j-on-a-large-scale-graph-traversal/

    | Depth | RDBMS    | Graph    |
    |-------|----------|----------|
    | 1     | 100ms    | 30ms     |
    | 2     | 1000ms   | 500ms    |
    | 3     | 10000ms  | 3000ms   |
    | 4     | 100000ms | 50000ms  |
    | 5     | N/A      | 100000ms |

43. How?

        PREFIX geospecies: <http://rdf.geospecies.org/ont/geospecies#>
        PREFIX lycopodiophyta: <http://lod.geospecies.org/phyla/Pc2>
        PREFIX door_county: <http://sws.geonames.org/5250768/>
        PREFIX dcterms: <http://purl.org/dc/terms/>
        SELECT DISTINCT ?family_name ?canonicalName ?commonName ?identifier ?wikipedia_url
        WHERE {
          ?x geospecies:hasFamilyName ?family_name;
             geospecies:hasCanonicalName ?canonicalName;
             geospecies:hasCommonName ?commonName;
             dcterms:identifier ?identifier;
             geospecies:inPhylum lycopodiophyta:;
             geospecies:isUSDA_ExpectedIn door_county:.
          OPTIONAL {
            ?x geospecies:hasCommonName ?commonName;
               geospecies:hasWikipediaArticle ?wikipedia_url
          }
        }
        ORDER BY ?family_name ?canonicalName

45. NoSPARQL http://blog.acaro.org/entry/somebody-is-going-to-hate-me-nosparql
46. The community that is building and feeding the GraphDB ecosystem: NoSPARQL, the TinkerPop stack, databases
47. Blueprints, the data model and its implementations: Blueprints is a collection of interfaces, implementations, ouplementations, and test suites for the property graph data model. Blueprints is analogous to JDBC, but for graph databases. https://github.com/tinkerpop/blueprints/wiki/
48. Pipes, a data flow framework using process graphs: it provides a collection of "pipes" that are connected together to form processing pipelines
49. Gremlin, a graph-based programming language: a Turing-complete, graph-based programming language that compiles Gremlin syntax down to Pipes
50. Rexster, a RESTful graph shell: it allows Blueprints graphs to be exposed through a RESTful API (HTTP)
51. What's hot
52. OrientDB
53. Glossary: RID (cluster:position, e.g. 10:05), CLASS
54. Main features
55. Inheritance
56. class Vehicle, with subclasses class Car and class Bike
57. class Vehicle, class Car, class Bike: SELECT FROM Vehicle WHERE owner = 1:1
58. class Vehicle, class Car, class Bike: the query can return records of class Bike or Car
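The inheritance behaviour described above, where a query against `Vehicle` also returns `Car` and `Bike` records, resembles an `isinstance` filter over a class hierarchy. The sketch below only mimics that behaviour in Python: the `select_from` helper and the sample records are assumptions, not OrientDB code.

```python
class Vehicle:
    def __init__(self, owner):
        self.owner = owner  # RID of the owner, kept as a plain string here


class Car(Vehicle):
    pass


class Bike(Vehicle):
    pass


# Made-up records for the example.
records = [Car("1:1"), Bike("1:1"), Bike("1:2")]


def select_from(cls, rows, **conditions):
    """Mimic SELECT FROM <class> WHERE field = value, including subclasses."""
    return [
        row for row in rows
        if isinstance(row, cls)
        and all(getattr(row, field) == value for field, value in conditions.items())
    ]


owned = select_from(Vehicle, records, owner="1:1")
```

Here `owned` holds one `Car` and one `Bike`, even though the query named only the parent class.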
59. Traversal
60. SELECT FROM fellas WHERE any() traverse(0,-1) ( @rid = [Michelle @rid] )
61. SELECT FROM fellas WHERE any() traverse(0,2) ( @rid = [Michelle @rid] )
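One way to read the two queries above is "which fellas reach Michelle by following links", first with no depth limit and then within two hops. The sketch below models that reading in Python under stated assumptions: the `reaches` function, the `links` data and the interpretation of the two arguments as minimum and maximum depth (with -1 meaning unlimited) are illustrative, and this is not OrientDB's own implementation.

```python
from collections import deque


def reaches(links, start, target, min_depth=0, max_depth=-1):
    """True if target lies between min_depth and max_depth hops from start.

    Depth is the smallest number of hops at which a record is first seen;
    max_depth=-1 is taken to mean "no limit" (an assumption of this sketch).
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        record, depth = queue.popleft()
        if record == target and depth >= min_depth:
            return True
        if max_depth != -1 and depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for linked in links.get(record, ()):
            if linked not in seen:
                seen.add(linked)
                queue.append((linked, depth + 1))
    return False


# Made-up chain of acquaintances: Anna -> Bob -> Carl -> Michelle.
links = {
    "Anna": ["Bob"],
    "Bob": ["Carl"],
    "Carl": ["Michelle"],
    "Michelle": [],
}

unlimited = [f for f in links if reaches(links, f, "Michelle", 0, -1)]
within_two = [f for f in links if reaches(links, f, "Michelle", 0, 2)]
```

With this data, everyone reaches Michelle when the depth is unlimited, while Anna drops out once the limit is two hops.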
62. SQL syntax
63. Beyond SQL
64. SELECT FROM authors WHERE book.title = ...
65. ACID
66. Speaks JSON
67. { "schema": { "name": "Address" }, "result": [{ "@type": "d", "@rid": "#13:0", "@version": 6, "@class": "Address", "type": "Residence", "street": "Piazza Navona, 1", "city": "#14:0", "nick": "Luca2" }, { ... ...
68. Double protocol
69. HTTP: universal
70. HTTP: easy to interact with
71. Binary: blazing fast
72. On-record SELECTs
73. SELECT FROM cats
74. SELECT FROM 11:0
75. SELECT FROM [11:0,11:1]
76. SELECT FROM [11:0,12:0]
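The on-record SELECTs above address records directly by RID, which the glossary slide describes as a cluster and a position. A small Python sketch of how such targets could be parsed follows; the `RID` tuple and both helper functions are hypothetical and only illustrate the notation.

```python
from typing import NamedTuple


class RID(NamedTuple):
    cluster: int
    position: int


def parse_rid(text):
    """Parse a record ID such as "11:0" or "#11:0" into (cluster, position)."""
    cluster, position = text.lstrip("#").split(":")
    return RID(int(cluster), int(position))


def parse_target(target):
    """Parse the FROM target of an on-record SELECT: one RID or a [list]."""
    target = target.strip()
    if target.startswith("[") and target.endswith("]"):
        return [parse_rid(part.strip()) for part in target[1:-1].split(",")]
    return [parse_rid(target)]


same_cluster = parse_target("[11:0,11:1]")
two_clusters = parse_target("[11:0,12:0]")
```

The last two calls mirror the final two queries: the first list stays inside cluster 11, while the second spans clusters 11 and 12.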
77. Stress-free setup
78. 2 MB
79. ./orient/bin/server.sh
80. In-memory DB
81. or disk-persisted
82. Supports standards
83. OrientDB: Inheritance; Traversal; SQL-like syntax; ACID; Speaks JSON; Double protocol; On-record SELECT; TinkerPop compliant
84. Language Bindings http://code.google.com/p/orient/wiki/ProgrammingLanguageBindings
85. Orient = PHP library to work with OrientDB https://github.com/congow/Orient
86. Data Mapper, Query Builder, HTTP Binding
87. HTTP Binding
89. apart from ->query(\$SQL)
90. ->get|delete|postClass(\$class)
91. ->post|delete|put|getDocument(\$rid)
92. ...and much more! (connect, disconnect, ...)
93. Query Builder
95. Query builder call, followed by the SQL it produces:

        $query->select(array('name', 'username', 'email'), false)
            ->from(array('12:0', '12:1'), false)
            ->where('any() traverse ( any() like "%danger%" )')
            ->orWhere("1 = ?", 1)
            ->andWhere("links = ?", 1)
            ->limit(20)
            ->orderBy('username')
            ->orderBy('name', true, true)
            ->range("12:0", "12:1");

        SELECT name, username, email
        FROM [12:0, 12:1]
        WHERE any() traverse ( any() like "%danger%" )
        OR 1 = "1" AND links = "1"
        ORDER BY name, username
        LIMIT 20
        RANGE 12:0 12:1

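To show the general idea behind a fluent query builder like the one above, here is a minimal Python sketch in which each method records one clause and returns the builder so calls can be chained. It is not the Orient library's API: the `Query` class, its method names and its output format are assumptions made for illustration.

```python
class Query:
    """A tiny fluent builder for SELECT statements (hypothetical)."""

    def __init__(self):
        self.fields = []
        self.targets = []
        self.conditions = []  # (connector, condition) pairs, in call order
        self.ordering = []
        self.max_rows = None

    def select(self, *fields):
        self.fields = list(fields)
        return self

    def from_(self, *targets):
        self.targets = list(targets)
        return self

    def where(self, condition, connector="AND"):
        self.conditions.append((connector, condition))
        return self

    def or_where(self, condition):
        return self.where(condition, "OR")

    def order_by(self, field):
        self.ordering.append(field)
        return self

    def limit(self, rows):
        self.max_rows = rows
        return self

    def to_sql(self):
        parts = ["SELECT"]
        if self.fields:
            parts.append(", ".join(self.fields))
        # A single target is written bare, several are wrapped in brackets.
        if len(self.targets) == 1:
            parts += ["FROM", self.targets[0]]
        else:
            parts += ["FROM", "[" + ", ".join(self.targets) + "]"]
        for index, (connector, condition) in enumerate(self.conditions):
            parts += ["WHERE" if index == 0 else connector, condition]
        if self.ordering:
            parts += ["ORDER BY", ", ".join(self.ordering)]
        if self.max_rows is not None:
            parts += ["LIMIT", str(self.max_rows)]
        return " ".join(parts)


sql = (
    Query()
    .select("name", "username", "email")
    .from_("12:0", "12:1")
    .where('any() traverse ( any() like "%danger%" )')
    .or_where("links = 1")
    .order_by("username")
    .limit(20)
    .to_sql()
)
```

Returning `self` from every method is what allows the chained style; the SQL text is only assembled when `to_sql()` is called.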
96. Data Mapper
97. Entity class mapped through annotations:

        namespace PolandPHPConEntity;

        use Congow\Orient\ODM\Mapper\Annotations as ODM;

        /**
         * @ODM\Document(class="Person")
         */
        class Speaker
        {
            /**
             * @ODM\Property(type="string")
             */
            protected $name;

            public function setName($name)
            {
                $this->name = $name;
            }
        }

98. Domain-Driven Design
99. { "schema": { "name": "Speaker" }, "result": [{ "@type": "d", "@rid": "#1:0", "@version": 6, "@class": "Speaker", "name": "David Coallier" }, { ... ...
100. { "schema": { "name": "Speaker" }, "result": [{ "@type": "d", "@rid": "#1:0", "@version": 6, "@class": "Speaker", "name": "David Coallier" }, { ... ... \$david = \$mapper->hydrate(json_decode(\$speaker));
101. { "schema": { "name": "Speaker" }, "result": [{ "@type": "d", "@rid": "#1:0", "@version": 6, "@class": "Speaker", "name": "Martin Fowler" }, { ... ... \$david instanceOf PolandPHPConEntity\Speaker
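The hydration step above turns a JSON record into an object of the mapped class. The following Python sketch shows the idea under stated assumptions: the `Speaker` dataclass, the `CLASS_MAP` registry, the `hydrate` function and the completed sample response are all invented for the example, and the Orient PHP mapper may work differently in detail.

```python
import json
from dataclasses import dataclass


@dataclass
class Speaker:
    rid: str
    name: str


# Hypothetical registry: OrientDB class name -> Python class.
CLASS_MAP = {"Speaker": Speaker}


def hydrate(response_text):
    """Turn every record of a JSON response into an object of its class."""
    objects = []
    for record in json.loads(response_text)["result"]:
        cls = CLASS_MAP[record["@class"]]
        objects.append(cls(rid=record["@rid"], name=record["name"]))
    return objects


# A complete, single-record response modelled on the truncated one above.
response = """
{"schema": {"name": "Speaker"},
 "result": [{"@type": "d", "@rid": "#1:0", "@version": 6,
             "@class": "Speaker", "name": "David Coallier"}]}
"""

david = hydrate(response)[0]
```

After hydration, `david` is an ordinary `Speaker` object, so the rest of the application can work with domain classes rather than raw JSON.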
102. Repository Pattern: \$repo = \$manager->getRepository('Speaker')
103. \$speakers = \$repo->findAll();
104. \$speaker = \$repo->find(\$rid);
105. \$criteria = array('Name' => 'Martin'); \$lornas = \$repo->findBy(\$criteria);
106. \$criteria = array('Name' => 'Martin', 'last_name' => 'Fowler'); \$lornaJ = \$repo->findOneBy(\$criteria);
107. https://github.com/doctrine/common/tree/master/lib/Doctrine/Common/Persistence
109. That's all, folks! David Funaro (@ingdavidino, http://davidfunaro.com) and Alessandro Nadalin (@_odino_, http://odino.org)