Why relationships are cool but join sucks - Big Data & Graphs in Rome

  • Good afternoon!
    Today I’d like to show you a new way to design a database.
    In 1970 Relational DBMS

    1. Why Relationships are cool but the “JOIN” sucks. BigData & Graphs in Rome. Luca Garulli – Founder and CEO @Orient Technologies Ltd, Author of OrientDB. www.orientechnologies.com www.twitter.com/lgarulli (c) Luca Garulli. Licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.
    2. 1979: first Relational DBMS available as a product. 2009: NoSQL movement.
    3. 1979: first Relational DBMS available as a product. 2009: NoSQL movement. Hey, 30 years is a huge span in the IT field!
    4. Before 2009, teams of developers always fought over the choice of: Operating System, Programming Language, Middleware (App-Servers). What about the Database?
    5. One of the main reasons RDBMS users resist moving to a NoSQL product is the complexity of the model: OK, NoSQL products are super for BigData and BigScale, but...
    6. ...what about the model?
    7. What is the NoSQL answer to managing complex domains? Key-Value stores? Column-based? Document databases? (No relationship support.) Graph databases!
    8. Why don’t most NoSQL products support relationships between entities?
    9. To understand why, let’s see how Relational DBMSs manage them.
    10. Domain: the super minimal “Selling App”. Entities: Customer, Address, Order, Stock. Subsystems: Registry system, Order system.
    11. Domain: the super minimal “Selling App”. Entities: Customer, Address, Order, Stock. Subsystems: Registry system, Order system. How does a Relational DBMS manage relationships?
    12. Relational World: 1-1 Relationships. JOIN Customer.Address -> Address.Id
        Customer (Id = primary key, Address = foreign key)
        Id | Name  | Address
        10 | Luca  | 34
        11 | Jill  | 44
        34 | John  | 54
        56 | Mark  | 66
        88 | Steve | 68
        Address (Id = primary key)
        Id | Location
        34 | Rome
        44 | London
        54 | Moscow
        66 | New Mexico
        68 | Palo Alto
    13. Relational World: 1-N Relationships. Inverse JOIN Address.Customer -> Customer.Id
        Customer
        Id | Name
        10 | Luca
        11 | Jill
        34 | John
        56 | Mark
        88 | Steve
        Address
        Id | Customer | Location
        24 | 10       | Rome
        33 | 10       | London
        44 | 34       | Moscow
        66 | 56       | Cologne
        68 | 88       | Palo Alto
    14. Relational World: N-M Relationships. Additional table with 2 JOINs: (1) CustomerAddress.Id -> Customer.Id and (2) CustomerAddress.Address -> Address.Id
        Customer
        Id | Name
        10 | Luca
        11 | Jill
        34 | John
        56 | Mark
        88 | Steve
        Address
        Id | Location
        24 | Rome
        33 | London
        44 | Moscow
        66 | Cologne
        68 | Palo Alto
        CustomerAddress
        Id | Address
        10 | 24
        10 | 33
        34 | 44
    15. What’s wrong with the Relational Model?
    16. [Diagram: the Customer, Address and CustomerAddress tables of the N-M example, with CustomerAddress rows (10, 24), (10, 33), (34, 24) and every link between rows annotated.] These are all JOINs executed every time you traverse a relationship! The JOIN is evil!
    17. A JOIN means searching for a key in another table. The first rule to improve performance is to index all the keys. An index speeds up searches, but slows down inserts, updates and deletes.
    18. So in the best case a JOIN is a lookup into an index. This is done for every single join! If you traverse hundreds of relationships, you’re executing hundreds of JOINs.
    19. Index lookup: is it really that fast?
    20. Index Lookup: how does it work? Tree so far: A-Z splits into A-L, M-Z. Think of an address book where we have to find Luca’s phone number.
    21. Index Lookup: how does it work? Tree so far: A-Z splits into A-L, M-Z; A-L into A-D, E-L; M-Z into M-R, S-Z. Index algorithms are all similar and based on balanced trees.
    22. Index Lookup: how does it work? Tree so far: A-Z splits into A-L, M-Z; A-L into A-D, E-L; M-Z into M-R, S-Z; A-D into A-B, C-D; E-L into E-G, H-L.
    23. Index Lookup: how does it work? Tree so far: A-Z splits into A-L, M-Z; A-L into A-D, E-L; M-Z into M-R, S-Z; A-D into A-B, C-D; E-L into E-G, H-L; E-G into E-F, G; H-L into H-J, K-L.
    24. Index Lookup: how does it work? Tree: A-Z splits into A-L, M-Z; A-L into A-D, E-L; M-Z into M-R, S-Z; A-D into A-B, C-D; E-L into E-G, H-L; E-G into E-F, G; H-L into H-J, K-L; K-L contains Luca. Found! This lookup took 5 steps, and the number of steps grows with the index size!
    25. Can you imagine how many steps a lookup takes in an index with millions or billions of records?
    26. And this JOIN is executed for each table involved, multiplied by each scanned record!
    27. Querying more tables can easily produce millions of JOINs/lookups! Here is the rule: more entries = more lookup steps = slower JOIN.
    28. Oh! This is why the performance of my database drops as it becomes bigger, and bigger, and bigger!
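
The rule "more entries = more lookup steps" can be illustrated with a minimal sketch. It assumes a binary search over a sorted array as a stand-in for a balanced-tree index; real indexes are usually B-trees with wider nodes and fewer levels, but the logarithmic growth is the same. The class and method names are hypothetical and not part of any database API.

```java
final class IndexLookupSteps {

    // Builds a sorted "index" holding the keys 0..n-1 (illustrative data only).
    static long[] keys(int n) {
        long[] k = new long[n];
        for (int i = 0; i < n; i++) {
            k[i] = i;
        }
        return k;
    }

    // Returns how many comparisons a binary search needs before it finds
    // the key or gives up. This is the "number of steps" of the lookup.
    static int lookupSteps(long[] sortedKeys, long key) {
        int lo = 0;
        int hi = sortedKeys.length - 1;
        int steps = 0;
        while (lo <= hi) {
            steps++;
            int mid = (lo + hi) >>> 1;
            if (sortedKeys[mid] == key) {
                return steps;
            }
            if (sortedKeys[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // The same lookup gets more expensive as the index grows.
        for (int n : new int[] {1_000, 1_000_000}) {
            long[] index = keys(n);
            int steps = lookupSteps(index, n - 1);
            System.out.println(n + " entries -> " + steps + " steps");
        }
    }
}
```

A JOIN pays this cost once per scanned record and per joined table, which is why the total grows so quickly on large databases.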
    29. What about Document Databases like MongoDB?
    30. How MongoDB manages relationships: { "_id" : "292846512", "type" : "Order", "number" : 1223, "customer" : "123456789" }
    31. MongoDB uses the same approach: it stores the _id of the connected documents. At run-time it looks up the _id by using an index.
    32. Is there a better way to manage relationships?
    33. “A graph database is any storage system that provides index-free adjacency” - Marko Rodriguez (author of TinkerPop Blueprints)
    34. How does a GraphDB manage index-free relationships?
    35. Every developer knows the Relational Model, but who knows the Graph one?
    36. Back to school: Graph Theory crash course
    37. Basic Graph: Luca -[Likes]-> NoSQL Day
    38. Property Graph Model*. Vertex Luca (name: Luca, surname: Garulli, company: Orient Tech) -[Likes, since: 2013]-> Vertex NoSQL Day (date: Nov 15, 2013). Vertices and Edges can have properties. Edges are directed. * https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
    39. Property Graph Model. Luca -[Likes, since: 2013]-> NoSQL Day. Luca -[Speaks, title: «Switching...», abstract: «This talk presents...»]-> NoSQL Day. An Edge connects 2 vertices: use multiple edges to represent 1-N and N-M relationships.
    40. Property Graph Model. Vertices: Daniel, Luca, NoSQL Day, Udine. Edges: Likes, Organizes, FriendOf, located, Studies.
    41. Congratulations, this is your diploma in «Graph Theory»
    42. Graph theory is so simple, and yet so powerful
    43. Let’s go back to the Graph Stuff. How does OrientDB manage relationships?
    44. OrientDB: traverse a relationship. Luca (vertex): RID = #13:35, label : 'Customer', name : 'Luca'. Rome (vertex): RID = #13:100, label : 'Address', name : 'Rome'. The Record ID (RID) is the physical position.
    45. OrientDB: traverse a relationship. Luca (vertex): RID = #13:35, out : [#14:54], label : 'Customer', name : 'Luca'. Lives (edge): RID = #14:54, out : [#13:35], in : [#13:100], label : 'Lives'. Rome (vertex): RID = #13:100, in : [#14:54], label : 'Address', name : 'Rome'. The Edge’s RID is saved inside both vertices, as «out» and «in».
    46. OrientDB: traverse -> outgoing. Luca (#13:35, out : [#14:54]) -> Lives (#14:54, out : [#13:35], in : [#13:100]) -> Rome (#13:100, in : [#14:54])
    47. OrientDB: traverse <- incoming. Rome (#13:100, in : [#14:54]) <- Lives (#14:54, out : [#13:35], in : [#13:100]) <- Luca (#13:35, out : [#14:54])
    48. A GraphDB handles a relationship as a physical LINK to the record, assigned when the edge is created. On the other side, an RDBMS computes the relationship every time you query the database. Isn’t that crazy?!
    49. This means jumping from an O(log N) algorithm to a near O(1) one: the traversing cost is no longer affected by the database size! This is huge in the BigData age.
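
The idea behind the near O(1) cost can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not OrientDB's storage engine: records live in one in-memory list, the "RID" is simply the position in that list, and the names `AdjacencySketch`, `Rec`, `outgoing` and `incoming` are hypothetical. As on the slides, each vertex stores the positions of its edges in «out» and «in», and the edge stores the positions of its two vertices.

```java
import java.util.ArrayList;
import java.util.List;

final class AdjacencySketch {

    // One stored record: either a vertex or an edge.
    static final class Rec {
        final String label;
        final String name;
        final List<Integer> out = new ArrayList<>();
        final List<Integer> in = new ArrayList<>();

        Rec(String label, String name) {
            this.label = label;
            this.name = name;
        }
    }

    // The record position in this list plays the role of the RID.
    private final List<Rec> store = new ArrayList<>();

    int addVertex(String label, String name) {
        store.add(new Rec(label, name));
        return store.size() - 1;
    }

    // Creates the edge record and saves its position inside both vertices.
    int addEdge(String label, int from, int to) {
        int edge = addVertex(label, null);
        store.get(edge).out.add(from);
        store.get(edge).in.add(to);
        store.get(from).out.add(edge);
        store.get(to).in.add(edge);
        return edge;
    }

    // Follows outgoing edges by position: no key search, no index.
    List<String> outgoing(int vertex, String label) {
        List<String> names = new ArrayList<>();
        for (int e : store.get(vertex).out) {
            Rec edge = store.get(e);
            if (edge.label.equals(label)) {
                for (int v : edge.in) {
                    names.add(store.get(v).name);
                }
            }
        }
        return names;
    }

    // Same walk in the opposite direction.
    List<String> incoming(int vertex, String label) {
        List<String> names = new ArrayList<>();
        for (int e : store.get(vertex).in) {
            Rec edge = store.get(e);
            if (edge.label.equals(label)) {
                for (int v : edge.out) {
                    names.add(store.get(v).name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        AdjacencySketch g = new AdjacencySketch();
        int luca = g.addVertex("Customer", "Luca");
        int rome = g.addVertex("Address", "Rome");
        g.addEdge("Lives", luca, rome);
        System.out.println(g.outgoing(luca, "Lives"));
        System.out.println(g.incoming(rome, "Lives"));
    }
}
```

Each hop is a positional access whose cost depends on the number of edges of the vertex, not on how many records the store holds.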
    50. OrientDB: an Open Source (Apache licensed) document-graph NoSQL DBMS
    51. OrientDB, in the Blueprints micro-benchmark, on common hardware, with a hot cache, traverses 29.6 million records in less than 5 seconds: about 6 million nodes traversed per second! Do not try this at home with an RDBMS*! (*unless you live in Google’s server farm)
    52. Create the graph in SQL
        $luca> cd bin
        $luca> ./console.sh
        OrientDB console v.1.6.1 (www.orientdb.org)
        Type 'help' to display all the commands supported.
        orientdb> create vertex Customer set name = 'Luca'
        Created vertex #13:35 in 0.03 secs
        orientdb> create vertex Address set name = 'Rome'
        Created vertex #13:100 in 0.02 secs
        orientdb> create edge Lives from #13:35 to #13:100
        Created edge #14:54 in 0.02 secs
    53. Create the graph in Java
        import com.tinkerpop.blueprints.Edge;
        import com.tinkerpop.blueprints.Graph;
        import com.tinkerpop.blueprints.Vertex;
        import com.tinkerpop.blueprints.impls.orient.OrientGraph;

        // Open (or create) a local graph database
        Graph graph = new OrientGraph("local:/tmp/db/graph");
        // Create the two vertices and set their properties
        Vertex luca = graph.addVertex( "class:Customer" );
        luca.setProperty( "name", "Luca" );
        Vertex rome = graph.addVertex( "class:Address" );
        rome.setProperty( "name", "Rome" );
        // Connect them with a "Lives" edge, then close the database
        Edge edge = luca.addEdge( "Lives", rome );
        graph.shutdown();
    54. Query the graph in SQL (incoming vertices)
        orientdb> select in('Lives') from Address where name = 'Rome'
        ---+------+---------+--------------------+--------------------+--------+
          #| RID  |@class   |label               |out_Lives           |in      |
        ---+------+---------+--------------------+--------------------+--------+
          0| 13:35|Customer |Luca                |[#14:54]            |        |
        ---+------+---------+--------------------+--------------------+--------+
        1 item(s) found. Query executed in 0.007 sec(s).
    55. More on query power
        orientdb> select sum( out('Order').total ) from Customer where name = 'Luca'
        orientdb> traverse both('Friend') from Customer while $depth <= 7
        orientdb> select from ( traverse both('Friend') from Customer while $depth <= 7 ) where @class='Customer' and city.name = 'Udine'
    56. Query vs traversal. Once you have a well-connected database in the form of a Super Graph, you can cross records instead of querying them! All you need is a few “Root Vertices” from which to start traversing.
    57. Query vs traversal. [Diagram with the vertices Customers, Special Customers, Luca, Mark, Jill, Order 2332, Order 8834, White Soap and Stocks, and the callout “This is a root vertex”.]
    58. Root Vertices can be enriched by Meta Graphs, which decorate graphs with additional information and make retrieval easier and faster.
    59. Temporal based Meta Graph. Vertices: Calendar; Year 2013; Month April 2013; Day 9/4/2013; Hour 9/4/2013 09:00; Hour 9/4/2013 10:00; Order 2332; Order 2333; Order 2334.
    60. Location based Meta Graph. Vertices: Location; Country Italy; Region Lazio; State RM; City Rome; City Fiumicino; Order 2332; Order 2333; Order 2334.
    61. Mix & Merge graphs. [Diagram: the Location and Calendar meta graphs merged, sharing Order 2332, Order 2333 and Order 2334.]
    62. Get all the orders sold in “Fiumicino” city on 9/4/2013 at 10:00. [Diagram: the merged Location and Calendar graph.]
    63. Start from Calendar, look for Hour 10:00. [Diagram: the merged Location and Calendar graph.]
    64. Start from Calendar, look for Hour 10:00. Found 2 Orders, now filter by incoming edges. [Diagram: the merged Location and Calendar graph.]
    65. Start from Calendar, look for Hour 10:00. Only “Order 2333” has incoming connections with “Fiumicino”. [Diagram: the merged Location and Calendar graph.]
    66. Or start from Location, look for Fiumicino. [Diagram: the merged Location and Calendar graph.]
    67. Start from Location, look for Fiumicino. [Diagram: the merged Location and Calendar graph.]
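
The "start from a meta vertex, then filter by incoming edges" walk can be sketched as a set intersection. This is a minimal sketch with hypothetical names (`MetaGraphSketch`, `link`, `ordersAt`) and partly assumed data: the slides state only that Hour 10:00 reaches two orders and that only Order 2333 is connected to Fiumicino, so every other link below is an assumption made to complete the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class MetaGraphSketch {

    // Meta vertex -> orders it points to (the edges of both meta graphs).
    static final Map<String, Set<String>> LINKS = new HashMap<>();

    static void link(String metaVertex, String order) {
        LINKS.computeIfAbsent(metaVertex, k -> new LinkedHashSet<>()).add(order);
    }

    static {
        // Calendar meta graph (the 09:00 link is an assumption).
        link("Hour 9/4/2013 09:00", "Order 2332");
        link("Hour 9/4/2013 10:00", "Order 2333");
        link("Hour 9/4/2013 10:00", "Order 2334");
        // Location meta graph (the Order 2332 and Order 2334 links are assumptions).
        link("City Fiumicino", "Order 2332");
        link("City Fiumicino", "Order 2333");
        link("City Rome", "Order 2334");
    }

    // Start from the hour vertex, then keep only the orders that also
    // have an incoming edge from the city vertex.
    static Set<String> ordersAt(String hourVertex, String cityVertex) {
        Set<String> result = new LinkedHashSet<>(
                LINKS.getOrDefault(hourVertex, Collections.emptySet()));
        result.retainAll(LINKS.getOrDefault(cityVertex, Collections.emptySet()));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ordersAt("Hour 9/4/2013 10:00", "City Fiumicino"));
    }
}
```

Starting from Location instead of Calendar gives the same answer; in practice one would likely start from whichever meta vertex has fewer connected orders.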
    68. Recommendation system. Vertices: Luca, Jill, Enrico. Edges: Friend, Friend.
    69. Recommendation system. Vertices: Luca, Jill, Enrico, Da Carlone, La Mediterranea, Meridionale, Eaitaly. Edges: Friend (2), Eats (5).
    70. Recommendation system. [Same diagram.] select both('Friend') from Person where name = 'Luca'
    71. Recommendation system. [Same diagram.] select both('Friend').out('Eats') from Person where name = 'Luca'
    72. Recommendation system. [Same diagram.] select both('Friend').out('Eats') from Person where name = 'Luca'
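
What the two-hop expression both('Friend').out('Eats') computes can be approximated in plain Java. This is a hedged sketch, not OrientDB's implementation: the class `RecommendationSketch` and its methods are hypothetical, the result is returned as a sorted set (so duplicates collapse, which a real query result may not do), and the Friend and Eats assignments in `main` are invented because the transcript does not say who eats where.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RecommendationSketch {

    // label -> (vertex -> neighbours), one map per direction.
    private final Map<String, Map<String, Set<String>>> outEdges = new HashMap<>();
    private final Map<String, Map<String, Set<String>>> inEdges = new HashMap<>();

    void addEdge(String label, String from, String to) {
        outEdges.computeIfAbsent(label, k -> new HashMap<>())
                .computeIfAbsent(from, k -> new TreeSet<>()).add(to);
        inEdges.computeIfAbsent(label, k -> new HashMap<>())
                .computeIfAbsent(to, k -> new TreeSet<>()).add(from);
    }

    // Vertices reached by following edges with this label forward.
    Set<String> out(String label, String vertex) {
        return outEdges.getOrDefault(label, Collections.emptyMap())
                .getOrDefault(vertex, Collections.emptySet());
    }

    // Vertices reached in either direction, like both('Friend').
    Set<String> both(String label, String vertex) {
        Set<String> result = new TreeSet<>(out(label, vertex));
        result.addAll(inEdges.getOrDefault(label, Collections.emptyMap())
                .getOrDefault(vertex, Collections.emptySet()));
        return result;
    }

    // Two hops: every friend, then every place that friend eats at.
    Set<String> friendsEat(String person) {
        Set<String> places = new TreeSet<>();
        for (String friend : both("Friend", person)) {
            places.addAll(out("Eats", friend));
        }
        return places;
    }

    public static void main(String[] args) {
        RecommendationSketch g = new RecommendationSketch();
        // Invented sample data, for illustration only.
        g.addEdge("Friend", "Luca", "Jill");
        g.addEdge("Friend", "Enrico", "Luca");
        g.addEdge("Eats", "Jill", "Da Carlone");
        g.addEdge("Eats", "Enrico", "Meridionale");
        g.addEdge("Eats", "Luca", "Eaitaly");
        System.out.println(g.friendsEat("Luca"));
    }
}
```

Because both() follows the Friend edge in either direction, it does not matter which of the two people created the friendship.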
    73. This is your database
    74. Get the last customer who bought 'Barolo': select last(out('Order').in('Customer')) from Stock where name = 'Barolo' -> #34:22
    75. Get his country: select out('City') from #34:22 -> Udine, Italy #55:12
    76. Get orders from that country: select in('Customer') from #55:12
    77. Let’s move like a spider on the web
    78. OrientDB = { flexibility of Document databases + complexity of the Graph model + Object Oriented concepts + super fast Index + powerful SQL dialect + multi-master replication and sharding }
    79. Ø config: download, unzip, run! Cut & paste the db.
    80. 150,000 records per second (flat records, no index, on commodity hw)
    81. Schema-less: schema is not mandatory, relaxed model, collect heterogeneous documents all together
    82. Schema-full: schema with constraints on fields and validation rules. Customer.age > 17; Customer.address not null; Customer.surname is mandatory; Customer.email matches '\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b'
    83. Schema-mixed: schema with mandatory and optional fields + constraints; the best of schema-less and schema-full modes
    84. ACID Transactions
        db.begin();
        try {
            // your code ...
            db.commit();
        } catch( Exception e ) {
            db.rollback();
        }
    85. SQL: select * from employee where name like '%Jay%' and status=0
    86. Why invent yet another language when 100% of developers already know SQL? OrientDB starts from SQL and extends it with new operators for graph manipulation.
    87. For most of the queries a programmer needs every day, SQL is simpler, more readable and more compact than scripting (Map/Reduce).
    88. SQL & relationships
        select from Account where address.city.country.name = 'Italy'
        select from Account where addresses contains (city.country.name = 'Italy')
    89. SQL & trees/graphs
        select out('friend') from V where name = 'Luca' and surname = 'Garulli'
        select out[@class='knows'] from V where name = 'Jay' and surname = 'Miner'
        traverse friends from #13:55 where $depth < 7
    90. SQL sub queries
        select from ( traverse friends from Profile where $depth < 7 ) where home.city.name = 'Cologne'
    91. SQL & strings
        select from Profile where name.toUpperCase() = 'LUCA'
        select from City where country.name.substring(1,3).toUpperCase() = 'TAL'
        select from Agenda where phones contains ( number.indexOf( '+39' ) > -1 )
        select from Agenda where email matches '\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b'
    92. SQL & schema-less
        select from Profile where any() like '%Jay%'
        select from Stock where all() is not null
    93. SQL & collections
        select from Tree where children contains ( married = true )
        select from Tree where children containsAll ( married = true )
        select from User where roles containsKey 'shutdown'
        select from Graph where edges.size() > 0
    94. Native JSON
        ODocument doc = new ODocument().fromJSON( "{ '@rid' : '26:10', '@class' : 'Developer', 'name' : 'Luca', 'surname' : 'Garulli', 'out' : [ #10:33, #10:232 ] }" );
    95. Always Free: Open Source, Apache 2 license, free for any purpose, even commercial
    96. Some clients: Kondoot, Scenari
    97. Thanks! Luca Garulli – Founder and CEO. www.orientechnologies.com www.twitter.com/lgarulli
