Why Relationships are cool but the "JOIN" sucks
Luca Garulli, Founder and CEO @Orient Technologies Ltd
Author of OrientDB
Why relationships are cool but "join" sucks

Relational DBMS and Document Databases use the "JOIN" operation to connect records and documents. Is there a better way to connect things? This presentation illustrates how OrientDB manages relationships by using the same technique of Graph Databases for super fast traversal.

Published in: Technology
  • Good afternoon!
    Today I’d like to show you a new way to design a database.
    In 1970 Relational DBMS
  • Transcript of "Why relationships are cool but "join" sucks"

    1. 1. Why Relationships are cool but the “JOIN” sucks Luca Garulli – Founder and CEO @Orient Technologies Ltd Author of OrientDB www.twitter.com/lgarulli (c) Luca Garulli Licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License Page 1 www.orientechnologies.com
    2. 2. 1979: first Relational DBMS available as a product. 2009: NoSQL movement.
    3. 3. 1979: first Relational DBMS available as a product. 2009: NoSQL movement. Hey, 30 years in the IT field is huge!
    4. 4. Before 2009, teams of developers always fought over the choice of: Operating System, Programming Language, Middleware (App-Servers). What about the Database?
    5. 5. One of the main reasons RDBMS users resist moving to a NoSQL product is related to the complexity of the model: OK, NoSQL products are great for BigData and BigScale, but...
    6. 6. ...what about the model?
    7. 7. What is the NoSQL answer for managing complex domains? No relationship support: Key-Value stores? Column-Based? Document database? Graph database!
    8. 8. Why do most NoSQL products not support relationships between entities?
    9. 9. To understand why, let's see how Relational DBMSs manage them.
    10. 10. Domain: the super minimal "Selling App". Entities: Customer, Address, Order, Stock. Subsystems: Registry system, Order system.
    11. 11. Domain: the super minimal "Selling App". Entities: Customer, Address, Order, Stock. Subsystems: Registry system, Order system. How does a Relational DBMS manage relationships?
    12. 12. Relational World: 1-1 Relationships. Customer (Id = primary key, Name, Address = foreign key): (10, Luca, 34), (11, Jill, 44), (34, John, 54), (56, Mark, 66), (88, Steve, 68). Address (Id = primary key, Location): (34, Rome), (44, London), (54, Moscow), (66, New Mexico), (68, Palo Alto). JOIN Customer.Address -> Address.Id
    13. 13. Relational World: 1-N Relationships. Customer (Id, Name): (10, Luca), (11, Jill), (34, John), (56, Mark), (88, Steve). Address (Id, Customer, Location): (24, 10, Rome), (33, 10, London), (44, 34, Moscow), (66, 56, Cologne), (68, 88, Palo Alto). Inverse JOIN Address.Customer -> Customer.Id
    14. 14. Relational World: N-M Relationships. Customer (Id, Name): (10, Luca), (11, Jill), (34, John), (56, Mark), (88, Steve). CustomerAddress (Id, Address): (10, 24), (10, 33), (34, 44). Address (Id, Location): (24, Rome), (33, London), (44, Moscow), (66, Cologne), (68, Palo Alto). Additional table with 2 JOINs: (1) CustomerAddress.Id -> Customer.Id and (2) CustomerAddress.Address -> Address.Id
    15. 15. What's wrong with the Relational Model?
    16. 16. The JOIN is the evil! Customer (Id, Name): (10, Luca), (11, Jill), (34, John), (56, Mark), (88, Steve). CustomerAddress (Id, Address): (10, 24), (10, 33), (34, 24). Address (Id, Location): (24, Rome), (33, London), (44, Moscow), (66, Cologne), (68, Palo Alto). These are all JOINs executed every time you traverse a relationship!
    17. 17. A JOIN means searching for a key in another table. The first rule to improve performance is to index all the keys. An index speeds up searches, but slows down inserts, updates and deletes.
    18. 18. So in the best case a JOIN is a lookup into an index. This is done for every single join! If you traverse hundreds of relationships you're executing hundreds of JOINs.
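A minimal sketch of the N-M case above in plain Java, assuming in-memory maps stand in for the tables: `TreeMap` (a balanced tree) plays the role of the primary-key index, and the class and method names (`JoinSketch`, `locationsOf`) are hypothetical, not part of any database API. The data is taken from slide 16.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class JoinSketch {
    // Primary-key "indexes", modelled as sorted maps (TreeMap is a balanced tree).
    static final Map<Integer, String> CUSTOMER =
            new TreeMap<>(Map.of(10, "Luca", 11, "Jill", 34, "John"));
    static final Map<Integer, String> ADDRESS =
            new TreeMap<>(Map.of(24, "Rome", 33, "London", 44, "Moscow"));
    // Junction table rows: {customerId, addressId}.
    static final int[][] CUSTOMER_ADDRESS = {{10, 24}, {10, 33}, {34, 24}};

    /** Resolves the locations of one customer by joining through the junction table. */
    static List<String> locationsOf(int customerId) {
        List<String> out = new ArrayList<>();
        // JOIN 1: match CustomerAddress.Id (a plain scan here; a real RDBMS would use an index).
        for (int[] row : CUSTOMER_ADDRESS) {
            if (row[0] == customerId) {
                // JOIN 2: one tree lookup on Address.Id per matching row.
                out.add(ADDRESS.get(row[1]));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(CUSTOMER.get(10) + " -> " + locationsOf(10));
    }
}
```

Every relationship that is crossed costs at least one lookup of this kind, which is the point the slides make about traversing hundreds of relationships.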
    19. 19. Index Lookup: is it really that fast?
    20. 20. Index Lookup: how does it work? Think of an address book in which we have to find Luca's phone number. [Tree diagram: A-Z splits into A-L and M-Z]
    21. 21. Index Lookup: how does it work? Index algorithms are all similar and based on balanced trees. [Tree diagram: A-Z splits into A-L and M-Z; A-L into A-D and E-L; M-Z into M-R and S-Z]
    22. 22. Index Lookup: how does it work? [Tree diagram, next level: A-D splits into A-B and C-D; E-L into E-G and H-L]
    23. 23. Index Lookup: how does it work? [Tree diagram, next level: E-G splits into E-F and G; H-L into H-J and K-L]
    24. 24. Index Lookup: how does it work? Found! [Tree diagram, final build: the search reaches "Luca"] This lookup took 5 steps, and the number of steps grows with the index size!
    25. 25. Can you imagine how many steps a lookup operation takes in an index with millions or billions of records?
    26. 26. And this JOIN is executed for each table involved, multiplied by each scanned record!
    27. 27. Querying more tables can easily produce millions of JOINs/lookups! Here is the rule: more entries = more lookup steps = slower JOIN.
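To make "more entries = more lookup steps" concrete, here is a small sketch that counts the comparisons of a binary search, used as a stand-in for a balanced-tree index lookup. It is an assumption-laden simplification: real B-tree indexes have a much higher fan-out, so they need fewer levels, but the logarithmic growth is the same. The names `LookupSteps` and `steps` are hypothetical.

```java
final class LookupSteps {
    /** Binary search over the keys 0..n-1; returns how many comparisons it takes to find target. */
    static int steps(long n, long target) {
        long lo = 0;
        long hi = n - 1;
        int count = 0;
        while (lo <= hi) {
            long mid = lo + (hi - lo) / 2;
            count++;                      // one "step" per visited tree level
            if (mid == target) {
                return count;
            }
            if (mid < target) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return count;                     // target not present
    }

    public static void main(String[] args) {
        for (long n : new long[] {1_000L, 1_000_000L, 1_000_000_000L}) {
            System.out.println(n + " entries -> " + steps(n, n - 1) + " steps");
        }
    }
}
```

Looking up the last key takes 10 steps with a thousand entries, 20 with a million and 30 with a billion: the cost of each single lookup keeps growing with the size of the index.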
    28. 28. Oh! This is why the performance of my database drops as it becomes bigger, and bigger, and bigger!
    29. 29. What about Document Databases like MongoDB?
    30. 30. How MongoDB manages relationships: { "_id" : "292846512", "type" : "Order", "number" : 1223, "customer" : "123456789" }
    31. 31. MongoDB uses the same approach: it stores the _id of the connected documents. At run-time it looks up the _id by using an index.
    32. 32. Is there a better way to manage relationships?
    33. 33. "A graph database is any storage system that provides index-free adjacency" - Marko Rodriguez (author of TinkerPop Blueprints)
    34. 34. How does a GraphDB manage index-free relationships?
    35. 35. Every developer knows the Relational Model, but who knows the Graph one?
    36. 36. Back to school: Graph Theory crash course
    37. 37. Basic Graph: vertex "Luca" --Likes--> vertex "NoSQL Day"
    38. 38. Property Graph Model*. Edges are directed. Vertex "Luca" (name: Luca, surname: Garulli, company: Orient Tech) --Likes (since: 2013)--> vertex "NoSQL Day" (date: Nov 15° 2013). Vertices and Edges can have properties. * https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model
    39. 39. Property Graph Model. Edges between "Luca" and "NoSQL Day": Likes (since: 2013) and Speaks (title: «Switching...», abstract: «This talk presents...»). An Edge connects 2 vertices: use multiple edges to represent 1-N and N-M relationships.
    40. 40. Property Graph Model. [Graph diagram: vertices Luca, Daniel, Udine, NoSQL Day; edges Studies, located, Likes, FriendOf, Organizes]
    41. 41. Congratulations, this is your diploma in «Graph Theory»
    42. 42. Graph theory is so simple, yet so powerful.
    43. 43. Let's go back to the Graph Stuff: how does OrientDB manage relationships?
    44. 44. OrientDB: traverse a relationship. The Record ID (RID) is the physical position. RID #13:35 = Luca (vertex; label: 'Customer', name: 'Luca'). RID #13:100 = Rome (vertex; label: 'Address', name: 'Rome').
    45. 45. OrientDB: traverse a relationship. The Edge's RID is saved inside both vertices, as «out» and «in». RID #13:35 = Luca (vertex; out: [#14:54], label: 'Customer', name: 'Luca'). RID #14:54 = Lives (edge; out: [#13:35], in: [#13:100], label: 'Lives'). RID #13:100 = Rome (vertex; in: [#14:54], label: 'Address', name: 'Rome').
    46. 46. OrientDB: traverse -> outgoing. Same records as the previous slide: from Luca (#13:35), follow out: [#14:54] to the Lives edge, then the edge's in: [#13:100] to Rome.
    47. 47. OrientDB: traverse <- incoming. Same records as the previous slide: from Rome (#13:100), follow in: [#14:54] to the Lives edge, then the edge's out: [#13:35] to Luca.
    48. 48. A GraphDB handles relationships as a physical LINK to the record, assigned when the edge is created. On the other side, an RDBMS computes the relationship every time you query the database. Isn't that crazy?!
    49. 49. This means jumping from an O(log N) algorithm to a near O(1) one: the traversal cost is no longer affected by the database size! This is huge in the BigData age.
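The idea of index-free adjacency can be sketched in plain Java, with the caveat that this is an illustration and not OrientDB's storage engine: ordinary object references stand in for RIDs (physical positions), and the names `AdjacencySketch`, `outgoing` and `incoming` are hypothetical. Each vertex keeps direct links to its edges, as in the «out» and «in» fields of slide 45, so following a relationship never touches an index.

```java
import java.util.ArrayList;
import java.util.List;

final class AdjacencySketch {
    static final class Vertex {
        final String name;
        final List<Edge> out = new ArrayList<>(); // edges leaving this vertex
        final List<Edge> in = new ArrayList<>();  // edges arriving at this vertex
        Vertex(String name) { this.name = name; }
    }

    static final class Edge {
        final String label;
        final Vertex from;
        final Vertex to;
        Edge(String label, Vertex from, Vertex to) {
            this.label = label;
            this.from = from;
            this.to = to;
            // The link is stored on both sides once, when the edge is created.
            from.out.add(this);
            to.in.add(this);
        }
    }

    /** Names of the vertices reached by following outgoing edges with the given label. */
    static List<String> outgoing(Vertex v, String label) {
        List<String> names = new ArrayList<>();
        for (Edge e : v.out) {
            if (e.label.equals(label)) names.add(e.to.name);
        }
        return names;
    }

    /** Names of the vertices reached by following incoming edges with the given label. */
    static List<String> incoming(Vertex v, String label) {
        List<String> names = new ArrayList<>();
        for (Edge e : v.in) {
            if (e.label.equals(label)) names.add(e.from.name);
        }
        return names;
    }

    public static void main(String[] args) {
        Vertex luca = new Vertex("Luca");
        Vertex rome = new Vertex("Rome");
        new Edge("Lives", luca, rome);
        System.out.println(outgoing(luca, "Lives")); // Luca -> Rome
        System.out.println(incoming(rome, "Lives")); // Rome <- Luca
    }
}
```

The cost of a hop depends only on how many edges the current vertex has, not on how many vertices exist in total, which is what the near O(1) claim refers to.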
    50. 50. an Open Source (Apache licensed) document-graph NoSQL DBMS
    51. 51. OrientDB in the Blueprints micro-benchmark, on common hardware, with a hot cache, traverses 29.6 million records in less than 5 seconds: about 6 million nodes traversed per second! Do not try this at home with an RDBMS*! *unless you live in Google's server farm
    52. 52. Create the graph in SQL
        $luca> cd bin
        $luca> ./console.sh
        OrientDB console v.1.6.1 (www.orientdb.org)
        Type 'help' to display all the commands supported.
        orientdb> create vertex Customer set name = 'Luca'
        Created vertex #13:35 in 0.03 secs
        orientdb> create vertex Address set name = 'Rome'
        Created vertex #13:100 in 0.02 secs
        orientdb> create edge Lives from #13:35 to #13:100
        Created edge #14:54 in 0.02 secs
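    53. 53. Create the graph in Java
        import com.tinkerpop.blueprints.Edge;
        import com.tinkerpop.blueprints.Graph;
        import com.tinkerpop.blueprints.Vertex;
        import com.tinkerpop.blueprints.impls.orient.OrientGraph;

        // Open (or create) a local database through the Blueprints API
        Graph graph = new OrientGraph("local:/tmp/db/graph");
        // Create the two vertices and set their properties
        Vertex luca = graph.addVertex("class:Customer");
        luca.setProperty("name", "Luca");
        Vertex rome = graph.addVertex("class:Address");
        rome.setProperty("name", "Rome");
        // Connect them with a "Lives" edge, then close the database
        Edge edge = luca.addEdge("Lives", rome);
        graph.shutdown();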
    54. 54. Query the graph in SQL (the result is the set of incoming vertices)
        orientdb> select in('Lives') from Address where name = 'Rome'
        ---+------+---------+--------------------+--------------------+--------+
          #| RID  |@class   |label               |out_Lives           |in      |
        ---+------+---------+--------------------+--------------------+--------+
          0| 13:35|Customer |Luca                |[#14:54]            |        |
        ---+------+---------+--------------------+--------------------+--------+
        1 item(s) found. Query executed in 0.007 sec(s).
    55. 55. More on query power
        orientdb> select sum( out('Order').total ) from Customer where name = 'Luca'
        orientdb> traverse both('Friend') from Customer while $depth <= 7
        orientdb> select from ( traverse both('Friend') from Customer while $depth <= 7 ) where @class = 'Customer' and city.name = 'Udine'
    56. 56. Query vs traversal. Once you have a well connected database in the form of a Super Graph, you can cross records instead of querying them! All you need is a few "Root Vertices" from which to start traversing.
    57. 57. Query vs traversal. [Graph diagram: vertices Special Customers, Customers, Stocks, Luca, Mark, Jill, White Soap, Order 2332, Order 8834; callout: "This is a root vertex"]
    58. 58. Root Vertices can be enriched by Meta Graphs, which decorate graphs with additional information and make retrieval easier and faster.
    59. 59. Temporal based Meta Graph. [Graph diagram: Calendar -> Year 2013 -> Month April 2013 -> Day 9/4/2013 -> Hour 9/4/2013 09:00 and Hour 9/4/2013 10:00; Orders 2332, 2333 and 2334 are attached to the Hour vertices]
    60. 60. Location based Meta Graph. [Graph diagram: Location -> Country Italy -> Region Lazio -> State RM -> City Fiumicino and City Rome; Orders 2332, 2333 and 2334 are attached to the City vertices]
    61. 61. Mix & Merge graphs. [Graph diagram: the Location and Calendar meta graphs merged, sharing Orders 2332, 2333 and 2334]
    62. 62. Get all the orders sold in "Fiumicino" city on 9/4/2013 at 10:00. [Merged Location and Calendar graph diagram]
    63. 63. Start from Calendar, look for Hour 10:00. [Merged Location and Calendar graph diagram]
    64. 64. Start from Calendar, look for Hour 10:00. Found 2 Orders, now filter by incoming edges. [Merged Location and Calendar graph diagram]
    65. 65. Start from Calendar, look for Hour 10:00. Only "Order 2333" has incoming connections with "Fiumicino". [Merged Location and Calendar graph diagram]
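The walk-through above (start from the Hour vertex, then keep only the orders that also have an incoming edge from the City vertex) can be sketched in plain Java. The edge assignments below are assumptions chosen to match the text, not read off the diagram: Hour 10:00 points to two orders and only Order 2333 is linked to Fiumicino. `MetaGraphSketch` and `ordersAt` are hypothetical names.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class MetaGraphSketch {
    // Outgoing edges of the Hour vertices of the Calendar meta graph (assumed layout).
    static final Map<String, Set<Integer>> HOUR_TO_ORDERS = Map.of(
            "9/4/2013 09:00", Set.of(2332),
            "9/4/2013 10:00", Set.of(2333, 2334));

    // Outgoing edges of the City vertices of the Location meta graph (assumed layout).
    static final Map<String, Set<Integer>> CITY_TO_ORDERS = Map.of(
            "Fiumicino", Set.of(2332, 2333),
            "Rome", Set.of(2334));

    /** Orders reachable from the given Hour vertex that are also linked from the given City vertex. */
    static Set<Integer> ordersAt(String hour, String city) {
        // Step 1: start from the Calendar side and collect the orders of that hour.
        Set<Integer> result = new TreeSet<>(HOUR_TO_ORDERS.getOrDefault(hour, Set.of()));
        // Step 2: keep only the orders that the City vertex also points to.
        result.retainAll(CITY_TO_ORDERS.getOrDefault(city, Set.of()));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ordersAt("9/4/2013 10:00", "Fiumicino")); // [2333]
    }
}
```

Starting from the Location side instead gives the same result, because the filter is a plain set intersection; which root to start from is a matter of which side has fewer edges to scan.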
    66. 66. Or start from Location, look for Fiumicino. [Merged Location and Calendar graph diagram]
    67. 67. Start from Location, look for Fiumicino. [Merged Location and Calendar graph diagram]
    68. 68. This is your database
    69. 69. Get the last customer who bought 'Barolo': select last(out('Order').in('Customer')) from Stock where name = 'Barolo'. Result: #34:22
    70. 70. Get his country: select out('City') from #34:22. Result: Udine, Italy (#55:12)
    71. 71. Get orders from that country: select in('Customer') from #55:12
    72. 72. Let's move like a Spider on the web
    73. 73. Thanks! www.orientechnologies.com www.twitter.com/orientechno