Graph All the Things: An Introduction to Graph Databases
1. Graph All The Things
Introduction to Graph Databases
Neo4j Graph Day 2014
New York
Utpal Bhatt
VP Marketing, Neo4j
@bhatt_utpal
#neo4j
2. Neo Technology, Inc Confidential
Creator of Neo4j, the world’s leading graph database.
130 subscription customers, 40+ Global 2000 customers in production.
Open source with 40,000+ downloads per month.
$25M raised to date from Fidelity Growth Partners Europe (London), Sunstone (Copenhagen) and Conor (Helsinki).
70 people, offices in Munich, Malmö (Sweden), London, Paris and San Francisco (HQ).
NEO TECHNOLOGY: CREATORS OF NEO4J
COMPANY OVERVIEW
“By 2017 more than 25% of enterprises will use graph databases.”
3.
WHEN DO YOU NEED A GRAPH
DATABASE?
When your business depends on Relationships in Data
4.
BUSINESS IMPACT OF USING
RELATIONSHIPS IN DATA
5.
[Pie chart: segments A to F]
USING RELATIONSHIP INFORMATION
IN THE CONSUMER WEB
6.
USING RELATIONSHIP INFORMATION
IN THE CONSUMER WEB
7.
USING RELATIONSHIP INFORMATION
IN THE CONSUMER WEB
8.
Unlocking The Business Potential Of
Relationships In Data
Graph databases are purpose-built to manage data and its relationships
Wide-column stores: Data is mapped by a row key, column key and timestamp.
Key-value stores: Store keys and associated values.
Graph databases: Store data and the relationships between data.
Document stores: Store all data related to a specific key as a single document.
DATA MODEL RICHNESS
Adapted from the 451 Group
UNLOCKING THE POTENTIAL
OF RELATIONSHIPS IN DATA
9.
THE PROPERTY GRAPH
MODEL
10.
Ann Loves Dan
THE PROPERTY GRAPH
MODEL
11.
Ann Loves Dan
THE PROPERTY GRAPH
MODEL
12.
Ann Loves Dan
(Ann) -[:LOVES]-> (Dan)
THE PROPERTY GRAPH
MODEL
13.
Ann Loves Dan
(:Person {name:"Ann"}) -[:LOVES]-> (:Person {name:"Dan"})
THE PROPERTY GRAPH
MODEL
15.
Ann Loves Dan
(:Person {name:"Ann"}) -[:LOVES]-> (:Person {name:"Dan"})
Node Relationship Node
THE PROPERTY GRAPH
MODEL
16.
Ann Loves Dan
(:Person {name:"Ann"}) -[:LOVES]-> (:Person {name:"Dan"})
Node Relationship Node
label, property / type / label, property
THE PROPERTY GRAPH
MODEL
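The building blocks named on these slides (nodes with labels and properties, connected by typed relationships) can be sketched as plain data structures. This is only an illustration in Python, not how Neo4j stores data; the class names `Node` and `Relationship` and their fields are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node carrying a set of labels and a dict of properties (hypothetical class)."""
    labels: frozenset
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection between two nodes (hypothetical class)."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Mirrors (:Person {name:"Ann"}) -[:LOVES]-> (:Person {name:"Dan"})
ann = Node(labels=frozenset({"Person"}), properties={"name": "Ann"})
dan = Node(labels=frozenset({"Person"}), properties={"name": "Dan"})
loves = Relationship(start=ann, type="LOVES", end=dan)

print(loves.start.properties["name"], loves.type, loves.end.properties["name"])
```

The relationship holds direct references to its start and end nodes, so following it requires no lookup by key.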
17.
Query: Whom does Ann love?
(:Person {name:"Ann"})-[:LOVES]->(whom)
CYPHER
18.
Query: Whom does Ann love?
MATCH (:Person {name:"Ann"})-[:LOVES]->(whom)
CYPHER
19.
Query: Whom does Ann love?
MATCH (:Person {name:"Ann"})-[:LOVES]->(whom)
RETURN whom
CYPHER
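What the MATCH ... RETURN query asks for can be approximated over a tiny in-memory graph: find a start node with the given label and property, follow relationships of the given type, and return what sits at the other end. The sketch below is a simplified Python illustration, not Neo4j's query engine; the `nodes` and `relationships` structures and the `match_outgoing` helper are hypothetical.

```python
# Hypothetical in-memory graph: node id -> (label, properties)
nodes = {
    1: ("Person", {"name": "Ann"}),
    2: ("Person", {"name": "Dan"}),
}
# Relationships stored as (start node id, type, end node id)
relationships = [(1, "LOVES", 2)]


def match_outgoing(label, key, value, rel_type):
    """Return the properties of nodes reached via rel_type from matching start nodes."""
    results = []
    for start, r_type, end in relationships:
        start_label, start_props = nodes[start]
        if r_type == rel_type and start_label == label and start_props.get(key) == value:
            results.append(nodes[end][1])
    return results


# Roughly: MATCH (:Person {name:"Ann"})-[:LOVES]->(whom) RETURN whom
whom = match_outgoing("Person", "name", "Ann", "LOVES")
print(whom)
```

The pattern in the query reads like the picture of the graph; the helper simply spells out the filtering that the pattern implies.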
20.
CYPHER
21.
MATCH (:Person {name:"Ann"})-[:LOVES]->(whom) RETURN whom
Cypher
Native graph processing
Native storage
UNDER THE HOOD
22. *“Find all direct reports and how many they manage, up to 3 levels down”
Example HR Query (using SQL)
23. *“Find all direct reports and how many they manage, up to 3 levels down”
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
FROM (
SELECT manager.pid AS directReportees, 0 AS count
FROM person_reportee manager
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
UNION
SELECT manager.pid AS directReportees, count(manager.directly_manages) AS count
FROM person_reportee manager
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT manager.pid AS directReportees, count(reportee.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee reportee
ON manager.directly_manages = reportee.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT manager.pid AS directReportees, count(L2Reportees.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
) AS T
GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
FROM (
SELECT manager.directly_manages AS directReportees, 0 AS count
FROM person_reportee manager
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
UNION
SELECT reportee.pid AS directReportees, count(reportee.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee reportee
ON manager.directly_manages = reportee.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT L1Reportees.pid AS directReportees,
count(L2Reportees.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
) AS T
GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
FROM(
SELECT reportee.directly_manages AS directReportees, 0 AS count
FROM person_reportee manager
JOIN person_reportee reportee
ON manager.directly_manages = reportee.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
UNION
SELECT L2Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
GROUP BY directReportees
) AS T
GROUP BY directReportees)
UNION
(SELECT L2Reportees.directly_manages AS directReportees, 0 AS count
FROM person_reportee manager
JOIN person_reportee L1Reportees
ON manager.directly_manages = L1Reportees.pid
JOIN person_reportee L2Reportees
ON L1Reportees.directly_manages = L2Reportees.pid
WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
)
Example HR Query (using SQL)
24. MATCH (boss)-[:MANAGES*0..3]->(sub),
(sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate, count(report) AS Total
Same Query in Cypher
*“Find all direct reports and how many they manage, up to 3 levels down”
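The variable-length pattern in the Cypher query amounts to a depth-bounded traversal. As a rough illustration only, the Python sketch below walks a small org chart breadth-first and counts, for the boss and each subordinate up to 3 levels down, how many people they manage up to 3 levels below them. The org chart data and the function names are hypothetical, and the sketch assumes a strict tree (each person has one manager), so counting people is the same as counting matched paths.

```python
from collections import deque

# Hypothetical org chart: manager -> people they directly manage.
MANAGES = {
    "John Doe": ["Alice", "Bob"],
    "Alice": ["Carol", "Dave"],
    "Carol": ["Erin"],
    "Erin": ["Frank"],
}


def reports_within(manager, max_depth):
    """Breadth-first search: everyone below `manager`, at most `max_depth` levels down."""
    found = []
    queue = deque([(manager, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not descend past the depth limit
        for report in MANAGES.get(person, []):
            found.append(report)
            queue.append((report, depth + 1))
    return found


def report_counts(boss, max_depth=3):
    """Count reports for the boss and for each subordinate within `max_depth` levels."""
    counts = {}
    for sub in [boss] + reports_within(boss, max_depth):
        total = len(reports_within(sub, max_depth))
        if total:  # the pattern needs at least one report to produce a row
            counts[sub] = total
    return counts


print(report_counts("John Doe"))
```

The depth limit is a single parameter here, much as it is a single `*1..3` in the pattern, whereas the SQL version needs another join for every extra level.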
26.
“Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code.”
- Volker Pacher, Senior Developer, eBay
But what about
the Real World
27.
UTILIZING RELATIONSHIPS FOR
RECOMMENDATIONS
The Need
Clear and performant access to customer, purchase and interest data to make recommendations
The Solution
Utilize relationships in data to enable context-rich recommendations
[Diagram: Customers (David, Jane, Susan) linked by Purchased to Orders (54, 55, 56), which link to Products (Monster energy drink, Low fat frozen yogurt), which link to Categories (Beverages, Dairy and eggs, Frozen, Weight Management)]
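One simple way to turn purchase relationships into recommendations is to look at what other customers with a shared purchase have bought. The Python sketch below illustrates that idea only; it is not the approach used by any particular Neo4j customer. The assignment of products to customers is invented for the example (the slide does not say who bought what), and orders are collapsed into a direct customer-to-product mapping for brevity.

```python
from collections import Counter

# Hypothetical purchase data: which customer bought which product is assumed.
PURCHASED = {
    "David": {"Monster energy drink", "Low fat frozen yogurt"},
    "Jane": {"Monster energy drink"},
    "Susan": {"Low fat frozen yogurt"},
}


def recommend(customer):
    """Suggest products bought by customers who share a purchase with `customer`."""
    own = PURCHASED[customer]
    scores = Counter()
    for other, products in PURCHASED.items():
        if other == customer or not (own & products):
            continue  # skip the customer themselves and anyone with no overlap
        for product in products - own:
            scores[product] += 1  # one vote per overlapping customer
    return [product for product, _ in scores.most_common()]


print(recommend("Jane"))
```

In a graph database the same question is a short traversal from customer to product to other customers and back to their products, which is why the slide frames it as a relationship problem.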
28.
[Chart: Total Dollar Amount vs. Transaction Count, with two regions marked “Investigate”]
UNCOVERING FRAUD RINGS
29.
[Diagram: a logistics network modeled as a graph, with points A and B]
Utilize relationship information in your logistics network to minimize time
and maximize use of your network
MANAGING SUPPLY CHAIN AND
LOGISTICS
30.
The Problem
Instantly diagnose problems across 1B+ element networks
The Solution
Utilizing relationships to diagnose problems and gauge their impact
POWERING THE INTERNET OF
THINGS
31.
Identity and Access Control
Network Diagnostics
Graph-based Search
Master Data Management
MORE ENTERPRISE EXAMPLES
32.
Consumer Web Giants Depend on Five Graphs
Social Graph
Interest Graph
Payment Graph
Intent Graph
Mobile Graph
Ref: http://www.gartner.com/id=2081316
GARTNER’S 5 GRAPHS
CONSUMER GIANTS DEPEND UPON 5 THINGS
34.
GRAPH DATABASES - THE FASTEST
GROWING DBMS CATEGORY
Source: http://db-engines.com/en/ranking/graph+dbms
35.
[Chart: % of enterprises using graph databases: 0% (2011), 2.5% (2014), 25% (2017)]
“Forrester estimates that over 25% of
enterprises will be using graph
databases by 2017”
Sources
• Forrester TechRadar™: Enterprise DBMS, Feb 13 2014 (http://www.forrester.com/TechRadar+Enterprise+DBMS+Q1+2014/fulltext/-/E-RES106801)
• Dataversity, Mar 31 2014: “Deconstructing NoSQL: Analysis of a 2013 Survey on the Use, Production and Assessment of NoSQL Technologies in the Enterprise” (http://www.dataversity.net)
• Neo Technology customer base in 2011 and 2014
• Estimation of other graph vendors’ customer base in 2011 and 2014 based on best available intelligence
“25% of survey respondents said
they plan to use Graph databases in
the future.”
GRAPH DATABASES - POWERING
THE ENTERPRISE
36.
Ref: Gartner, ‘IT Market Clock for Database Management Systems, 2014,’ September 22, 2014
https://www.gartner.com/doc/2852717/it-market-clock-database-management
“Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.”
GRAPH DATABASES - CAN
TRANSFORM YOUR BUSINESS
37.
When your business depends on Relationships in Data
SUMMARY