26. We're talking about a Property Graph
(Diagram: people such as Emil, Johan, Allison, Tobias, Lars, Andreas, Andrés, Peter, Ian, Delia and Michael, connected by "knows" relationships)
Nodes
Relationships
Properties (each a key+value)
+ Indexes (for easy look-ups)
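The four building blocks named on this slide can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how Neo4j stores data; the class name `PropertyGraph`, its methods, and the `since` property are hypothetical.

```python
class PropertyGraph:
    """Toy property graph: nodes, relationships, properties on both, and an index."""

    def __init__(self):
        self.nodes = {}          # node id -> properties (key/value pairs)
        self.relationships = []  # (start id, type, end id, properties)
        self.index = {}          # (key, value) -> set of node ids, for look-ups

    def add_node(self, **properties):
        node_id = len(self.nodes)
        self.nodes[node_id] = dict(properties)
        for key, value in properties.items():
            self.index.setdefault((key, value), set()).add(node_id)
        return node_id

    def add_relationship(self, start, rel_type, end, **properties):
        self.relationships.append((start, rel_type, end, dict(properties)))

    def find(self, key, value):
        """Index look-up: ids of all nodes whose property `key` equals `value`."""
        return sorted(self.index.get((key, value), set()))


graph = PropertyGraph()
emil = graph.add_node(name="Emil")
johan = graph.add_node(name="Johan")
# "since" is a made-up relationship property, used only for illustration.
graph.add_relationship(emil, "knows", johan, since=2001)
```

The index maps a property key and value straight to node ids, which is the "easy look-up" role the slide gives to indexes.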
32. Cypher - a graph query language
๏ a pattern-matching query language
๏ declarative grammar with clauses (like SQL)
๏ aggregation, ordering, limits
๏ create, read, update, delete
// get node with id 0
start a=node(0) return a
// traverse from node 1
start a=node(1) match (a)-->(b) return b
// return friends of friends
start a=node(1) match (a)--()--(c) return c
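To make the last two queries concrete, here is a rough Python analogue over a plain edge list. It is a sketch only: the edge data is invented, the function names are hypothetical, and it approximates rather than reproduces Cypher's matching semantics.

```python
# Hypothetical directed edges between node ids (not taken from the slides).
EDGES = [(0, 1), (1, 2), (1, 3), (2, 4), (5, 1)]


def outgoing(node):
    """Rough analogue of: start a=node(n) match (a)-->(b) return b"""
    return [end for start, end in EDGES if start == node]


def incident(node):
    """Yield (edge index, neighbour) for each edge touching node, either direction."""
    for i, (start, end) in enumerate(EDGES):
        if start == node:
            yield i, end
        elif end == node:
            yield i, start


def friends_of_friends(node):
    """Rough analogue of: start a=node(n) match (a)--()--(c) return c"""
    found = []
    for first, middle in incident(node):
        for second, other in incident(middle):
            if second != first:  # do not walk back over the edge we arrived on
                found.append(other)
    return found


print(outgoing(1))            # nodes directly reachable from node 1
print(friends_of_friends(1))  # nodes two hops away from node 1
```

The declarative query states the shape of the path; the loops above are the kind of work the database does on the caller's behalf.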
34. With love, from Sweden
๏ 2001 - a Swedish media asset project
•CTO Emil prototyped a proper graph interface
•first SQL-backed, then revised down to bare metal
•(just like Amazon -> Dynamo, Facebook -> Cassandra,
Google -> Bigtable, Sweden -> Neo4j)
๏ 2003 Neo4j went into 24/7 production
๏ 2006-2007 - Neo4j was spun off as an open source project
๏ 2009 seed funding for the company
๏ 2010 Neo4j Server was created (previously only an embedded DB)
๏ 2011 Series-A Funding, Top-Tier customers
(gratuitous name dropping)
43. Neo4j is a Graph Database
๏ A Graph Database:
• a Property Graph with Nodes, Relationships
and Properties on both
• perfect for complex, highly connected data
๏ A Graph Database:
• reliable with real ACID Transactions
• scalable: 32 Billion Nodes, 32 Billion Relationships, 64 Billion
Properties
• Server with REST API, or Embeddable on the JVM
• high-performance with High-Availability (read scaling)
54. [A] Mozilla Pancake
This Material is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the
MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/
55. [A] Mozilla Pancake
• Experimental cloud-based browser
• Built to improve how users Discover,
Collect, Share & Organize things on
the web
• Goal: help users better access &
curate information on the net, on
any device
58. Why Neo4j?
• The internet is a network of pages
connected to each other. What
better way to model that than in
graphs?
• No time lost fighting with less
expressive datastores
• Easy to implement experimental
features
59. Cute meta + data
61. Neo4j Co-Existence
• Node UUIDs used as references in external
ElasticSearch, and also in internal Lucene
• Custom search ranking for user
history based on node relationship
data
• MySQL for user data, Redis for
metrics
62. Mozilla Pancake
Available on BitBucket:
https://bitbucket.org/mozillapancake/pancake
Questions?
Olivier Yiptong:
oyiptong@mozilla.com
64. [B] ACL from Hell
One of the top 10 telcos worldwide
65. Background
Telenor calculated in 2010 that its self-service web solution for
companies, MinBedrift, would not scale with the projected
customer and subscription growth beyond 2012.
(Chart: Projected Growth plotted against the Limit, 2010 to 2015)
68. Business Case
The business case is built on the negative consequence of NOT
addressing the problem.
Loss of customers (income)
Reduced sales transactions (income)
Increased manual support (expenses)
Other
71. Current ACL Service
๏ Stored procedure in DB calculating all access
•cached results for up to 24 hours
•minutes to calculate for large customers
•extremely complex to understand (1500 lines)
•depends on temporary tables
•joins across multiple tables
72. Example Access Authorization
Access may be given directly or by inheritance
Legend: U = User, C = Customer, A = Account, S = Subscription
(Diagram: a hierarchy of Users, Customers, Accounts and Subscriptions; the access edges from Users are labeled Inherit = true or Inherit = false)
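The idea of access given "directly or by inheritance" can be sketched as a graph traversal. This is a minimal illustration, not Telenor's implementation: the ids, the `CHILDREN` and `GRANTS` tables, and the function name are hypothetical, and it assumes that Inherit = false grants access to the named resource only while Inherit = true also covers everything beneath it.

```python
# Hypothetical ownership tree: parent -> children (customers, accounts, subscriptions).
CHILDREN = {
    "C1": ["C2", "C3"],
    "C2": ["A1"],
    "C3": ["A2"],
    "A1": ["S1", "S2"],
    "A2": ["S3"],
}

# Hypothetical grants: user -> list of (resource, inherit flag).
GRANTS = {
    "U1": [("C1", True)],
    "U2": [("C2", False), ("A2", True)],
}


def accessible(user):
    """Return every resource the user may access, directly or by inheritance."""
    reached = set()
    for resource, inherit in GRANTS.get(user, []):
        reached.add(resource)  # direct access
        if inherit:
            # Walk down the tree; the hierarchy is assumed to be acyclic.
            stack = list(CHILDREN.get(resource, []))
            while stack:
                node = stack.pop()
                reached.add(node)
                stack.extend(CHILDREN.get(node, []))
    return reached


print(sorted(accessible("U1")))
print(sorted(accessible("U2")))
```

Each check only touches the part of the hierarchy below the user's grants, which is the property that makes a graph model attractive for this kind of authorization.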
75. ACL With Neo4j
๏ Faster than current solution
๏ Simpler to understand the logic
๏ Avoid large temporary tables
๏ Tailored for service (resource authorization)
78. [C] MDM within Cisco
master data management, sales compensation management, online customer support
Description
Real-time conflict detection in sales compensation management.
Business-critical “P1” system. Neo4j allows Cisco to model complex
algorithms while still maintaining high performance over a large
dataset.
Benefits
• Performance: “Minutes to Milliseconds”. Outperforms Oracle RAC, serving complex queries in real time
• Flexibility: allows Cisco to model interconnected data and complex queries with ease
• Robustness: with 9+ years of production experience, Neo4j brings a solid product
Background
Neo4j replaces Oracle RAC, which was not performant enough for the
use case.
Architecture
• 3-node Enterprise cluster with mirrored disaster recovery cluster
• Dedicated hardware in own datacenter
• Embedded in custom webapp
Sizing
• 35 million nodes
• 50 million relationships
• 600 million properties
81. Really, once you start thinking in graphs it's hard to stop
What will you build?
(Word cloud: Recommendations, MDM, Business intelligence, Geospatial, catalogs, Systems Management, access control, Social computing, your brain, Biotechnology, routing, genealogy, linguistics, Making Sense of all that data, compensation, market vectors)
Much richer data than flat history
Can replay a user’s browsing session step-by-step
Examples of experimental features: recommendations, which pages to omit, etc.
Visualization done with GraphViz
A user will have many such “stacks”
Search ranking weighs inbound and outbound node connections as part of search score calculation
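One way to picture a ranking that weighs inbound and outbound connections is to add a degree-based bonus to a base text score. The sketch below is an assumption-laden illustration, not Pancake's actual formula: the link data, the weights, and the function names are all hypothetical.

```python
# Hypothetical page-to-page links from a browsing history: (source, target).
LINKS = [("a", "b"), ("c", "b"), ("b", "d"), ("a", "d"), ("d", "b")]


def degree_counts(links):
    """Count inbound and outbound connections per page."""
    inbound, outbound = {}, {}
    for source, target in links:
        outbound[source] = outbound.get(source, 0) + 1
        inbound[target] = inbound.get(target, 0) + 1
    return inbound, outbound


def rank(text_scores, links, in_weight=0.5, out_weight=0.25):
    """Order pages by base text score plus weighted connection counts."""
    inbound, outbound = degree_counts(links)
    scored = {
        page: score
        + in_weight * inbound.get(page, 0)
        + out_weight * outbound.get(page, 0)
        for page, score in text_scores.items()
    }
    return sorted(scored, key=lambda page: scored[page], reverse=True)


# Three pages that match a query equally well on text alone.
print(rank({"a": 1.0, "b": 1.0, "d": 1.0}, LINKS))
```

With equal text scores, the page with the most connections in the user's history rises to the top, so the relationship data breaks ties that a flat history could not.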