Neo4j: What’s Under the Hood
How knowing this can help you
Philip Rathle
VP of Product Management
@prathle
1. Choose the right technology tool for the job
2. Solve intractable problems: (Business) <--> (IT)
3. Identify new business opportunities
(Perspectives)-[:Shape]->(Understanding)
1. A Historical Perspective
Data Management in 1979
Paper Forms
Tiny RAM
Spinning Platters (Low Capacity / Slow, Sequential IO)
RDBMS
Relational Model
The RDBMS Era
Confidential - Neo4j, Inc.
Data Management Today
Dynamic Real-World Systems
Abundant RAM
Flash & IO Co-Processors (High-Capacity Storage & Ultra-Fast Random I/O)
A New Graph Era Emerging
Neo4j
Property Graph Model
Real-Time
Connected Data
2. An IT Portfolio Perspective
IT Portfolio Perspective
TRADITIONAL DATABASES: Store and retrieve data. Real time storage & retrieval. Max # of hops: up to 3.
IT Portfolio Perspective
TRADITIONAL DATABASES: Store and retrieve data. Real time storage & retrieval. Max # of hops: up to 3.
BIG DATA TECHNOLOGY: Aggregate and filter data. Long running queries, aggregation & filtering. Max # of hops: 1.
IT Portfolio Perspective
TRADITIONAL DATABASES: Store and retrieve data. Real time storage & retrieval. Max # of hops: up to 3.
BIG DATA TECHNOLOGY: Aggregate and filter data. Long running queries, aggregation & filtering. Max # of hops: 1.
Connections in data: Real-Time Connected Insights. Max # of hops: millions.
“Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code”
Volker Pacher, Senior Developer
Illustration by David Somerville based on the original by Hugh MacLeod (@gapingvoid)
RDBMS & Aggregate-Oriented NoSQL
Hadoop / EDW / Columnar RDBMS
Graph Database & Graph Compute Engine (Graph Transactions & Analytics)
3. A Technical Architecture Perspective
Core Technology Differences
What Makes Neo4j Different?
Index-Free Adjacency
This Enables: “Minutes to Milliseconds” Real-Time Query Performance
[Chart: Response Time vs. Connectedness and Size of Data Set]
Relational and Other NoSQL Databases: 0 to 2 hops, 0 to 3 degrees, thousands of connections
Neo4j: tens to hundreds of hops, thousands of degrees, billions of connections
1000x Advantage: “Minutes to milliseconds”
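To make the index-free adjacency idea concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not Neo4j's storage format or API: the `Node` class and both function names are hypothetical. The first traversal follows neighbour references directly; the second mimics a join over an edge table with no index at all, which is a deliberately pessimistic stand-in (a real relational database would normally pay an index lookup per hop rather than a full scan).

```python
from collections import deque


class Node:
    """A record that stores direct references to its neighbours (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.out = []  # outgoing neighbours, held as object references


def reachable_by_pointer(start, max_hops):
    """Follow neighbour references breadth-first; work tracks the edges visited."""
    seen = {start.name}
    found = []
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in node.out:
            if neighbour.name not in seen:
                seen.add(neighbour.name)
                found.append(neighbour.name)
                frontier.append((neighbour, depth + 1))
    return found


def reachable_by_table_scan(edge_rows, start_name, max_hops):
    """Join-style equivalent: every hop re-scans the whole edge table."""
    seen = {start_name}
    found = []
    frontier = [start_name]
    for _ in range(max_hops):
        next_frontier = []
        for name in frontier:
            for src, dst in edge_rows:  # full scan per frontier entry
                if src == name and dst not in seen:
                    seen.add(dst)
                    found.append(dst)
                    next_frontier.append(dst)
        frontier = next_frontier
    return found
```

Both functions return the same nodes; the difference is that the pointer version touches only the edges leaving visited nodes, while the scan version's cost per hop grows with the total size of the edge table.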
And is Supported By: ACID Graph Writes: A Requirement for Graph Transactions
ACID Consistency: Maintains Integrity Over Time. Guaranteed Graph Consistency.
Non ‘Graph-ACID’ DBMSs: Becomes Corrupt Over Time. Not ‘Good Enough’ for Graphs.
What Is Different In Neo4j?
Cypher Query Language
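// Find John Doe and everyone up to three management levels below him,
// then count the reports one to three levels below each of those people.
MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate,
       count(report) AS Total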
Project Impact
Less time writing queries:
• More time understanding the answers
• Leaving time to ask the next question
Less time debugging queries:
• More time writing the next piece of code
• Improved quality of overall code base
Code that’s easier to read:
• Faster ramp-up for new project members
• Improved maintainability & troubleshooting
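For readers new to variable-length patterns, the following Python sketch walks through what the Cypher query on this slide computes. It is a rough approximation under stated assumptions, not how Neo4j executes the query: the management structure is assumed to be a tree (each person has at most one manager), names are assumed unique, and the `manages` dictionary and both function names are hypothetical.

```python
def descendants(manages, person, min_hops, max_hops):
    """People reachable from `person` in min_hops..max_hops MANAGES steps (tree assumed)."""
    result = [person] if min_hops == 0 else []
    frontier = [person]
    for depth in range(1, max_hops + 1):
        # Move one management level down from everyone on the current level.
        frontier = [r for p in frontier for r in manages.get(p, [])]
        if depth >= min_hops:
            result.extend(frontier)
    return result


def subordinate_totals(manages, boss):
    """Mirror of: RETURN sub.name AS Subordinate, count(report) AS Total."""
    totals = {}
    for sub in descendants(manages, boss, 0, 3):      # [:MANAGES*0..3]
        reports = descendants(manages, sub, 1, 3)     # [:MANAGES*1..3]
        if reports:  # MATCH yields no row for a sub without any reports
            totals[sub] = len(reports)
    return totals


# Hypothetical sample data: manager -> direct reports.
manages = {
    "John Doe": ["Ann", "Bob"],
    "Ann": ["Cid", "Dee"],
    "Cid": ["Eve"],
}
print(subordinate_totals(manages, "John Doe"))
```

Because the first pattern starts at zero hops, John Doe appears in the Subordinate column himself. The two nested helper calls are what the two-line MATCH pattern replaces.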
Neo4j Graph Database: Foundational Components
1. Index-Free Adjacency: In memory and on flash/disk
2. ACID Foundation: Required for safe writes
3. Full-Stack Clustering: Causal consistency
4. Language, Drivers, Tooling: Developer Experience, Graph Efficiency, Type Safety
5. Graph Engine: Cost-Based Optimizer, Graph Statistics, Cypher Runtime
6. Hardware Optimizations: For next-gen infrastructure
Neo4j Graph Database: Enterprise Infrastructure
• Neo4j Security Foundation
• Multi-Clustering Support for Global Internet Apps
• Rolling Upgrades: Neo4j 3.4 now supports rolling upgrades (3.4 to 3.5). Upgrade older instances while keeping other members stable and without requiring a restart of the environment.
• Schema Constraints
• Concurrent/Transactional Write Performance
• Auto Cache Reheating: For Restarts, Restores and Cluster Expansion
What Parts of the Organization Will Use Graphs
The Connected Enterprise
Consumers of Connected Data
AI & Graph Analytics
• Sentiment analysis
• Customer segmentation
• Machine learning
• Cognitive computing
• Community detection
Transactional Graphs
• Fraud detection
• Real-time recommendations
• Network and IT operations management
• Knowledge Graphs
• Master Data Management
Discovery & Visualization
• Fraud detection
• Network and IT operations
• Product information management
• Risk and portfolio analysis
Data Scientists / Business Users / Applications
What Neo4j Does: Enables the Connected Enterprise
Neo4j Graph Platform
• Development & Administration
• Analytics Tooling
• Graph Analytics
• Graph Transactions
• Data Integration
• Discovery & Visualization
• Drivers & APIs
• AI
The Neo4j Desktop Platform
Graph Apps for Technologies & End Users
The easiest way to start building Neo4j apps
Includes Neo4j Enterprise for Development, Neo4j Browser, APOC, Graph Algorithms, and more
https://neo4j.com/download
The Visual Power of Graphs
Graph Visualization Options with Neo4j
Neo4j Bloom
• Provided by Neo4j
• Exclusively optimized for Neo4j graphs
• Deploys easily in Neo4j Desktop
• Focused on graph exploration thru a code-free UI
• Near natural language search
• Currently caters to data analysts and graph SMEs
• Currently for individual or small team use
Viz Toolkits
• 3rd party, e.g. vis.js, d3.js, Keylines
• Some offer data hooks into Neo4j, others may require custom integration
• Offer robust APIs for flexible control of the viz output
• Cater to developers who will create a custom solution, usually with limited interactivity
• Departmental, enterprise or public use
BI Tools
• 3rd party, e.g. Tableau, Qlik
• Not optimized for graph data, may require a special connector
• UI for dashboard and report creation with many kinds of viz, in addition to graph viz
• Cater to business users and data analysts
• Departmental, cross-department or enterprise use
Graph Viz Solutions
• 3rd party, incl. Kineviz, Graphistry, Linkurious, …
• Have to support multiple graph models and sources
• Feature UI for exploration or APIs for customizing output and embedding/publishing
• Solutions may cater to business users, analysts or developers
• Small team, departmental or cross-department use
Little technical expertise <--> Most technically involved
Exploration focused <--> Publishing / Consumption focused
Smaller deployments <--> Larger deployments
Graph Analytics: Graph & ML Algorithms
neo4j.com/graph-algorithms-book/
• Pathfinding & Search: Finds optimal paths or evaluates route availability and quality
• Centrality / Importance: Determines the importance of distinct nodes in the network
• Community Detection: Detects group clustering or partition options
• Similarity: Evaluates how alike nodes are
• Link Prediction: Estimates the likelihood of nodes forming a future relationship
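As a small taste of three of these algorithm families, here is a plain-Python sketch on an undirected toy graph. It is not the Neo4j Graph Algorithms library or its API; the function names and the sample edges are assumptions made for illustration. Degree is used as the simplest centrality measure, Jaccard overlap of neighbour sets as a similarity score, and the common-neighbour count as a basic link-prediction score.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_centrality(adj):
    """Centrality / Importance: score each node by its number of neighbours."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def jaccard_similarity(adj, a, b):
    """Similarity: shared neighbours divided by all neighbours of either node."""
    union = adj[a] | adj[b]
    return len(adj[a] & adj[b]) / len(union) if union else 0.0


def common_neighbours(adj, a, b):
    """Link Prediction: more shared neighbours suggests a likelier future link."""
    return len(adj[a] & adj[b])


# Hypothetical sample graph.
adj = build_adjacency([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
print(degree_centrality(adj))
print(jaccard_similarity(adj, "A", "D"))
print(common_neighbours(adj, "A", "D"))
```

Production graph algorithms typically use richer measures (PageRank for centrality, for example), but the inputs are the same neighbour sets shown here.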
Learn More!
Graphs & Cypher in Spark 3.0
SparkCypher & Morpheus
Language Standardization
Working towards industry agreement across vendors
openCypher: Supporting Cypher as an industry-shared language (since 2015)
ISO GQL: Evolving towards a formal language standard
https://openCypher.org
Neo4j Community & Ecosystem
…and many more!!
Thank You!
@prathle