We are proud to announce the release of Neo4j 3.2. This version marks an expansion in global scale, performance and refinement. It signals that the next generation of graph-powered internet applications, whether generating personalized content or finding coordinated malfeasance, will span the globe. This webinar details the themes behind Neo4j 3.2: enterprise scale for global internet applications, refined enterprise governance capabilities, and performance improvements up and down the native graph stack.
1. Neo4j 3.2
Scale, Performance and Governance
for Global Internet Applications
Philip Rathle
VP of Products
Jeff Morris
Head of Product Marketing
May 2017
2. Neo4j: The Graph Database Leader
[Timeline figure, 2000 to 2017: Neo4j milestones and industry firsts]
• First and only declarative query language for property graph
• Invented property graph model
• Extended graph data model to labeled property graph
• First modern open-source commercial graph DB
• Introduced 3rd-gen clustering architecture with causal consistency
• Multi-data center support with network topology awareness
• First cost-based graph query optimizer
• Published O’Reilly book on graph DB
• First native graph DB in 24/7 production
• First visual development environment for graphs
• Launched openCypher as “SQL for graphs” standard
• First database with native graph storage and processing
• Introduced graph DB as a NoSQL category
• V3.2: Scale, performance, governance for global internet apps
• V3.1: Security Foundation for data security and compliance
• First built-in graph ETL in Cypher
3. Neo4j 3.1 in Review
Security
Foundation
Causal
Clustering
State-of-the-Art
Distributed
Architecture for
Graphs
4. RAFT-based architecture
• Continuous availability
• Consensus commits
Seamless load balancing
Drivers Bolt Cluster
Causal consistency
• Tunable ACID-based consistency
• Supports “read your own writes”
• Best model for graph transactions
1000+ heterogeneous clusters
Mix of app servers, large reporting servers,
smaller IoT devices
Neo4j 3.1 in Review:
Causal Clustering: resilient, modern, fault-tolerant architecture
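The consensus-commit idea on this slide can be illustrated with a minimal sketch. It assumes the usual Raft rule that a write is committed once a majority of voting core servers has acknowledged it; the function names are hypothetical and are not part of any Neo4j API.

```python
def quorum_size(core_count: int) -> int:
    """Smallest majority of a cluster with `core_count` voting core servers."""
    if core_count < 1:
        raise ValueError("a cluster needs at least one core server")
    return core_count // 2 + 1


def tolerated_failures(core_count: int) -> int:
    """How many cores can be lost while a majority can still commit."""
    return core_count - quorum_size(core_count)


def is_committed(acknowledgements: int, core_count: int) -> bool:
    """A write counts as committed once a majority has acknowledged it."""
    return acknowledgements >= quorum_size(core_count)


if __name__ == "__main__":
    for cores in (3, 5, 7):
        print(cores, "cores: quorum", quorum_size(cores),
              "tolerates", tolerated_failures(cores), "failures")
```

Under this rule, adding a fourth core to a three-core cluster raises the quorum without raising the number of tolerated failures, which is why odd core counts are the common choice.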
5. Neo4j Enterprise Edition safeguards data and meets
compliance requirements
Multiple users -> flexible authentication options
Active Directory/LDAP or Native users
Role-based authorization
Assign permissions to users and groups
List and terminate running queries
Users can manage their own queries
Admins can manage all queries
Access controls for user-defined procedures
Enables subgraph access control
Enables
Sarbanes-Oxley,
HIPAA, PCI-DSS, et al
Neo4j 3.1 in Review:
Security Foundation
Neo4j Advantage – Security
6. Introducing Neo4j 3.2
May 2017 GA
Enterprise scale
for global
applications
Continuous
improvement in
native performance
Enterprise governance
for the
connected enterprise
7. Enterprise Scale for Global Applications
Causal Clusters can now span data centers
• Clusters can be subdivided into groups and spread
across DCs
• Read-time choice of consistency at global scale:
“Read Any”, “Read-your-own-Writes”
Tiered Subclusters boost performance
• Speeds local reads and writes
• Replica servers pull from nearest
replicas minimizing WAN traffic
Topology-aware stack insulates developers & apps
from the many complexities of clustering
Improved Cloud Delivery via RPM, Azure and AWS EC2
dc1 group
dc2 group
8. New in Neo4j 3.2
Multi-Data Center Support for Global Internet Apps
Support global-scale apps across continental data centers—via a single switch
Each server in a
Global Causal Cluster
is aware of its
role in the topology
Local data-center
load balancing
drives performance
and availability
Local tiered
hierarchies
speed updates
sa group
uk group
us_east group
hk group
9. Groups can include cores or just tiered replicas
Hierarchical Replica Server Updates
[Diagram: core servers (C) and tiers of read replicas (RR)]
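One way to picture hierarchical replica updates is the toy model below. It is only a sketch under stated assumptions: the `Server` type, the group names, and the "hub replica" rule are hypothetical illustrations, not Neo4j's actual upstream selection strategy. A replica pulls from a core in its own group if one exists; otherwise one replica per group acts as the hub that pulls across the WAN, and the remaining local replicas pull from that hub.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Server:
    name: str
    role: str   # "core" or "replica"
    group: str  # hypothetical group label, e.g. "uk" or "hk"


def choose_upstream(me: Server, servers: List[Server]) -> Optional[Server]:
    """Pick the server a read replica pulls updates from (toy rule)."""
    cores = sorted((s for s in servers if s.role == "core"), key=lambda s: s.name)
    local_cores = [s for s in cores if s.group == me.group]
    if local_cores:
        return local_cores[0]          # a core in the same data center
    local_replicas = sorted(
        (s for s in servers if s.role == "replica" and s.group == me.group),
        key=lambda s: s.name,
    )
    hub = local_replicas[0]            # `me` is a replica, so this list is non-empty
    if me != hub:
        return hub                     # stay local: pull from the group's hub
    return cores[0] if cores else None  # only the hub crosses the WAN


def wan_pulls(servers: List[Server]) -> int:
    """Count replicas whose upstream lives in a different group."""
    count = 0
    for s in servers:
        if s.role != "replica":
            continue
        upstream = choose_upstream(s, servers)
        if upstream is not None and upstream.group != s.group:
            count += 1
    return count


if __name__ == "__main__":
    cluster = [
        Server("uk-c1", "core", "uk"), Server("uk-c2", "core", "uk"),
        Server("uk-c3", "core", "uk"), Server("uk-r1", "replica", "uk"),
        Server("hk-r1", "replica", "hk"), Server("hk-r2", "replica", "hk"),
        Server("hk-r3", "replica", "hk"),
    ]
    for s in cluster:
        if s.role == "replica":
            print(s.name, "pulls from", choose_upstream(s, cluster).name)
    print("WAN pulls:", wan_pulls(cluster))
```

In this model the "hk" group generates a single WAN pull regardless of how many replicas it holds, which is the traffic-saving effect the slide describes.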
10. Fast, Local Reads and Writes with
Global Causal Consistency Across the Cluster
Reads occur at the highest speed from a local replica server,
which gets refreshed by local cores
[Diagram: READ and WRITE paths across two groups of core servers (C) and read replicas (RR), including replicas used for analysis]
Writes are written to a local core server, which propagates
the new data to other local cores, and then to remote core servers
11. Global Read-Your-Own-Writes:
Choices at Read time: Immediacy Or Full Consistency
Readers can choose between immediate access to Replica data
or waiting for any pending writes to propagate to the Replica
[Diagram: READ and WRITE paths across two groups of core servers (C) and read replicas (RR), including replicas used for analysis]
Neo4j drivers maintain knowledge of server locations and
transaction IDs so developers and users don’t need to
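The read-time choice described on this slide can be sketched with a small simulation. It assumes a write returns a transaction ID that the client can later present as a token (Neo4j drivers call such tokens bookmarks); the `Core` and `Replica` classes here are hypothetical stand-ins, not driver or server code, and "waiting" is modeled by simply applying the pending transactions.

```python
class Replica:
    """Toy read replica that applies transactions with a delay."""

    def __init__(self):
        self.applied_tx = 0
        self.data = {}
        self._pending = []  # (tx_id, key, value) received but not yet applied

    def receive(self, tx_id, key, value):
        self._pending.append((tx_id, key, value))

    def catch_up(self, up_to=None):
        """Apply pending transactions in order, optionally stopping at `up_to`."""
        self._pending.sort()
        while self._pending and (up_to is None or self._pending[0][0] <= up_to):
            tx_id, key, value = self._pending.pop(0)
            self.data[key] = value
            self.applied_tx = tx_id

    def read(self, key, bookmark=None):
        # Without a bookmark: "read any", return whatever is applied so far.
        # With a bookmark: catch up to that transaction first. A real server
        # would block until the transaction arrives; this toy just applies it.
        if bookmark is not None and self.applied_tx < bookmark:
            self.catch_up(up_to=bookmark)
        return self.data.get(key)


class Core:
    """Toy core server that hands out increasing transaction IDs."""

    def __init__(self, replicas):
        self.last_tx = 0
        self.replicas = replicas

    def write(self, key, value):
        self.last_tx += 1
        for replica in self.replicas:
            replica.receive(self.last_tx, key, value)
        return self.last_tx  # acts as the bookmark for this write


if __name__ == "__main__":
    replica = Replica()
    core = Core([replica])
    bookmark = core.write("name", "Alice")
    print("read any:", replica.read("name"))
    print("read your own writes:", replica.read("name", bookmark=bookmark))
```

The first read favors immediacy and may miss the write; the second trades a wait for the guarantee that the caller sees its own write.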
12. Enterprise Governance
Neo4j is IT friendly
Node Keys: new type of schema constraint
• Tied to labels, nodes can have any number of Node Keys
• Ensure graph integrity by enforcing existence and uniqueness
• Improves data exchange across multiple data sources
Kerberos encrypted-authentication module add-on
• Supports three-tier integration of client, directory
and database
Causal Clustering available on CAPI-Flash hardware
from IBM Power8 via add-on
Better metrics in Query Monitor to reveal query
behavior and resource consumption
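What a Node Key enforces (existence plus uniqueness of a set of properties for a given label) can be modeled in a few lines. This is an in-memory illustration only; the class and exception names are hypothetical, and it does not show how Neo4j implements or declares the constraint.

```python
class NodeKeyViolation(Exception):
    """Raised when a node would break the key constraint."""


class NodeKeyConstraint:
    """Toy Node Key: the listed properties must exist and be unique together."""

    def __init__(self, label, properties):
        self.label = label
        self.properties = tuple(properties)
        self._seen = set()

    def check_and_register(self, labels, props):
        if self.label not in labels:
            return  # the constraint only applies to nodes carrying its label
        missing = [p for p in self.properties if props.get(p) is None]
        if missing:
            raise NodeKeyViolation(f"missing key properties: {missing}")
        key = tuple(props[p] for p in self.properties)
        if key in self._seen:
            raise NodeKeyViolation(f"duplicate key: {key}")
        self._seen.add(key)


if __name__ == "__main__":
    person_key = NodeKeyConstraint("Person", ["firstname", "surname"])
    person_key.check_and_register({"Person"}, {"firstname": "Alice", "surname": "Smith"})
    person_key.check_and_register({"Person"}, {"firstname": "Alice", "surname": "Jones"})
    try:
        person_key.check_and_register({"Person"}, {"firstname": "Alice", "surname": "Smith"})
    except NodeKeyViolation as err:
        print("rejected:", err)
```

Because the key is the combination of properties, two people may share a first name as long as the full key differs, which is what makes such keys useful for matching records from several data sources.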
13. Native Graph Performance Improvements
• Native Label index improves write speed by 30-300%
• Composite indexes supercharge lookup speeds
• Cypher’s depth-query optimization for DISTINCT eliminates repeated traversals through deep levels, which can yield exponential time savings
• New Compiled Cypher runtime in Enterprise Edition to speed common queries by 300%
• Cost-based optimizer replaces rule-based optimizer (which has been deprecated)
• Snappier Neo4j Browser with new, more flexible JavaScript framework
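The composite index bullet above can be pictured as a lookup table keyed by several property values at once. The sketch below is a toy model under that assumption (exact-match lookups on all indexed properties); the class name and methods are hypothetical and say nothing about Neo4j's on-disk index format.

```python
from collections import defaultdict


class CompositeIndex:
    """Toy composite index: maps a tuple of property values to node ids."""

    def __init__(self, properties):
        self.properties = tuple(properties)
        self._entries = defaultdict(list)

    def add(self, node_id, props):
        # Only nodes that carry every indexed property are indexed.
        if all(p in props for p in self.properties):
            key = tuple(props[p] for p in self.properties)
            self._entries[key].append(node_id)

    def seek(self, **values):
        """Exact-match lookup on all indexed properties in one step."""
        key = tuple(values[p] for p in self.properties)
        return list(self._entries.get(key, []))


if __name__ == "__main__":
    index = CompositeIndex(["firstname", "surname"])
    index.add(1, {"firstname": "Alice", "surname": "Smith"})
    index.add(2, {"firstname": "Alice", "surname": "Jones"})
    index.add(3, {"firstname": "Bob"})  # skipped: no surname
    print(index.seek(firstname="Alice", surname="Jones"))
```

A single seek on the combined key avoids looking up one property and then filtering the candidates on the other.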
14. One More Thing!
New Cypher Editor in Neo4j Browser
Syntax Highlighting
Auto Complete for Labels, Relationship
Types, Properties, and Variables
Command Auto Complete
15. Summary
New in Neo4j 3.2 Community Edition
Indexing
• Composite indexes supercharge lookup
speeds
• Native label index improves write speeds
by 30-300%
Cypher query language
• Depth-query optimization for DISTINCT improves “reachability” queries by no longer re-traversing nodes it has already visited
• Cost-based query optimizer now
automatically invoked
Neo4j Browser re-written
• Improved performance & flexibility via
new JavaScript framework
Driver Pack 1.3
• Transaction handling in Bolt driver library
moved to driver
• Cluster management and routing decisions
moved to driver
Deployment tools
• RPM Packages available
• Cloud delivery via Azure and AWS EC2
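The reachability point in the Cypher bullet above can be made concrete with a small comparison. This is a generic graph-traversal sketch, not Cypher's planner: it contrasts enumerating every path up to a depth limit with a breadth-first search that remembers visited nodes. The chain-of-diamonds graph is a made-up example chosen because its path count doubles at every diamond.

```python
from collections import deque


def diamond_chain(k):
    """Build k diamonds in a row: n0 -> (a0 | b0) -> n1 -> (a1 | b1) -> n2 ..."""
    graph = {}
    for i in range(k):
        graph[f"n{i}"] = [f"a{i}", f"b{i}"]
        graph[f"a{i}"] = [f"n{i + 1}"]
        graph[f"b{i}"] = [f"n{i + 1}"]
    return graph


def count_paths(graph, start, max_depth):
    """Count every path of 1..max_depth hops, as a naive expansion would walk."""
    total = 0
    stack = [(start, 0)]
    while stack:
        node, depth = stack.pop()
        if depth == max_depth:
            continue
        for nxt in graph.get(node, []):
            total += 1
            stack.append((nxt, depth + 1))
    return total


def distinct_reachable(graph, start, max_depth):
    """Breadth-first search that never expands a node twice.

    Returns the set of reachable nodes and the number of edges examined.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    visits = 0
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue
        for nxt in graph.get(node, []):
            visits += 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen - {start}, visits


if __name__ == "__main__":
    graph = diamond_chain(10)
    reachable, visits = distinct_reachable(graph, "n0", 20)
    print("paths walked naively:", count_paths(graph, "n0", 20))
    print("distinct nodes:", len(reachable), "edges examined:", visits)
```

Both approaches find the same set of nodes, but the work of the first grows with the number of paths while the second grows with the number of relationships.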
16. Summary
New in Neo4j 3.2 Enterprise Edition
Multi-data center support
• Improves horizontal scaling and fault
tolerance for global applications
• Adds hierarchical subclusters to speed
replication processes while minimizing
WAN traversals
Causal clustering API
• Bolt drivers control transactionality & preferred server for reads & writes across the cluster and data centers
Compiled Cypher runtime
• Common and repetitive queries can be
compiled to dramatically improve
performance
Node Keys
• Impose schema constraints on graph models for existence and data validation
Query monitoring
• API adds many new query metrics
Kerberos encryption
• Add-on module offers Enterprise-caliber
authentication
Clustering on CAPI Flash
• Add-on support on IBM POWER8 Systems
Built to deploy large-scale, mission-critical
graph apps over the Internet
17. Summary
Neo4j 3.2 Highlights
Enterprise scale
for global
applications
Continuous
improvement in
native performance
Enterprise governance
for the connected
enterprise
19. Native Performance
• Native Label index speeds writes
• Composite indexes speed query performance
• Compiled Cypher Runtime for common queries (EE)
• Query depth optimization for DISTINCT
• New JavaScript framework for better flexibility
• Cost-based optimizer is the default
[Architecture diagram of the Neo4j stack. Labels: App or Community Driver, Language Drivers, Cypher HTTP Endpoint, Bolt Endpoint, Custom REST, Cypher Engine, Parser, Cost-based Optimizer, Compiled Runtime (EE), Interpreted Runtime, APOC, Extensions, Custom Functions, Native Graph Engine, In-Memory Page Cache, Native Graph Storage, Indexing, ACID, Fast Write Buffering, CAPI Adapter, Configuration, Data Stores, Logging, Security, High Availability, Monitoring, Command Line Interface, Neo4j Browser, Sync]
Editor's Notes
Over a decade of leadership in the graph space. The seed was planted back in 2000 when our founders invented the property graph model, but it wasn’t until 2010 that we contributed the first GA version, Neo4j 1.0, to the open source community and started building a commercial engine around it. We have had a series of firsts: we introduced Cypher, the first and only declarative language for property graphs, and launched GraphConnect and the O’Reilly book to build out the category. The market rewarded us with commercial success, and by the end of 2015 we had 150 paying customers and 50k monthly downloads. The V3.1 and 3.2 releases make Neo4j ready to develop and deploy mission-critical, internet-based, enterprise graph applications.
“Masterless”
No branched data – robust, durable transactions
Industry-Leading
Quorum-based writes
Six levels of tunable read consistency
Neo4j Enterprise Edition has advanced security features for safeguarding data and meeting compliance requirements.
It can authenticate defined Neo4j users as well as authenticate through Active Directory or OpenLDAP. There is a plug-in extension endpoint for integrating with custom authentication and authorization services.
Neo4j’s role-based access control framework lets you define user privileges and permissions for accessing data at a granular level.
Users can also list and terminate their running queries, and admins have global control over running queries. For deeper analysis, Neo4j includes advanced query logging.
A new mechanism controls access to User-Defined Procedures.
Neo4j also includes a new Security Event log for analyzing and auditing security issues.
Establishes Neo4j as the enterprise standard graph technology