2. Safe Harbour Roadmap Disclaimer
The information presented here is Neo4j, Inc. confidential and does
not constitute, and should not be construed as, a promise or
commitment by Neo4j to develop, market or deliver any particular
product, feature or function.
Neo4j reserves the right to change its product plans or roadmap at
any time, without obligation to notify any person of such changes.
The timing and content of Neo4j’s future product releases could
differ materially from the expectations discussed herein.
Neo4j Inc. All rights reserved 2023
4. A Modern, Enterprise Data Platform
Graph Applications (System of Record Applications)
Application Stack
Real-time Messaging & Processing (Clickstream, IoT, CDC, etc.)
Developer: Intelligent apps
Data Analyst: Query and analyze
Biz Analyst: Visual analytics
Data Scientist: Algorithms and features
Data Engineer: Get clean, useful data
ML Engineer: ML Ops
Data Platform
AI/ML Ops
BI Platforms
Graph Analytics & Analytical Apps (Real-time applications, visualizations and algorithms)
Powered by
5. A Data Platform designed for the Cloud
Graph-as-a-Service
● Fully-managed SaaS
● Consumption-based pricing
● Cloud-native
● Self-service deployment
● No access to underlying infrastructure and systems
Cloud Managed Services
● White-glove managed service by Neo4j experts
● Fully customizable deployment model and service levels
● Operate in own data centers or Virtual Private Cloud
Self-hosted
● For private and hybrid cloud, or on-prem
● Bring your own license
● Full control of your environment
● Run in any cloud, in your account
6. Neo4j Database built for Operational and Analytical Workloads
Graph Transactions, Storage & Querying: Intelligent Operational Systems
Graph Analytics, ML, & Data Science: Better Predictions for Analytics
7. Neo4j Database built for Operational and Analytical Workloads
Enterprise Trust & Security
Runs on Cloud (Azure, Google and AWS) and on-premises
Scale: Autonomous Clustering & Composite Databases (Fabric)
Hybrid Workloads
Native Graph Architecture
Powers Graph Data Science
Comprehensive Toolset & Ecosystem connectivity
Large Community growing 80% yoy
Supports all data shapes & relationships
9. Neo4j 5 Continuous Release Support Model
Release timeline: 5.0.0 (frequent features and fixes) → 5.LTS.0 (LTS in June 2024, final features released) → 5.LTS+n.0 (fixes ONLY) → End of Support, Nov 2027
Upgrade to the latest version to receive the latest fixes from Support
- New features and fixes released in each Minor release
- Frequent Minor releases aligned between Aura and self-managed
- Ensure two-way migration
- Supports any-to-any upgrade, e.g. 5.0 → 5.4
- Long Term Support for a minimum period of 3.5 years
10. The Journey so far
5.0
- Asyncio Python functions
- Python pandas DataFrame support
- ElementID object and function
- Notifications
- Graph Pattern Matching with inline relationship patterns and label expressions
- COUNT {}
- EXISTS {}
- RANGE and POINT indexes
- FULLTEXT indexes for lists
- Faster k-hop queries
- SLOTTED runtime in CE
- Incremental offline import
- Write operations in PIPELINED runtime
- Autonomous Clustering
- Composite databases
- Immutable privileges
- Server Side Routing enabled by default
- neo4j-admin interface
- CREATE DATABASE from URI
- New backup with differential load
- Strict validation configuration
- Namespaces for metrics
- log4j controlled logs
- Cypher Shell logging & impersonation
5.1
- Online analyzer in FULLTEXT queries
- Id type support for incremental import
- Trigram analyzer for TEXT indexes
- Server Side Routing support for reads
5.2
- graph.propertiesByName() for composite databases
- DRYRUN command for Autonomous Clustering
- Helm Charts for Neo4j 5
5.3
- New version of shortest path
- Subquery support for COUNT and EXISTS for non-updating queries
- Multiple ID columns for CSV import
- Improved searches by ElementID
- Improved routing procedures
- Extended SHOW DATABASES
- Extended min password length
- Automated quarantine databases
5.4
- Warnings for queries with relationship type expressions that can never be fulfilled
- IntersectionNodeByLabelsScan operator
- Command completion scripts for neo4j and neo4j-admin
- New licensing acceptance for trial and commercial
5.5
- Simplified query API in the official language drivers
- Plan count store lookup for relationship endpoints
- ALTER DATABASE can update the topology from a single primary to many primaries
- neo4j-admin copy can perform in-place compact/clean-up of a database
- neo4j-admin server validate-config
5.6
- Support for bookmarks and CALL {} IN TRANSACTIONS in HTTP Cypher Transactional API
- NotificationCode class including severity level, category, title and code
- COLLECT subqueries
- Server tags
- SHOW SETTINGS Cypher command
5.7
- Relationship property uniqueness and relationship key constraints
- Error and status output for CALL IN TRANSACTIONS
- ALTER DATABASE…WAIT
- Improved shared cache for multidatabases
- :param and :params in Cypher Shell with Cypher map based syntax
5.8
- db.logs.query.annotation_data_as_json_enabled to configure the output format of annotation data in the query log to be JSON
- Automated clustering upgrade
- Improved configuration file validation
5.9
- Quantified path patterns extended to match against repetitions of complex subpaths
- Combine time-based transaction log pruning strategy with max size limit
5.10
- Property type constraint: node and relationship properties can now be restricted to be of a specific type
- Property type constraints include support for list types
- <SCALAR TYPE> NOT NULL variants to the Type Predicate Expression
- Clusters can be configured to automatically enable new members that join the cluster
5.11
- HNSW Vector Search Indexes
- Expand property type constraints with support for union types and Type Predicate Expressions to allow for multiple types to be checked at once
- SHOW DATABASE now reports databases starting and stopping
- New index metrics: <prefix>.index.<indexType>.(queried|populated)
- :sysinfo command to Cypher Shell
5.12
- Improved argument validation in vector index procedures
- db.nameFromElementId to resolve the database name for a given element id in Fabric
- Track memory of de-referenced nodes and relationships in ProduceResult
- :sysinfo command to Cypher Shell
11. Graph Pattern Matching
Part of the upcoming GQL standard.
It extends the expressivity of graph navigation
(the ASCII art expression after MATCH statements).
Simpler alternative syntax
to navigate and traverse
your graph.
We are working on:
✔ Quantified Path Pattern
✱ Shortest Path for QPPs
✱ MATCH modes: Repeatable Elements
MATCH (:Station { name:'Denmark Hill' })<-[:CALLS_AT]-(d:Stop)
( (:Stop)-[:NEXT]->(:Stop) ){1,3}
(a:Stop)-[:CALLS_AT]->(:Station { name:'Clapham Junction' })
RETURN d.departs AS departureTime, a.arrives AS arrivalTime
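To make the {1,3} quantifier in the query above concrete, here is a minimal Python sketch of what it asks for: between 1 and 3 repetitions of a NEXT hop between the departure stop and the arrival stop. The stop ids, the intermediate station and the helper names are hypothetical toy data, not part of the presentation, and the sketch ignores Cypher's rule that a relationship cannot be traversed twice in one match (the toy data has no cycles).

```python
# Hypothetical toy graph: which station each stop calls at, and NEXT links between stops.
CALLS_AT = {"s1": "Denmark Hill", "s2": "Peckham Rye", "s3": "Clapham Junction"}
NEXT = {"s1": ["s2"], "s2": ["s3"], "s3": []}


def quantified_paths(start, min_hops, max_hops):
    """Return every NEXT-path from `start` with min_hops..max_hops hops."""
    results = []
    frontier = [[start]]
    for hops in range(1, max_hops + 1):
        # Extend every partial path by one more NEXT hop.
        frontier = [path + [nxt] for path in frontier for nxt in NEXT.get(path[-1], [])]
        if hops >= min_hops:
            results.extend(frontier)
    return results


def journeys(origin, destination, min_hops=1, max_hops=3):
    """Emulate the MATCH: (origin)<-CALLS_AT-(d) (NEXT){min,max} (a)-CALLS_AT->(destination)."""
    found = []
    for d, station in CALLS_AT.items():
        if station != origin:
            continue
        for path in quantified_paths(d, min_hops, max_hops):
            a = path[-1]
            if CALLS_AT.get(a) == destination:
                found.append((d, a, len(path) - 1))  # (departure stop, arrival stop, hops)
    return found


print(journeys("Denmark Hill", "Clapham Junction"))
```

Lowering max_hops to 1 returns no journeys in this toy data, because the only route needs two NEXT hops.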
12. GQL Standard
GQL is the upcoming ISO standard for
Graph query languages, inspired by
Cypher.
The most 'user-desired' GQL features
have been in development for some
time (Graph Pattern Matching is an
example).
14. Very Large Graphs: LDBC Trillion Entity Graph
LDBC social forum data set: 3 billion users, 1110 forums, scaled horizontally across 1129 servers
● Full dataset is 280 TB with 1 trillion relationships
● 1128 forum shards (250 GB each), 1 person shard (850 GB), 3 Neo4j Fabric processors (16 vCPU, 64 GB RAM)
● Each forum shard contains 900 million relationships and 182 million nodes
● Person shard contains 3 billion people and 16 billion relationships between them
● Query response times range from 12 to 66 ms
https://github.com/neo4j/trillion-graph
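The headline figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes the 900 million relationships and 182 million nodes apply to each forum shard, and that GB and TB are decimal units; the variable names are illustrative only.

```python
# Figures taken from the slide; per-forum-shard reading is an assumption.
FORUM_SHARDS = 1128
PERSON_SHARDS = 1
RELS_PER_FORUM_SHARD = 900_000_000
NODES_PER_FORUM_SHARD = 182_000_000
PERSON_SHARD_RELS = 16_000_000_000
PERSON_SHARD_NODES = 3_000_000_000
FORUM_SHARD_GB = 250
PERSON_SHARD_GB = 850

total_shards = FORUM_SHARDS + PERSON_SHARDS
total_rels = FORUM_SHARDS * RELS_PER_FORUM_SHARD + PERSON_SHARD_RELS
total_nodes = FORUM_SHARDS * NODES_PER_FORUM_SHARD + PERSON_SHARD_NODES
total_gb = FORUM_SHARDS * FORUM_SHARD_GB + PERSON_SHARD_GB

print(f"shards:        {total_shards}")
print(f"relationships: {total_rels:,}")
print(f"nodes:         {total_nodes:,}")
print(f"storage:       {total_gb / 1000:.2f} TB")
```

Under these assumptions the totals come out at about 1.03 trillion relationships and roughly 283 TB, in line with the rounded "1 trillion relationships" and "280 TB" quoted above, and the shard count matches the 1129 servers.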
15. Unbounded Scalability To Harness All Data
Elastic policy-driven scale up and down on demand for extreme performance, guaranteed availability with consistency
Autonomous Clustering: Easy, Automated Horizontal Scale-Out
● Scale clusters to 100s of machines
● High Availability with automated failover
● 0 downtime upgrades
Fabric: Federated Queries and Sharded Graphs
● Query multiple business graphs
● Chain queries for sophisticated real-time analysis
● Hybrid cloud queries
17. What if…
…you can…
Run faster queries
Use less memory
Get faster property access
Make improvements faster
Christian Gloor, CC BY-SA 2.0
18. Introducing Block Format
Faster queries: improvements for read queries, and even
benefits for write performance.
Memory efficient: optimized colocation of data, with
improvements on systems where the working set doesn’t fit
into memory. Fewer pages need to be loaded to serve a query.
Faster property access: properties are stored in blocks
with their Node/Relationship, reducing the amount of
pointer chasing required to access them.
• Data is colocated
• Uses dynamic records and tree structures to
handle “growing” data
Small store
Dynamic store
21. How CDC works
Tx Logs
App e.g. Demo Go App
CALL Cdc.query("A3V16ZaL6lU7mppHLFkWrl…A0A")
Neo4j Driver e.g. Go Driver
Tx1
Poll every 500ms
{
  "id": "A3V16ZaL6lU7mppppHLFkWrl…A0AA",
  "txId": 12,
  "seq": 0,
  "metadata": {
    "executingUser": "neo4j",
    "authenticatedUser": "neo4j",
    "captureMode": "FULL",
    "connectionClient": "127.0.0.1:51320",
    "serverId": "e605bd8f",
    "connectionType": "bolt",
    "connectionServer": "127.0.0.1:51316",
    "txStartTime": "2023-03-03T11:58:30.429Z",
    "txCommitTime": "2023-03-03T11:58:30.526Z"
  },
  "event": {
    "elementId": "4:b7e35973-0aff-42fa-873b-5de31868cb4a:1",
    "keys": {
      "userId": "1001",
      "name": "John",
      "lastName": "Doe"
    }
  }
}
Application updates the graph
e.g. adds a new user John Doe.
Event written to the transaction log.
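As a rough illustration of what a polling client might do with one such change event, the Python sketch below parses the payload shown on this slide and pulls out the fields an application would typically act on. The helper names (parse_utc, summarize) are hypothetical, the "id" value is kept elided exactly as on the slide, and no database connection is made; a real client would receive the event through a driver rather than from a string.

```python
import json
from datetime import datetime, timedelta

# Change event copied from the slide; the "id" is elided there and left as-is.
EVENT_JSON = """
{
  "id": "A3V16ZaL6lU7mppppHLFkWrl…A0AA",
  "txId": 12,
  "seq": 0,
  "metadata": {
    "executingUser": "neo4j",
    "captureMode": "FULL",
    "txStartTime": "2023-03-03T11:58:30.429Z",
    "txCommitTime": "2023-03-03T11:58:30.526Z"
  },
  "event": {
    "elementId": "4:b7e35973-0aff-42fa-873b-5de31868cb4a:1",
    "keys": {"userId": "1001", "name": "John", "lastName": "Doe"}
  }
}
"""


def parse_utc(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp with a trailing 'Z' (works on older Pythons too)."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def summarize(event: dict) -> dict:
    """Extract the fields a consumer is likely to need from one change event."""
    meta = event["metadata"]
    started = parse_utc(meta["txStartTime"])
    committed = parse_utc(meta["txCommitTime"])
    return {
        "id": event["id"],
        "tx": (event["txId"], event["seq"]),
        "user": meta["executingUser"],
        "element": event["event"]["elementId"],
        "keys": event["event"]["keys"],
        # Whole milliseconds between transaction start and commit.
        "commit_ms": (committed - started) // timedelta(milliseconds=1),
    }


summary = summarize(json.loads(EVENT_JSON))
print(summary)
```

For the event on the slide, the start and commit timestamps are 97 ms apart.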
23. Cypher Parallel Runtime
Distribute long-running analytical query plans across multiple concurrent threads.
Dramatic speedup of analytical queries.
Users can configure the number of cores available to the Parallel Runtime, leaving the remaining cores for transactional workloads and other runtimes.
Works best on servers with low concurrency.
Parallel Runtime drastically reduces the response times for analytical and other long-running read-only queries.
Particularly suited to ‘graph global’ analytical queries, i.e., those that do not specify a starting node and result in large graph traversals.
Typically benefits long-running, analytical queries in low-concurrency environments.
Fast-running queries are unlikely to perform better (and may run slower) on Parallel Runtime.
Can be used on servers with mixed (analytical & transactional) workloads.
CYPHER runtime = parallel
MATCH (:Station { name: 'Denmark Hill' })<-[:CALLS_AT]-(d:Stop)
((:Stop)-[:NEXT]->(:Stop))+
(a:Stop)-[:CALLS_AT]->(:Station { name: 'Clapham Junction' })
RETURN count(*)
29. Tests and use cases
CYPHER runtime = parallel
MATCH (u:User)-[:POSTED]->(q:Question)-[:TAGGED]->(t:Tag)
WHERE $datetime <= q.createdAt
< $datetime + duration({months: $months})
RETURN t.name AS name, count(DISTINCT u) AS users,
max(q.score) AS score,
round(avg(count{(q)<-[:ANSWERED]-()}),2) AS avgAnswers,
count(DISTINCT q) AS questions
ORDER BY questions DESC LIMIT 20
Benchmark Query: "User_Engagement_Per_Tag"
Runtime              Time     Speedup
pipelined (8 cpu)    58781
parallel (8 cpu)     6229     9x
parallel (16 cpu)    3194     18x
parallel (32 cpu)    1671     35x
parallel (48 cpu)    1562     38x
parallel (64 cpu)    806      73x
parallel (96 cpu)    574      103x
On Stack Overflow, which tags correlate with the best (highest scoring) user engagement?
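The Speedup row in the benchmark results can be recomputed as the pipelined (8 cpu) figure divided by each parallel figure. The Python sketch below does that with the numbers from the slide. The slide does not state the unit of the measurements, which does not matter for a ratio, and the function name is illustrative.

```python
# Measurements copied from the benchmark slide (unit not stated there).
PIPELINED_8_CPU = 58781
PARALLEL = {8: 6229, 16: 3194, 32: 1671, 48: 1562, 64: 806, 96: 574}


def speedups(baseline, timings):
    """Speedup of each parallel run relative to the pipelined baseline."""
    return {cpus: baseline / t for cpus, t in timings.items()}


ratios = speedups(PIPELINED_8_CPU, PARALLEL)
for cpus, ratio in ratios.items():
    print(f"parallel ({cpus} cpu): {ratio:.1f}x")
```

The recomputed ratios agree with the slide to within rounding; the 96-cpu ratio comes out near 102.4 against the 103x shown. The step from 32 to 48 cpus adds little in this run, while 64 and 96 cpus continue to scale.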
Recommendations
For anyone between the age of 25-65, find VPs of
Marketing who live in the Bay Area and Seattle and
work for one of the largest 20 tech firms by
revenue.
Supply Chain
Find food retailers supplied by wholesalers that
receive eggs or poultry from farms within 5km of
reported avian flu cases.
IAM / CyberSecurity
Find users who have logged into a host that has
communicated in the last 24 hours with a host that
has run an executable with an unsigned certificate.
30. Join us for NODES 2023
Thursday, October 26, 2023
a free 24-hour online conference for developers, data scientists,
architects, and data analysts across the globe. Whether you’re a
beginner or an expert, you’ll find inspiring content on the latest
innovations in graph technology for your applications and ML
models, so you can build your skills and stay ahead of the curve.
Save your seat: https://neo4j.com/event/nodes-2023