1. The Path To Success With Graph Database and Data Science
© 2023 Neo4j, Inc. All rights reserved.
Jesus Barrasa
RVP Field Engineering at Neo4j
2. Neo4j Graph Data Platform
Users: Business Users, Developers, Data Scientists, Data Analysts
Enterprise Ready
Data Science and Analytics (Graph Data Science, OLAP)
Tools, algorithms, and integrated ML framework
Data Science & MLOps, AutoML, Integrations
Discovery & Visualization
Low-code querying, data modeling and exploration tools
Neo4j Bloom, BI Connectors, Neo4j Browser
Application Development Tools & Frameworks
Tools and APIs for rapid prototyping and development
Language interfaces
Graph Query Language
Cypher and GQL as the lingua franca for graphs
Native Graph Database (Graph Database, OLTP)
The core component of the Neo4j platform
Transactions, Analytics, Data Consolidation, Contextualization
Runs Anywhere
Run by yourself or as DBaaS by Neo4j, in the cloud or on premises
Ecosystem & Integrations
Rich set of connectors to plug into existing data ecosystems
Data Connectors, Data Sources
3. Neo4j Graph Database Capabilities
• Engineering Expertise: >1000 person-years investment
• First mover advantage: maturity, most enterprise deployments
• Largest graph community: growing at 80%+ annually
Capabilities: Hybrid Workloads, Native Graph Architecture, Powers Graph Data Science, Rich Toolset, Enterprise Trust, Runs Anywhere
4. Native Graph Architecture
Native Graph Storage, Native Graph Processing
• No mismatch
• Data integrity / ACID
• Schema flexible
• 1000x faster than relational
• K-Hop now 10-1000x faster than version 4
Fabric (Composite DB)
• Federation of scaled out shards
• Instant composite database
Autonomous Clustering
• Elastic scale-out for high throughput
• 100s of machines across clusters
Data integrity and high speed also hold in scaled-out deployments
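"K-Hop" on this slide refers to queries that collect everything reachable within k relationships of a starting node. The following is a minimal sketch in plain Python, assuming an undirected graph held as an adjacency dict. The function name `k_hop_neighbors` and the sample data are illustrative and not part of Neo4j.

```python
from collections import deque


def k_hop_neighbors(adjacency, start, k):
    """Return the nodes reachable from `start` in at most k hops (start excluded)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    seen.discard(start)
    return seen


# Hypothetical sample graph, stored as node -> list of neighbors
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b", "e"],
    "e": ["d"],
}

print(k_hop_neighbors(GRAPH, "a", 2))
```

In Cypher the same idea is usually written as a variable-length pattern such as `(a)-[*1..2]-(b)`.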
5. Hybrid Workload Duality
Intelligent Applications
Graph Transactions
- Transactions
- Security
- Performance & Scalability
- ACID Consistency
Graph Analytics & Data Science
- Intelligent Modeling
- Extensive & Supported Algo Library
- Scalable
- Graph Visualization
- Graph Transformations
6. Powers Neo4j Graph Data Science
Graph Data Science: Analytics, Feature Engineering, Data Exploration
Machine Learning: TensorFlow, KNIME, Python
Project your graph for in-memory analytics
● Unparalleled analytical processing
● With 60+ algorithms for predictive analytics
● And a pipeline to supervised AI/ML models
● Making AI smarter!
7. Developer Productivity: Rich tooling and easy onramp
Unified Workspace (AuraWorkspace)
• Visualize and explore your data
• Query editor and results visualizer
• Code-free data loader and modeler (data importer)
Ops Manager
8. Plugs into your data and development ecosystem
• Neo4j BI Connector
• Apache Spark Connector
• Apache Kafka Connector
• Data Warehouse Connector
Java, Python, .NET, JavaScript, Go
9. Enterprise-Grade: Security and Trust Built In
• Single Sign-On
• Secure Development Practices
• Dedicated VPC
• Role- & Schema-Based Access Control
• Encryption (At-Rest, In-Transit, and Intra Cluster)
• SOC 2 Type 1
10. Run Anywhere: self managed, or by Neo4j
Managed by Neo4j
● Real-time Performance at Scale
● Automatic Upgrades, Patches, Backups
● Scale on Demand, No Downtime
● High Availability
● Multi Cloud, Any Region
● Enterprise-grade Security
● Simple Capacity-Based Pricing
Self-Managed
● Full administrative control
● On-premises or via cloud marketplace
● Fit where cloud isn’t appropriate (e.g. special compliance scenarios)
● Easy migration to AuraDB
11. Forward looking investments
Theme: the first-choice and primary database that graph-powers any application
Developer Experience
• Complete multi-cloud availability: AuraDB on Azure in addition to GCP, AWS
• Making graph ubiquitous with GQL compliance
• Programmatic management and monitoring with APIs for AuraDB
• Solidifying Neo4j as the data store of record: CDC + next-gen Kafka connector
Performance at Scale
• Analytic step-up performance with Parallel Cypher Queries
• Improved mem-to-storage ratio / Lower TCO with Freki next-gen storage
• Even more autonomous clustering with declarative server management
Operational Trust
• Better monitoring and tuning with query analyzer in Neo4j Ops Manager
• Integrated observability with AuraDB metrics and log streaming
• Customer managed security in AuraDB with customer managed encryption keys and customer managed RBAC
12. Neo4j Graph Data Science
13. Graph Structure Improves Data Science Outcomes
What’s important? (Prioritization)
• Who has the most connections?
• Who has the highest PageRank?
• Who is an influencer?
What’s unusual? (Anomaly & Fraud Detection)
• Where is a community forming?
• What are the group dynamics?
• What’s unusual about this data?
What’s next? (Predictions)
• What’s the most common path?
• Who is in the same community?
• What relationship will form?
[Figure: example graph with relationship types Plays, Lives_in, In_sport, Likes, Fan_of, Plays_for, Knows]
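The first two prioritization questions on this slide correspond to degree centrality and PageRank. The sketch below shows both in plain Python under simplifying assumptions: the edge list `KNOWS` is made-up sample data, degree treats relationships as undirected, and `pagerank` is a basic power iteration with a damping factor of 0.85. It is not how the Neo4j library implements these algorithms.

```python
def degree_centrality(edges):
    """Count connections per node, treating each edge as undirected."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed edges (u -> v)."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for u, v in edges:
        out_links[u].append(v)
    count = len(nodes)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {n: (1 - damping) / count + damping * dangling / count for n in nodes}
        for u in nodes:
            for v in out_links[u]:
                new_rank[v] += damping * rank[u] / len(out_links[u])
        rank = new_rank
    return rank


# Hypothetical "Knows" relationships
KNOWS = [("ann", "bob"), ("cat", "bob"), ("dan", "bob"), ("bob", "ann")]

degree = degree_centrality(KNOWS)
ranks = pagerank(KNOWS)
print(max(degree, key=degree.get), max(ranks, key=ranks.get))
```

Degree only counts direct connections, while PageRank also weighs how important the connecting nodes are, which is why the two questions can have different answers on larger graphs.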
14. And created Neo4j Graph Data Science: Eliminate Pain & Optimize Data Science Workflows with the data you already have
Eliminates Pain
• Complex join operations
• Mining Multiple Tables
• Tedious Manual Approximations
• Brute Force Comparisons
• Fractured Data
Optimizes Data Science Flows
• 95% reduction in computation time
• 500x faster than open source libraries
Improves Customer Outcomes
• 20-30% improvement in model performance
• 600% improvement in traffic
• $5 Million of additional fraud detected
• 3x better churn predictability
• 5x reduction in factory production lead time
15. We invest in four key areas
Data Scientists: Built by data scientists, for data scientists
> Native Python Client
> Apache Arrow integration
> Unified ML pipelines
Better Predictions: The best graph data science and ML engine
> 65+ Graph algorithms & embeddings
> Graph native ML Pipelines
> Vertex AI & SageMaker Integrations
Ecosystem: Seamlessly works with your data stack and pipeline
> Apache Spark & Kafka Connectors
> Native BI Connector
> Data Warehouse Connector
> GNN library support
Production Ready: Go to production with speed, scale, and security
> Compatible with all major clouds
> Enterprise Scale & Security
> Deploy anywhere
16. With The Largest Catalog of Graph Algorithms
• Pathfinding & Search
• Centrality & Importance
• Community Detection
• Supervised Machine Learning
• Heuristic Link Prediction
• Similarity
• Graph Embeddings
• …and more
Graph algorithms are a set of instructions that visit the nodes of a graph to analyze the relationships in connected data.
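To make "visit the nodes of a graph" concrete, here is a minimal sketch of one simple community-style algorithm, connected components, in plain Python. It assumes an undirected graph in which every node appears as a key of the adjacency dict. The function name and the `SOCIAL` sample data are illustrative only.

```python
def connected_components(adjacency):
    """Group the nodes of an undirected graph into connected components."""
    unvisited = set(adjacency)
    components = []
    while unvisited:
        start = min(unvisited)  # deterministic choice of the next start node
        unvisited.remove(start)
        stack = [start]
        component = {start}
        while stack:
            node = stack.pop()
            # Visit each neighbor once and pull it into the current component
            for neighbor in adjacency[node]:
                if neighbor in unvisited:
                    unvisited.remove(neighbor)
                    component.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Hypothetical sample graph: two groups and one isolated node
SOCIAL = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob"],
    "dan": ["eve"],
    "eve": ["dan"],
    "fay": [],
}

print(connected_components(SOCIAL))
```

Each node is visited once and each relationship is examined from both ends, so the work grows linearly with the size of the graph.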
17. And Full Support Across the Entire ML Lifecycle
Lifecycle stages: Data Collection & Preparation, Exploratory Data Analysis, Feature Engineering, Model Training & Tuning, Model Evaluation & Selection, Model Deployment
Supporting capabilities:
• Drivers, Connectors, Fast Import/Export
• Graph Queries, Algorithms, and Visualization
• Graph Embeddings & Algorithms
• Predict APIs, Model/Graph Catalog Operations, Connectors
Approaches: Graph Native ML Pipelines; Unsupervised Graph Algorithms; Graph Features -> External ML Pipelines
18. And made it seamless for all ecosystems and pipelines
[Figure: pipeline diagram with stages Ingest, Store, Process, Machine Learning, and BI & Visualizations around Graph Database and Graph Data Science (Analytics, Feature Engineering, Data Exploration). Components shown: Business Applications & Existing Systems, Files (unstructured, structured), Apache Kafka, Cloud Functions, PubSub, AWS Lambda, Cloud Storage, DataProc, TensorFlow, KNIME, Python, Neo4j Bloom]
19. Our Visualizations Make Analysis Easy to Understand
• View the most well connected and influential nodes
• Recommendations from shared user interactions and associations
20. What’s in it for you:
● Improve model accuracy by 30%
● Simplify processes and remove headaches
● More projects into production without additional hiring
Neo4j Graph Data Science: Analytics, Feature Engineering, Data Exploration
Queries & Search, Machine Learning, Visualization
21. Customer Case Study: Fraud Detection
Correctly identify account holders committing fraud
Results:
● 300% increase in fraud detection
● 10% true positive escalations (industry standard < 1%)
● Reduced false positive escalations
● 150% increase in payment flow
22. How to get started…
1. Knowledge Graphs
Find the patterns you’re looking for in connected data
2. Graph Algorithms
Identify associations, anomalies, and trends using unsupervised machine learning.
3. Graph Native Machine Learning
Learn features in your graph that you don’t even know are important yet using embeddings.
Predict links, labels, and missing data with in-graph supervised ML models.
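As a rough illustration of what predicting links means, the sketch below scores unconnected pairs by how many neighbors they share. This is the common-neighbors heuristic, a much simpler technique than the supervised in-graph models the slide refers to. The function name and the `FRIENDS` data are hypothetical.

```python
from itertools import combinations


def common_neighbor_scores(adjacency):
    """Score each unconnected pair of nodes by the number of neighbors they share."""
    neighbors = {node: set(adjacent) for node, adjacent in adjacency.items()}
    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already linked, nothing to predict
        shared = len(neighbors[u] & neighbors[v])
        if shared:
            scores[(u, v)] = shared
    return scores


# Hypothetical undirected friendship graph
FRIENDS = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "cat", "dan"],
    "cat": ["ann", "bob", "dan"],
    "dan": ["bob", "cat"],
}

print(common_neighbor_scores(FRIENDS))
```

Pairs with higher scores are better candidates for a relationship that may form next. A supervised model would typically combine several such signals as features.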
23. What’s New in Graph Data Science
24. New: Algos & Embeddings
• HashGNN Embedding: Faster approach than GNNs for knowledge graphs
• KMeans: Cluster data based on properties like graph embeddings
• Leiden Algorithm: Fast and scalable modularity-based community detection
[Figure: Leiden Algorithm. Image courtesy of: Traag, V.A., Waltman, L. & van Eck, N.J.]
[Figure: K-means Clustering. Image courtesy of: javatpoint.com]
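To show what clustering nodes on an embedding property involves, here is a minimal k-means sketch (Lloyd's algorithm) in plain Python. It assumes each node already has a small numeric embedding vector. The `EMBEDDINGS` values are invented, and seeding the centroids with the first k points is a simplification chosen to keep the example deterministic.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on equal-length numeric vectors; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids


# Hypothetical 2-dimensional node embeddings
EMBEDDINGS = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2)]

assignment, centroids = kmeans(EMBEDDINGS, k=2)
print(assignment)
```

Nodes that end up with the same cluster index have similar embeddings, which often means they play similar roles or sit in similar neighborhoods of the graph.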
25. New: ML Pipelines
• Autotuning: Find optimal hyperparameters to improve model performance
• Multilayer Perceptrons (MLPs): Fully connected neural networks now available for Link Prediction and Node Classification
26. New: GNN Support
• Graph Sampling: sample a representative subgraph from a larger graph for training complex models
• Graph Export: use our projections in other graph ML libraries like Deep Graph Library (DGL), PyG, and TensorFlow GNN
[Figure: image courtesy of Google Cloud]
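One common way to sample a subgraph is a random walk with restarts, followed by keeping only the relationships between sampled nodes. The sketch below illustrates that idea in plain Python. It is an assumption-laden example, not the library's sampling procedure: the function name, the parameters, and the ring-shaped sample graph are all hypothetical.

```python
import random


def random_walk_sample(adjacency, start, target_size,
                       restart_probability=0.1, seed=42, max_steps=10_000):
    """Collect nodes visited by a random walk with restarts, then return the induced subgraph."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    visited = {start}
    current = start
    for _ in range(max_steps):
        if len(visited) >= target_size:
            break
        neighbors = adjacency.get(current, [])
        if not neighbors or rng.random() < restart_probability:
            current = start  # restarting keeps the sample centered on the start node
            continue
        current = rng.choice(neighbors)
        visited.add(current)
    # Keep only relationships whose endpoints were both sampled
    return {n: [m for m in adjacency.get(n, []) if m in visited] for n in visited}


# Hypothetical ring graph with 10 nodes
RING = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}

sample = random_walk_sample(RING, start=0, target_size=4)
print(sorted(sample))
```

The restart probability trades breadth for locality: higher values keep the sample close to the start node, lower values let the walk wander further.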
27. The Neo4j Graph Data Platform
Application Layer: Digital Twin, Recommendation, Fraud Detection, Cybersecurity, …
Graph Database (Transactions) and Graph Data Science (Analytics), with Integrated AI/Machine Learning
[Figure: platform diagram. Surrounding components: Other Data Stores, Data Integrations & Connectors, Admin, Cypher, Drivers & APIs, Dev Tools, Query, Browser, GraphQL, Analytics & AI/Machine Learning Pipelines]
• Flexible Graph Schema
• Performance, Reliability & Integrity
• Scale-Up & Scale-Out Architecture
• Development Tools Breadth
• Enterprise Ecosystem
28. Continue your graph journey
Connect with passionate graphistas
Free online training and certification
• dev.neo4j.com/learn
• dev.neo4j.com/datasets
Graph expert group - The Ninjas
• dev.neo4j.com/ninjas
Connect with the community:
• dev.neo4j.com/chat
• dev.neo4j.com/forum
• dev.neo4j.com/newsletter
Next developer events
• Live Streams - Weekly & Online
• Local Meetups neo4j.com/events
29. Meet the Neo4j Ninjas: Masters of Graphs
Ninjas are: Active graph bloggers, presenters, GitHub contributors, professors, user group leaders, and researchers, all sharing their graph expertise
Benefits: Ninjas benefit from exclusive access to Neo4j experts, VIP event experience, special giveaways and much more
Interested? For more information visit:
30. Other Neo4j Resources
APOC Documentation
• APOC is a great plugin to level up your Cypher
• This documentation outlines different commands one could use
• Link to APOC documentation
Neo4j Cypher Manual
• The Cypher manual can be used to get more information about Cypher commands
• Link to Cypher manual
Neo4j Graph Data Science Documentation
• Neo4j Graph Data Science documentation is a great reference to see which algorithms to use
• Shows how to use different algorithms
• Link to Graph Data Science documentation
Neo4j Driver Manual
• The driver manual describes the official drivers that are supported by Neo4j
• Link to Neo4j driver manual
Cypher Style Guide
• The Cypher style guide provides recommendations for building clean, easy to read Cypher queries
• Link to Cypher style guide
Arrows App
• The Arrows app allows one to design a graph without using Cypher
• Link to Arrows app
Cypher Cheat Sheet
• This page gives quick examples of how to write different queries within Cypher
• Link to Cypher cheat sheet
GraphGists
• GraphGists has many different use cases and examples for specific industries
• Link to GraphGists
Neo4j Sandbox
• The Neo4j Sandbox provides a quick deployment of a Neo4j server
• It does not require a download
• Comes with example projects
• Link to Neo4j Sandbox
31. THANK YOU
Share feedback at slido.com
#GraphSummitMunich2023