Five Tips for Getting to Production with DataStax Enterprise Graph
© DataStax, All Rights Reserved.
DataStax Enterprise (DSE) Graph
A robust, scale-out graph database that focuses on storing, processing, and acting on highly connected and complex data relationships in real time.
Common DSE Graph Use Cases
• Customer 360
• Personalization
• Recommendations
• Fraud Detection
• Internet of Things
• Asset Management
• Data Integration
What is Customer 360 (C360)?
Integrating data silos and exploring neighborhoods to provide a personalized user experience in real time.
[Diagram: a 360° view of the Customer, surrounded by Location, Social, Orders, Account, Contact, Feedback, Devices, and Channels]
1. Know Your Data Distributions
Ask Yourself...
1. What relationships exist currently or could possibly exist in the data?
2. Which of the identified relationships are important?
3. What is the distribution of those relationships?
[Chart: Email to Customer Distribution. Axes: Count of Emails against Number of Customers, relabeled on the following slide as Number of Edges (Degree).]
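One way to look at the distribution of a relationship is to count, for each degree, how many vertices have that degree. The sketch below does this for a made-up list of (email, customer) edges; the function name `degree_distribution` and the sample data are assumptions for illustration, not part of DSE Graph.

```python
from collections import Counter

def degree_distribution(edges):
    """Map each degree to the number of source vertices that have that degree."""
    degree_per_vertex = Counter(source for source, _ in edges)
    return Counter(degree_per_vertex.values())

# Hypothetical (email, customer) edges; real data would come from your own systems.
email_to_customer = [
    ("a@example.com", "c1"),
    ("b@example.com", "c2"),
    ("shared@example.com", "c1"),
    ("shared@example.com", "c2"),
    ("shared@example.com", "c3"),
]

distribution = degree_distribution(email_to_customer)
for degree, count in sorted(distribution.items()):
    print(f"{count} email(s) connect to {degree} customer(s)")
```

A long tail in this table (a few emails connected to very many customers) is the kind of skew worth knowing about before the data is loaded.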
2. Know Your Access Patterns… As Much as Possible
Data Modeling
“The paradigm shift is that we write our data according to how we are going to read it.”
Nate McCall on the journey of Apache Cassandra during DataStax Accelerate
Relational vs. Cassandra vs. Graph Data Modeling
[Diagram: the order in which Data, Models, and Application are considered under Relational, Cassandra, and Graph data modeling; the arrangement is not recoverable from the extracted text]
Common C360 Questions
• Who is this customer?
• What is their name, location, gender, and age?
• What has this customer recently purchased online or in stores?
• What feedback have they left about those purchases?
• Who is this customer related to?
• How influential is this customer?
[Diagram: Customer surrounded by Location, Social, Orders, Account, Contact, Feedback, Devices, and Channels]
Common C360 Queries
[Slide repeats the six C360 questions listed above next to the same customer diagram]
Conceptual Data Model
[Slide repeats the six C360 questions listed above next to a conceptual data model diagram that is not recoverable from the extracted text]
Logical Data Model
• An entity with a single property and an average branching factor of one is a good indication that the entity should be a property rather than a vertex.
• An entity with a high median branching factor should be considered for modeling as a property rather than a vertex.
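The two rules of thumb above could be turned into a rough check along these lines. This is only a sketch: the function name `suggest_modeling`, the tolerance used for "an average of one", and the `high_median` threshold are all assumptions that would need tuning against real data.

```python
from statistics import mean, median

def suggest_modeling(property_count, branching_factors, high_median=1000):
    """Apply the two heuristics to one entity type.

    property_count: number of properties the entity carries.
    branching_factors: edges per instance of the entity (hypothetical sample).
    high_median: assumed cut-off for a "high" median branching factor.
    """
    average = mean(branching_factors)
    if property_count == 1 and abs(average - 1) < 0.1:
        return "property"
    if median(branching_factors) >= high_median:
        return "consider property"
    return "vertex"

print(suggest_modeling(1, [1, 1, 1, 1]))
print(suggest_modeling(3, [5000, 7000, 9000]))
print(suggest_modeling(4, [2, 3, 5]))
```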
3. Optimize Query Performance
Understand Branching Factor
Traversal time is roughly proportional to the number of edges and vertices visited.
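To get a feel for how quickly the visited count grows, the sketch below assumes an idealized graph in which every vertex has the same branching factor and no vertex is reached twice. Real graphs are messier, so treat the numbers as an upper-bound intuition; `estimated_visits` is a hypothetical helper.

```python
def estimated_visits(branching_factor, depth):
    """Vertices reached within `depth` hops if every vertex fans out equally."""
    return sum(branching_factor ** hop for hop in range(1, depth + 1))

for factor in (2, 10, 100):
    print(f"branching factor {factor}, 3 hops: {estimated_visits(factor, 3)} vertices")
```

Going from a branching factor of 2 to 10 raises the three-hop estimate from 14 to 1,110 vertices, which is why reducing fan-out early in a traversal pays off.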
Filter Vertices out Along the Way
If you know which vertices you are not looking for, avoid walking to them.
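The idea can be sketched with a plain breadth-first walk that takes a predicate and never steps onto rejected vertices. The graph, the vertex names, and the `walk` function are made up for illustration; in a real traversal the same effect comes from filtering as early as possible.

```python
from collections import deque

def walk(graph, start, max_hops, keep=lambda vertex: True):
    """Breadth-first walk that never steps onto vertices rejected by `keep`."""
    visited = {start}
    frontier = deque([(start, 0)])
    while frontier:
        vertex, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for neighbor in graph.get(vertex, []):
            if neighbor not in visited and keep(neighbor):
                visited.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return visited

# Hypothetical adjacency list for one customer.
graph = {
    "customer": ["order1", "order2", "device1"],
    "order1": ["product1"],
    "order2": ["product2"],
    "device1": ["session1", "session2", "session3"],
}

everything = walk(graph, "customer", max_hops=2)
orders_only = walk(
    graph, "customer", max_hops=2,
    keep=lambda v: not v.startswith(("device", "session")),
)
print(len(everything), len(orders_only))
```

Because `device1` is rejected at the first hop, its sessions are never reached at all.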
Pick the Best Starting Point
Consider where your traversal starts: do you walk along fewer edges when you start at the black vertex or the red vertex?
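The difference can be made concrete by counting the edges a breadth-first search examines from each end. In this sketch the "black" vertex is assumed to have many neighbors and the "red" vertex only one; the graph and the `edges_walked` helper are illustrative assumptions.

```python
from collections import deque

def edges_walked(graph, start, target):
    """Count edges examined by a breadth-first search until `target` is reached."""
    seen = {start}
    queue = deque([start])
    walked = 0
    while queue:
        vertex = queue.popleft()
        for neighbor in graph[vertex]:
            walked += 1
            if neighbor == target:
                return walked
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return walked

# Hypothetical undirected graph: "black" has 50 leaf neighbors plus "middle".
graph = {
    "black": [f"leaf{i}" for i in range(50)] + ["middle"],
    "middle": ["black", "red"],
    "red": ["middle"],
}
for i in range(50):
    graph[f"leaf{i}"] = ["black"]

from_black = edges_walked(graph, "black", "red")
from_red = edges_walked(graph, "red", "black")
print(f"from black: {from_black} edges, from red: {from_red} edges")
```

Both searches find the same path, but starting at the low-degree end avoids examining the edges that lead nowhere useful.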
Go Back to the Data Model
Can you optimize the path from black to red by adding a shortcut edge?
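A shortcut edge trades extra writes for a shorter read path. The sketch below derives direct edges from every two-hop path with the given labels; the labels (`placed`, `contains`, `purchased`), the sample edges, and `add_shortcuts` are hypothetical and only show the shape of the idea.

```python
def add_shortcuts(edges, first_label, second_label, shortcut_label):
    """Return direct edges A -> C wherever A -[first]-> B -[second]-> C exists."""
    by_source = {}
    for source, label, target in edges:
        by_source.setdefault((source, label), []).append(target)
    shortcuts = set()
    for source, label, middle in edges:
        if label == first_label:
            for target in by_source.get((middle, second_label), []):
                shortcuts.add((source, shortcut_label, target))
    return sorted(shortcuts)

# Hypothetical (source, label, target) edges.
edges = [
    ("customer1", "placed", "order1"),
    ("order1", "contains", "product1"),
    ("order1", "contains", "product2"),
    ("customer2", "placed", "order2"),
    ("order2", "contains", "product1"),
]

shortcuts = add_shortcuts(edges, "placed", "contains", "purchased")
for edge in shortcuts:
    print(edge)
```

The shortcut has to be kept in sync whenever the underlying edges change, so it is worth adding only for paths that are read often.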
4. Design a Supernode Strategy
What is a supernode?
A vertex with a disproportionately high number of connected edges.
Causes problems such as:
• performance issues
• stability issues
• issues with visualization
• partial or incorrect results
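Candidate supernodes can be found before loading by comparing each vertex's degree with the typical degree. The slide does not define "disproportionately", so the sketch below assumes a cut-off of 100 times the median degree; `find_supernodes`, the `ratio` parameter, and the sample data are all illustrative.

```python
from collections import Counter
from statistics import median

def find_supernodes(edges, ratio=100):
    """Return vertices whose degree is at least `ratio` times the median degree."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    typical = median(degree.values())
    return {vertex: d for vertex, d in degree.items() if d >= ratio * typical}

# Hypothetical data: 500 customers each have their own email address,
# and all of them also share one placeholder address.
edges = []
for i in range(500):
    edges.append((f"customer{i}", f"user{i}@example.com"))
    edges.append((f"customer{i}", "unknown@example.com"))

supernodes = find_supernodes(edges)
print(supernodes)
```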
What should you do if you find a supernode?
[Flowchart; the connections between decisions and outcomes are not recoverable from the extracted text]
Decision points:
• What does the distribution look like? (right-skewed / left-skewed)
• Is the data valid? (yes / no)
• Is your data the sole source of truth? (yes / no)
Outcomes:
• Try to model this vertex as a property.
• Consider a supernode optimization strategy.
• Validate and clean data on ingestion.
Supernode Strategy: Add an Edge Index
Edge indexes, also called vertex-centric indexes, are local to a vertex and make it possible to find and traverse only the edges we need without scanning all edges.
To leverage the index, filter on the edge during the traversal.
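The effect of a vertex-local index can be sketched in plain Python: each vertex keeps its edges grouped by an edge property, so a filtered lookup touches only the matching group. This is a toy model, not the DSE Graph index API; the `Vertex` class and the `year` edge property are assumptions.

```python
from collections import defaultdict

class Vertex:
    """Toy vertex that keeps a local index of its edges by an edge property."""

    def __init__(self, name):
        self.name = name
        self.edges = []                          # every outgoing edge
        self.edges_by_year = defaultdict(list)   # vertex-local index

    def add_edge(self, target, year):
        edge = (target, year)
        self.edges.append(edge)
        self.edges_by_year[year].append(edge)

    def scan(self, year):
        """Without an index: look at every edge and keep the matches."""
        return [target for target, edge_year in self.edges if edge_year == year]

    def lookup(self, year):
        """With the index: go straight to the edges for that year."""
        return [target for target, _ in self.edges_by_year.get(year, [])]

store = Vertex("store")
for i in range(1000):
    store.add_edge(f"order{i}", year=2010 + i % 10)

recent = store.lookup(2019)
print(len(recent), "of", len(store.edges), "edges touched")
```

Both methods return the same orders; the lookup simply avoids reading the other 900 edges.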
Supernode Strategy: Get More Specific
Make your vertices more granular by including another field in the ID of the vertex.
[Figure: side-by-side comparison ("vs.") of the two vertex designs]
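A quick way to see what a more specific ID buys is to compare the largest vertex degree under each ID scheme. The sketch below assumes customer records with a city and a postal code; the data, the field choice, and `max_degree` are hypothetical.

```python
from collections import Counter

# Hypothetical (customer, city, postal_code) records.
customers = [
    ("c1", "Austin", "78701"),
    ("c2", "Austin", "78702"),
    ("c3", "Austin", "78701"),
    ("c4", "Austin", "78703"),
    ("c5", "Dallas", "75201"),
]

def max_degree(records, vertex_id):
    """Largest number of customers attached to any single location vertex."""
    degree = Counter(vertex_id(record) for record in records)
    return max(degree.values())

coarse = max_degree(customers, lambda r: (r[1],))         # ID = city
granular = max_degree(customers, lambda r: (r[1], r[2]))  # ID = city + postal code
print(f"city only: {coarse}, city + postal code: {granular}")
```

Adding the second field spreads the same edges across more vertices, so no single vertex collects all of them.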
Supernode Strategy: Traverse In, but not Out
If you have a known supernode, but the vertex is too complex to be a property, you can avoid performance issues by only traversing into the vertex to gather information.
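In traversal terms this means a supernode may be reached and read, but its own edges are never expanded. The sketch below assumes the supernodes are already known and passed in as a set; the graph, the vertex names, and `walk_without_leaving_supernodes` are illustrative.

```python
from collections import deque

def walk_without_leaving_supernodes(graph, start, max_hops, supernodes):
    """Breadth-first walk that may arrive at a supernode but never walks out of it."""
    visited = {start}
    frontier = deque([(start, 0)])
    while frontier:
        vertex, hops = frontier.popleft()
        if hops == max_hops or vertex in supernodes:
            continue  # reached, so it can be read, but its edges are not expanded
        for neighbor in graph.get(vertex, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return visited

# Hypothetical graph in which "country:US" links to almost every customer.
graph = {
    "customer1": ["country:US", "order1"],
    "country:US": [f"customer{i}" for i in range(2, 1000)],
    "order1": ["product1"],
}

reached = walk_without_leaving_supernodes(
    graph, "customer1", max_hops=2, supernodes={"country:US"}
)
print(sorted(reached))
```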
5. Embrace a Multi-Model Approach
Using the Right Tool for the Problem
[Chart: DSE Core, DSE Search, DSE Graph, and DSE Analytics plotted by Query Complexity (Simple to Complex) against Query Latency (p99); latency tick labels as extracted: Offline, Fast, Human, Fast]
The following slides place each C360 query on this chart in turn:
• Query: Who is this customer?
• Query: What has this customer recently purchased?
• Query: Who is this customer related to?
• Query: How influential is this customer?
Final Multi-Model Approach
[Chart: all four queries placed on the same chart; their positions are not recoverable from the extracted text]
DataStax Graph for Labs
• “Model Once” Support: Because solving complex graph problems requires more than just a graph database.
• Inherits DSE Core Benefits: Fast, scalable, and highly available for mission-critical applications on-premises and in the cloud.
• Built by the Experts: Designed and tested by the core contributors to Apache Cassandra and Apache TinkerPop.
DOWNLOAD: Visit downloads.datastax.com/#labs to get the new Graph Engine.
Questions?
Thank you

Best Practices for Getting to Production with DataStax Enterprise Graph

  • 1. Five Tips for Getting to Production with DataStax Enterprise Graph. © DataStax, All Rights Reserved.
  • 2. DataStax Enterprise (DSE) Graph: A robust, scale-out graph database that focuses on storing, processing, and acting on highly connected and complex data relationships in real time.
  • 3. Common DSE Graph Use Cases: Customer 360, Personalization, Recommendations, Fraud Detection, Internet of Things, Asset Management, Data Integration.
  • 4. What is Customer 360 (C360)? Integrating data silos and exploring neighborhoods to provide a personalized user experience in real time. (Diagram: a 360° view of the Customer connecting Location, Social, Orders, Account, Contact, Feedback, Devices, and Channels.)
  • 5. Tip 1: Know Your Data Distributions
  • 6-9. Ask Yourself... (1) What relationships exist currently or could possibly exist in the data? (2) Which of the identified relationships are important? (3) What is the distribution of those relationships? (Diagram, slides 6-7: Customer connected to Location, Social, Orders, Account, Contact, Feedback, Devices, and Channels. Chart, slides 8-9: Email to Customer Distribution, plotting Count of Emails against Number of Customers, relabeled on slide 9 as Number of Edges (Degree).)
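The distribution question above can be answered with a simple degree count before any graph is loaded. The following is a minimal sketch, not taken from the deck: the function name `degree_distribution` and the sample edges are hypothetical, and the input is assumed to be a list of (customer, email) pairs.

```python
from collections import Counter


def degree_distribution(edges):
    """Map each degree to the number of target vertices that have that degree."""
    # Degree of a target vertex = number of edges pointing at it.
    degree_per_vertex = Counter(target for _source, target in edges)
    # Histogram: how many vertices share each degree value.
    return Counter(degree_per_vertex.values())


# Hypothetical (customer_id, email) edges, made up for illustration.
edges = [
    ("c1", "a@example.com"),
    ("c2", "a@example.com"),
    ("c3", "b@example.com"),
    ("c4", "shared@example.com"),
    ("c5", "shared@example.com"),
    ("c6", "shared@example.com"),
]

distribution = degree_distribution(edges)
for degree in sorted(distribution):
    print(f"degree {degree}: {distribution[degree]} email(s)")
```

A long tail in this histogram (a few emails shared by very many customers) is an early hint that some vertices may become supernodes.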
  • 10. Tip 2: Know Your Access Patterns… As Much as Possible
  • 11. Data Modeling: “The paradigm shift is that we write our data according to how we are going to read it.” (Nate McCall on the journey of Apache Cassandra, during DataStax Accelerate)
  • 12. Relational vs. Cassandra Data Modeling. (Diagram: Relational: Data → Models → Application; Cassandra: Application → Models → Data.)
  • 13. Relational vs. Cassandra vs. Graph Data Modeling. (Diagram: Relational: Data → Models → Application; Cassandra: Application → Models → Data; Graph: Models, Data, Application.)
  • 14. Common C360 Questions: Who is this customer? What is their name, location, gender, and age? What has this customer recently purchased online or in stores? What feedback have they left about those purchases? Who is this customer related to? How influential is this customer? (Diagram: Customer connected to Location, Social, Orders, Account, Contact, Feedback, Devices, and Channels.)
  • 15. Common C360 Queries: Who is this customer? What is their name, location, gender, and age? What has this customer recently purchased online or in stores? What feedback have they left about those purchases? Who is this customer related to? How influential is this customer? (Diagram: Customer connected to Location, Social, Orders, Account, Contact, Feedback, Devices, and Channels.)
  • 16. Conceptual Data Model: Who is this customer? What is their name, location, gender, and age? What has this customer recently purchased online or in stores? What feedback have they left about those purchases? Who is this customer related to? How influential is this customer?
  • 17. Logical Data Model: An entity with a single property and an average branching factor of one is a good indication that the entity should be a property rather than a vertex. An entity that has a high median branching factor should also be considered for modeling as a property rather than a vertex.
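The two logical-model heuristics above can be expressed as a small check over measured branching factors. This is only a sketch under stated assumptions: the function `modeling_hint`, the tolerance of 0.1 around an average of one, and the `high_median` cutoff of 1000 are all hypothetical values chosen for illustration, not thresholds given in the deck.

```python
from statistics import mean, median


def modeling_hint(property_count, branching_factors, high_median=1000):
    """Suggest how to model an entity from its property count and edge counts.

    `high_median` and the 0.1 tolerance are illustrative assumptions.
    """
    average = mean(branching_factors)
    middle = median(branching_factors)
    # Heuristic 1: a single property and roughly one edge per instance.
    if property_count == 1 and abs(average - 1.0) <= 0.1:
        return "property"
    # Heuristic 2: a high median branching factor.
    if middle >= high_median:
        return "consider property"
    return "vertex"


print(modeling_hint(1, [1, 1, 1, 1]))
print(modeling_hint(3, [5000, 8000, 12000]))
print(modeling_hint(4, [2, 3, 5]))
```

The cutoffs would need to be tuned against the actual distributions measured for the data set.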
  • 18. Tip 3: Optimize Query Performance
  • 19-20. Understand Branching Factor: Traversal time is roughly proportional to the number of edges and vertices visited.
  • 21. Filter Vertices out Along the Way: If you know which vertices you are not looking for, avoid walking to them.
  • 22. Pick the Best Starting Point: Consider where your traversal starts. Do you walk along fewer edges when you start at the black vertex or the red vertex?
  • 23-24. Go Back to the Data Model: Can you optimize the path from black to red by adding a shortcut edge?
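The effect of branching factor and of filtering along the way can be seen by counting edges in a toy traversal. The sketch below uses a plain Python adjacency dictionary rather than DSE Graph or Gremlin; the function `edges_walked`, the vertex names, and the graph itself are hypothetical.

```python
def edges_walked(graph, start, hops, keep=lambda vertex: True):
    """Count edges traversed in a walk of `hops` steps from `start`.

    Vertices rejected by `keep` are dropped from the frontier, so their
    outgoing edges are never walked.
    """
    frontier = [start]
    walked = 0
    for _ in range(hops):
        next_frontier = []
        for vertex in frontier:
            for neighbor in graph.get(vertex, []):
                walked += 1
                if keep(neighbor):
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return walked


# Hypothetical graph: the store vertex has a much higher branching factor.
graph = {
    "customer": ["order1", "order2", "store"],
    "order1": ["item1", "item2"],
    "order2": ["item3"],
    "store": [f"other_order{i}" for i in range(100)],
}

unfiltered = edges_walked(graph, "customer", hops=2)
filtered = edges_walked(graph, "customer", hops=2, keep=lambda v: v != "store")
print(unfiltered, filtered)
```

Dropping the one high-degree vertex removes most of the work, which is the reason to filter as early in the traversal as possible.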
  • 25. Tip 4: Design a Supernode Strategy
  • 26. What is a supernode? A vertex with a disproportionately high number of connected edges. It causes problems such as performance issues, stability issues, issues with visualization, and partial or incorrect results.
  • 27-32. What should you do if you find a supernode? (Decision flowchart. Questions: What does the distribution look like? (right-skewed or left-skewed); Is your data the sole source of truth? (yes or no); Is the data valid? (yes or no). Outcomes: Try to model this vertex as a property; Consider a supernode optimization strategy; Validate and clean data on ingestion.)
  • 33. Supernode Strategy: Add an Edge Index. Edge indices, also called vertex-centric indices, are local to a vertex and make it possible to find and traverse only the edges we need without scanning all edges. To leverage the index, filter on the edge during the traversal.
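The idea behind an edge index can be illustrated with a per-vertex lookup table keyed on an edge property. This is a conceptual sketch in plain Python, not the DSE Graph schema API: the `Vertex` class, the `year` edge property, and the sample data are hypothetical.

```python
from collections import defaultdict


class Vertex:
    """A vertex that keeps a local index of its edges by the edge's year."""

    def __init__(self, name):
        self.name = name
        self.edges = []                    # every outgoing edge, unindexed
        self.by_year = defaultdict(list)   # vertex-local index on "year"

    def add_edge(self, target, year):
        edge = (target, year)
        self.edges.append(edge)
        self.by_year[year].append(edge)

    def scan(self, year):
        """Without an index: examine every edge of the vertex."""
        hits = [edge for edge in self.edges if edge[1] == year]
        return hits, len(self.edges)

    def lookup(self, year):
        """With the index: touch only the edges in the matching bucket."""
        hits = self.by_year.get(year, [])
        return hits, len(hits)


store = Vertex("store")
for i in range(1000):
    store.add_edge(f"order{i}", 2000 + i % 10)

scan_hits, scan_examined = store.scan(2005)
index_hits, index_examined = store.lookup(2005)
print(scan_examined, index_examined)
```

Both calls return the same edges, but the indexed lookup examines only the matching bucket, which is why the traversal has to filter on the edge property for the index to help.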
  • 34. Supernode Strategy: Get More Specific. Make your vertices more granular by including another field in the ID of the vertex. (Diagram: two vertex models compared.)
  • 35. Supernode Strategy: Traverse In, but not Out. If you have a known supernode, but the vertex is too complex to be a property, you can avoid performance issues by only traversing into the vertex to gather information.
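The "Get More Specific" strategy can be checked numerically by comparing the largest vertex degree under a coarse ID and under a more granular one. The sketch below is an assumption-laden illustration: the `max_degree` helper, the (order, country, city) tuples, and the choice of city as the extra ID field are all hypothetical.

```python
from collections import Counter


def max_degree(edges, vertex_id):
    """Largest number of edges attached to any single vertex.

    `vertex_id` maps an edge to the ID of the vertex it connects to.
    """
    return max(Counter(vertex_id(edge) for edge in edges).values())


# Hypothetical (order_id, country, city) edges, made up for illustration.
cities = ["Austin", "Boston", "Chicago", "Denver"]
edges = [(f"order{i}", "US", cities[i % 4]) for i in range(1000)]

coarse = max_degree(edges, lambda edge: edge[1])               # ID = country
granular = max_degree(edges, lambda edge: (edge[1], edge[2]))  # ID = (country, city)
print(coarse, granular)
```

Adding the second field spreads the same edges over more vertices, so no single vertex carries the full load.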
  • 36. Tip 5: Embrace a Multi-Model Approach
  • 37. Using the Right Tool for the Problem. (Chart: DSE Core, DSE Analytics, DSE Search, and DSE Graph positioned by Query Complexity, from Simple to Complex, and Query Latency (p99), labeled Offline, Fast, and Human Fast.)
  • 38. Query: Who is this customer? (Same chart, with this query plotted.)
  • 39. Query: What has this customer recently purchased? (Same chart, with this query added.)
  • 40. Query: Who is this customer related to? (Same chart, with this query added.)
  • 41. Query: How influential is this customer? (Same chart, with this query added.)
  • 42. Final Multi-Model Approach. (Same chart, with all four queries plotted: Who is this customer? What has this customer recently purchased? Who is this customer related to? How influential is this customer?)
  • 43. DataStax Graph for Labs
  • 44. DataStax Graph for Labs. “Model Once” Support: Because solving complex graph problems requires more than just a graph database. Inherits DSE Core Benefits: Fast, scalable, and highly available for mission-critical applications on-prem and in the cloud. Built by the Experts: Designed and tested by the core contributors to Apache Cassandra and TinkerPop.
  • 45. DataStax Graph for Labs
  • 46. DOWNLOAD: Visit downloads.datastax.com/#labs to get the new Graph Engine.