DataStax: Datastax Enterprise - The Multi-Model Platform

DataStax Enterprise: The Multi-Model Data Platform

1 Multi-Model Defined
2 The evolution of Models
3 Graph Databases Overview
4 The DataStax Approach to Graph Databases
5 DataStax in the Open Source Graph Community
6 Conclusion
2© 2015. All Rights Reserved.

Multi Model Defined
© 2015. All Rights Reserved. 3
Most database management systems are
organized around a single data model that
determines how data can be organized,
stored, and manipulated. In contrast, a multi-
model database is designed to support
multiple data models against a single,
integrated backend.
Source - https://en.wikipedia.org/wiki/Multi-model_database

The evolution of Models
Application
Application
Application
(OLTP and OLAP)
Data Access Abstraction
RDBMS / SQL Polyglot
Persistence
Multi
Model

DSE Multi Model
Multiple (polygot) Models exposed via
cohesive interaction mechanisms, (APIs)
for OLTP and OLAP workloads, (mixed workload)
with a unified persistence layer, (Apache Cassandra)
providing GR, always on characteristics
within a TCO efficient data base platform
for a variety of use cases
Cassandra

©2015 DataStax Conﬁdential. Do not distribute without consent.
6
DataStax Enterprise – Wide Row / MVs
C1 C2
MV1c1
MV1c2
Agg1c1
Agg2c1

7
DataStax Enterprise - JSON
Inserting JSON data is easy
Reading JSON data is easy
Finding JSON errors is easy

Introduction to Graph Databases

What is a Graph Database?
•  Store, manage and query highly connected data
•  Data stored as nodes (vertices), edges and properties to represent
a domain model
•  By explicitly embedding relationships in the data model, you store
a more logical business model using the natural data access
language Gremlin
•  Think of a graph as 
a pre-joined 
database

Choose the Right Model to Fit your Business Needs
© 2015 DataStax, All Rights Reserved. Company Confidential 10
DSE Wide Row
- Build and maintain models using
CQL’s DDL features
- Super-fast CRUD at scale with CQL
DML features
- Good option for denormalized
schemas with high-throughput
requirements
- Perfect fit for IoT applications that
require consuming enormous
amounts of data with specific
data retrieval requirements:
product catalogs, high-volume
messaging systems, collecting
and storing sensor data
DSE Graph
- Flexible schema that is easy to modify
and maintain with Gremlin
- Clearly maps business semantics to a
logical model for easy maintenance
and understanding
- Ideal model for highly-connected data
models
- Perfect fit for social-engagement models,
recommendation engines and IT
network / device management
- Update and query the graph in real-time
with easy-to-learn open-source
Gremlin language
- Good option for transitioning from slow
RDBMS 3NF models with lots of
JOINs

What is DataStax Enterprise (DSE) Graph?
©2015 DataStax
•  Highly scalable graph database for modern web, mobile, and IoT applications that
need to manage highly connected and heterogeneous data
•  Built-in support for real-time search, and analytic graph queries via tight integration
with the DSE platform
•  A property graph model native inside the DataStax product, engineered specifically
for Cassandra
•  Store & find relationships in data fast and easy on huge graphs

DSE Graph: Built-in Scalability
©2015 DataStax
•  Scale out Graph vs. Scale up only
•  Graph partitioning built on Cassandra’s scale-out architecture
•  Graph index structures integrated into Cassandra
•  Domain model maps more naturally to data model, allowing for greater
understanding between business and IT
•  Traverse millions of relationships in a short period of
time, faster than modeling the data in RDBMS
•  Flexible data model that can be easily adapted to
business changes

DSE Graph: Integrated Search, Analytics and Ops
©2015 DataStax
•  Real-time traversals over complex-structured graph data
•  Integrates with DSE Search to mix search with traversal queries
•  Integrates with DSE Analytics and Spark to support OLAP and breadth-first
graph traversals
•  Iterative graph analytics like PageRank or other centrality measures
•  Reporting and aggregates over graph data
•  Integrated with DataStax OpsCenter

Graph Database Use Cases

Additional Graph Use Cases
360 Degree View of Your Customer
•  Collect massive amounts of data point about your customer
•  Data collected from social networks, web analytics, digital ads, mobile devices, CRM
•  Bring heterogeneous customer data together into DSE Graph
•  Uncover buying patterns and customer behaviors
•  Graph becomes a master data hub for customer data
•  Use the graph customer hub to build better products
for your target customers
•  Keep the customers you already have with
customer intelligence
Customer
360 View
Social
Store
Sensors
Email
CRM
Mobile
Weblogs

Additional Graph Use Cases
IT Network and Device Management
•  Allows IT to monitor, manage and protect corporate networks and devices
(laptops, iPads, mobile phones, etc.)
•  Requires understanding of network topologies and relationships between
devices, interfaces, equipment, people, services …
•  A traditional RDBMS would require
expensive query-time joins
•  A graph model intrinsically knows how
to traverse the topology because the
relationships are already stored
•  This makes for quick & easy recognition
of problems, root cause analysis and
event correlation

DataStax in the Graph Open Source
Community
TinkerPop / Gremlin

DataStax Role in TinkerPop
©2015 DataStax
•  DataStax utilizes the TinkerPop framework for the DSE Graph
product
•  DataStax will contribute to the TinkerPop community and is
heavily invested in the success of the Gremlin language
•  DataStax will provide resource guides, documentation,
samples and training on building and querying graphs
with Gremlin, using DSE Graph as the graph engine

Gremlin
The open-source standard graph query language
•  DataStax contributes and supports the Apache TinkerPop community, along with the Gremlin Graph query
language
•  g.V.hasLabel('person').as('a').out('knows').as('b').select('a','b').by('age')
.by('age')
"for all people in the graph, give me the ages of the people on each end of a friendship relationship“
•  g.V.has('name','marko').out('knows').out('mother').outE('worksFor').
has('time',between(2001,2002)).inV.name 
 
“what are the names of the places that marko's friends' mothers worked for from 2001 to 2002” 
Deep traversal == multiple levels of query-time joins in RDBMS

Recommendation Query – RDBMS vs. Graph
©2015 DataStax
SELECT TOP (5) [t14].[ProductName]
FROM (SELECT COUNT(*) AS [value],
[t13].[ProductName]
FROM [customers] AS [t0]
CROSS APPLY (SELECT [t9].[ProductName]
FROM [orders] AS [t1]
CROSS JOIN [order details] AS [t2]
INNER JOIN [products] AS [t3]
ON [t3].[ProductID] = [t2].[ProductID]
INNER JOIN [orders] AS [t5]
ON [t5].[OrderID] = [t4].[OrderID]
LEFT JOIN [customers] AS [t6]
ON [t6].[CustomerID] = [t5].[CustomerID]
CROSS JOIN ([orders] AS [t7]
ON [t9].[ProductID] = [t8].[ProductID])
WHERE NOT EXISTS(SELECT NULL AS [EMPTY]
FROM [orders] AS [t10]
ON [t12].[ProductID] = [t11].[ProductID]
WHERE [t9].[ProductID] = [t12].[ProductID]
AND [t10].[CustomerID] = [t0].[CustomerID]
AND [t11].[OrderID] = [t10].[OrderID])
AND [t6].[CustomerID] <> [t0].[CustomerID]
AND [t2].[OrderID] = [t1].[OrderID]
AND [t4].[ProductID] = [t3].[ProductID]
AND [t8].[OrderID] = [t7].[OrderID]) AS [t13]
WHERE [t0].[CustomerID] = N'ALFKI'
GROUP BY [t13].[ProductName]) AS [t14]
ORDER BY [t14].[value] DESC
g.V('customerId','ALFKI').as('customer')
.out('ordered').out('contains').out('is').as('products')
.in('is').in('contains').in('ordered').except('customer')
.out('ordered').out('contains').out('is').except('products')
.groupCount().cap().orderMap(T.decr)[0..<5].productName
VS.

DataStax: Datastax Enterprise - The Multi-Model Platform

More Related Content

What's hot

Viewers also liked

Similar to DataStax: Datastax Enterprise - The Multi-Model Platform

More from DataStax Academy

Recently uploaded

DataStax: Datastax Enterprise - The Multi-Model Platform