DataStax Enterprise: The Multi-Model Data Platform
1 Multi-Model Defined
2 The evolution of Models
3 Graph Databases Overview
4 The DataStax Approach to Graph Databases
5 DataStax in the Open Source Graph Community
6 Conclusion
2© 2015. All Rights Reserved.
Multi Model Defined
© 2015. All Rights Reserved. 3
Most database management systems are
organized around a single data model that
determines how data can be organized,
stored, and manipulated. In contrast, a multi-
model database is designed to support
multiple data models against a single,
integrated backend.
Source - https://en.wikipedia.org/wiki/Multi-model_database
The evolution of Models
© 2015. All Rights Reserved. 4
Application
Application
Application
(OLTP and OLAP)
Data Access Abstraction
RDBMS / SQL Polyglot
Persistence
Multi
Model
DSE Multi Model
© 2015. All Rights Reserved. 5
Multiple (polygot) Models exposed via
cohesive interaction mechanisms, (APIs)
for OLTP and OLAP workloads, (mixed workload)
with a unified persistence layer, (Apache Cassandra)
providing GR, always on characteristics
within a TCO efficient data base platform
for a variety of use cases
Cassandra
©2015 DataStax Confidential. Do not distribute without consent.
 6
DataStax Enterprise – Wide Row / MVs
C1 C2
MV1c1
MV1c2
Agg1c1
Agg2c1
©2015 DataStax Confidential. Do not distribute without consent.
 7
DataStax Enterprise - JSON
Inserting JSON data is easy
Reading JSON data is easy
Finding JSON errors is easy
Introduction to Graph Databases
© 2015. All Rights Reserved. 8
What is a Graph Database?
©2015 DataStax Confidential. Do not distribute without consent.
•  Store, manage and query highly connected data
•  Data stored as nodes (vertices), edges and properties to represent
a domain model
•  By explicitly embedding relationships in the data model, you store
a more logical business model using the natural data access
language Gremlin
•  Think of a graph as

a pre-joined

database
Choose the Right Model to Fit your Business Needs
© 2015 DataStax, All Rights Reserved. Company Confidential 10
DSE Wide Row
- Build and maintain models using
CQL’s DDL features
- Super-fast CRUD at scale with CQL
DML features
- Good option for denormalized
schemas with high-throughput
requirements
- Perfect fit for IoT applications that
require consuming enormous
amounts of data with specific
data retrieval requirements:
product catalogs, high-volume
messaging systems, collecting
and storing sensor data
DSE Graph
- Flexible schema that is easy to modify
and maintain with Gremlin
- Clearly maps business semantics to a
logical model for easy maintenance
and understanding
- Ideal model for highly-connected data
models
- Perfect fit for social-engagement models,
recommendation engines and IT
network / device management
- Update and query the graph in real-time
with easy-to-learn open-source
Gremlin language
- Good option for transitioning from slow
RDBMS 3NF models with lots of
JOINs
What is DataStax Enterprise (DSE) Graph? 
©2015 DataStax 
•  Highly scalable graph database for modern web, mobile, and IoT applications that
need to manage highly connected and heterogeneous data
•  Built-in support for real-time search, and analytic graph queries via tight integration
with the DSE platform
•  A property graph model native inside the DataStax product, engineered specifically
for Cassandra
•  Store & find relationships in data fast and easy on huge graphs
DSE Graph: Built-in Scalability 
©2015 DataStax 
•  Scale out Graph vs. Scale up only
•  Graph partitioning built on Cassandra’s scale-out architecture
•  Graph index structures integrated into Cassandra
•  Domain model maps more naturally to data model, allowing for greater
understanding between business and IT
•  Traverse millions of relationships in a short period of
time, faster than modeling the data in RDBMS
•  Flexible data model that can be easily adapted to
business changes
DSE Graph: Integrated Search, Analytics and Ops
©2015 DataStax 
•  Real-time traversals over complex-structured graph data
•  Integrates with DSE Search to mix search with traversal queries
•  Integrates with DSE Analytics and Spark to support OLAP and breadth-first
graph traversals
•  Iterative graph analytics like PageRank or other centrality measures
•  Reporting and aggregates over graph data
•  Integrated with DataStax OpsCenter
Graph Database Use Cases
© 2015. All Rights Reserved. 14
Additional Graph Use Cases
360 Degree View of Your Customer
•  Collect massive amounts of data point about your customer
•  Data collected from social networks, web analytics, digital ads, mobile devices, CRM
•  Bring heterogeneous customer data together into DSE Graph
•  Uncover buying patterns and customer behaviors
•  Graph becomes a master data hub for customer data
•  Use the graph customer hub to build better products
for your target customers
•  Keep the customers you already have with
customer intelligence
Customer
360 View
Social
Store
Sensors
Email
CRM
Mobile
Weblogs
Additional Graph Use Cases
IT Network and Device Management
•  Allows IT to monitor, manage and protect corporate networks and devices
(laptops, iPads, mobile phones, etc.)
•  Requires understanding of network topologies and relationships between
devices, interfaces, equipment, people, services …
•  A traditional RDBMS would require
expensive query-time joins
•  A graph model intrinsically knows how
to traverse the topology because the
relationships are already stored
•  This makes for quick & easy recognition
of problems, root cause analysis and
event correlation
DataStax in the Graph Open Source
Community
TinkerPop / Gremlin
© 2015. All Rights Reserved. 17
DataStax Role in TinkerPop
©2015 DataStax 
•  DataStax utilizes the TinkerPop framework for the DSE Graph
product
•  DataStax will contribute to the TinkerPop community and is
heavily invested in the success of the Gremlin language
•  DataStax will provide resource guides, documentation,
samples and training on building and querying graphs
with Gremlin, using DSE Graph as the graph engine
Gremlin
The open-source standard graph query language
•  DataStax contributes and supports the Apache TinkerPop community, along with the Gremlin Graph query
language
•  g.V.hasLabel('person').as('a').out('knows').as('b').select('a','b').by('age')
.by('age')
"for all people in the graph, give me the ages of the people on each end of a friendship relationship“
•  g.V.has('name','marko').out('knows').out('mother').outE('worksFor').
has('time',between(2001,2002)).inV.name



“what are the names of the places that marko's friends' mothers worked for from 2001 to 2002”

Deep traversal == multiple levels of query-time joins in RDBMS
Recommendation Query – RDBMS vs. Graph
©2015 DataStax 
SELECT TOP (5) [t14].[ProductName]
FROM (SELECT COUNT(*) AS [value],
[t13].[ProductName]
FROM [customers] AS [t0]
CROSS APPLY (SELECT [t9].[ProductName]
FROM [orders] AS [t1]
CROSS JOIN [order details] AS [t2]
INNER JOIN [products] AS [t3]
ON [t3].[ProductID] = [t2].[ProductID]
CROSS JOIN [order details] AS [t4]
INNER JOIN [orders] AS [t5]
ON [t5].[OrderID] = [t4].[OrderID]
LEFT JOIN [customers] AS [t6]
ON [t6].[CustomerID] = [t5].[CustomerID]
CROSS JOIN ([orders] AS [t7]
CROSS JOIN [order details] AS [t8]
INNER JOIN [products] AS [t9]
ON [t9].[ProductID] = [t8].[ProductID])
WHERE NOT EXISTS(SELECT NULL AS [EMPTY]
FROM [orders] AS [t10]
CROSS JOIN [order details] AS [t11]
INNER JOIN [products] AS [t12]
ON [t12].[ProductID] = [t11].[ProductID]
WHERE [t9].[ProductID] = [t12].[ProductID]
AND [t10].[CustomerID] = [t0].[CustomerID]
AND [t11].[OrderID] = [t10].[OrderID])
AND [t6].[CustomerID] <> [t0].[CustomerID]
AND [t1].[CustomerID] = [t0].[CustomerID]
AND [t2].[OrderID] = [t1].[OrderID]
AND [t4].[ProductID] = [t3].[ProductID]
AND [t7].[CustomerID] = [t6].[CustomerID]
AND [t8].[OrderID] = [t7].[OrderID]) AS [t13]
WHERE [t0].[CustomerID] = N'ALFKI'
GROUP BY [t13].[ProductName]) AS [t14]
ORDER BY [t14].[value] DESC
g.V('customerId','ALFKI').as('customer') 
.out('ordered').out('contains').out('is').as('products') 
.in('is').in('contains').in('ordered').except('customer') 
.out('ordered').out('contains').out('is').except('products') 
.groupCount().cap().orderMap(T.decr)[0..<5].productName
VS.
Thank you

DataStax: Datastax Enterprise - The Multi-Model Platform

  • 1.
    DataStax Enterprise: TheMulti-Model Data Platform
  • 2.
    1 Multi-Model Defined 2The evolution of Models 3 Graph Databases Overview 4 The DataStax Approach to Graph Databases 5 DataStax in the Open Source Graph Community 6 Conclusion 2© 2015. All Rights Reserved.
  • 3.
    Multi Model Defined ©2015. All Rights Reserved. 3 Most database management systems are organized around a single data model that determines how data can be organized, stored, and manipulated. In contrast, a multi- model database is designed to support multiple data models against a single, integrated backend. Source - https://en.wikipedia.org/wiki/Multi-model_database
  • 4.
    The evolution ofModels © 2015. All Rights Reserved. 4 Application Application Application (OLTP and OLAP) Data Access Abstraction RDBMS / SQL Polyglot Persistence Multi Model
  • 5.
    DSE Multi Model ©2015. All Rights Reserved. 5 Multiple (polygot) Models exposed via cohesive interaction mechanisms, (APIs) for OLTP and OLAP workloads, (mixed workload) with a unified persistence layer, (Apache Cassandra) providing GR, always on characteristics within a TCO efficient data base platform for a variety of use cases Cassandra
  • 6.
    ©2015 DataStax Confidential.Do not distribute without consent. 6 DataStax Enterprise – Wide Row / MVs C1 C2 MV1c1 MV1c2 Agg1c1 Agg2c1
  • 7.
    ©2015 DataStax Confidential.Do not distribute without consent. 7 DataStax Enterprise - JSON Inserting JSON data is easy Reading JSON data is easy Finding JSON errors is easy
  • 8.
    Introduction to GraphDatabases © 2015. All Rights Reserved. 8
  • 9.
    What is aGraph Database? ©2015 DataStax Confidential. Do not distribute without consent. •  Store, manage and query highly connected data •  Data stored as nodes (vertices), edges and properties to represent a domain model •  By explicitly embedding relationships in the data model, you store a more logical business model using the natural data access language Gremlin •  Think of a graph as
 a pre-joined
 database
  • 10.
    Choose the RightModel to Fit your Business Needs © 2015 DataStax, All Rights Reserved. Company Confidential 10 DSE Wide Row - Build and maintain models using CQL’s DDL features - Super-fast CRUD at scale with CQL DML features - Good option for denormalized schemas with high-throughput requirements - Perfect fit for IoT applications that require consuming enormous amounts of data with specific data retrieval requirements: product catalogs, high-volume messaging systems, collecting and storing sensor data DSE Graph - Flexible schema that is easy to modify and maintain with Gremlin - Clearly maps business semantics to a logical model for easy maintenance and understanding - Ideal model for highly-connected data models - Perfect fit for social-engagement models, recommendation engines and IT network / device management - Update and query the graph in real-time with easy-to-learn open-source Gremlin language - Good option for transitioning from slow RDBMS 3NF models with lots of JOINs
  • 11.
    What is DataStaxEnterprise (DSE) Graph? ©2015 DataStax •  Highly scalable graph database for modern web, mobile, and IoT applications that need to manage highly connected and heterogeneous data •  Built-in support for real-time search, and analytic graph queries via tight integration with the DSE platform •  A property graph model native inside the DataStax product, engineered specifically for Cassandra •  Store & find relationships in data fast and easy on huge graphs
  • 12.
    DSE Graph: Built-inScalability ©2015 DataStax •  Scale out Graph vs. Scale up only •  Graph partitioning built on Cassandra’s scale-out architecture •  Graph index structures integrated into Cassandra •  Domain model maps more naturally to data model, allowing for greater understanding between business and IT •  Traverse millions of relationships in a short period of time, faster than modeling the data in RDBMS •  Flexible data model that can be easily adapted to business changes
  • 13.
    DSE Graph: IntegratedSearch, Analytics and Ops ©2015 DataStax •  Real-time traversals over complex-structured graph data •  Integrates with DSE Search to mix search with traversal queries •  Integrates with DSE Analytics and Spark to support OLAP and breadth-first graph traversals •  Iterative graph analytics like PageRank or other centrality measures •  Reporting and aggregates over graph data •  Integrated with DataStax OpsCenter
  • 14.
    Graph Database UseCases © 2015. All Rights Reserved. 14
  • 15.
    Additional Graph UseCases 360 Degree View of Your Customer •  Collect massive amounts of data point about your customer •  Data collected from social networks, web analytics, digital ads, mobile devices, CRM •  Bring heterogeneous customer data together into DSE Graph •  Uncover buying patterns and customer behaviors •  Graph becomes a master data hub for customer data •  Use the graph customer hub to build better products for your target customers •  Keep the customers you already have with customer intelligence Customer 360 View Social Store Sensors Email CRM Mobile Weblogs
  • 16.
    Additional Graph UseCases IT Network and Device Management •  Allows IT to monitor, manage and protect corporate networks and devices (laptops, iPads, mobile phones, etc.) •  Requires understanding of network topologies and relationships between devices, interfaces, equipment, people, services … •  A traditional RDBMS would require expensive query-time joins •  A graph model intrinsically knows how to traverse the topology because the relationships are already stored •  This makes for quick & easy recognition of problems, root cause analysis and event correlation
  • 17.
    DataStax in theGraph Open Source Community TinkerPop / Gremlin © 2015. All Rights Reserved. 17
  • 18.
    DataStax Role inTinkerPop ©2015 DataStax •  DataStax utilizes the TinkerPop framework for the DSE Graph product •  DataStax will contribute to the TinkerPop community and is heavily invested in the success of the Gremlin language •  DataStax will provide resource guides, documentation, samples and training on building and querying graphs with Gremlin, using DSE Graph as the graph engine
  • 19.
    Gremlin The open-source standardgraph query language •  DataStax contributes and supports the Apache TinkerPop community, along with the Gremlin Graph query language •  g.V.hasLabel('person').as('a').out('knows').as('b').select('a','b').by('age') .by('age') "for all people in the graph, give me the ages of the people on each end of a friendship relationship“ •  g.V.has('name','marko').out('knows').out('mother').outE('worksFor'). has('time',between(2001,2002)).inV.name
 
 “what are the names of the places that marko's friends' mothers worked for from 2001 to 2002”
 Deep traversal == multiple levels of query-time joins in RDBMS
  • 20.
    Recommendation Query –RDBMS vs. Graph ©2015 DataStax SELECT TOP (5) [t14].[ProductName] FROM (SELECT COUNT(*) AS [value], [t13].[ProductName] FROM [customers] AS [t0] CROSS APPLY (SELECT [t9].[ProductName] FROM [orders] AS [t1] CROSS JOIN [order details] AS [t2] INNER JOIN [products] AS [t3] ON [t3].[ProductID] = [t2].[ProductID] CROSS JOIN [order details] AS [t4] INNER JOIN [orders] AS [t5] ON [t5].[OrderID] = [t4].[OrderID] LEFT JOIN [customers] AS [t6] ON [t6].[CustomerID] = [t5].[CustomerID] CROSS JOIN ([orders] AS [t7] CROSS JOIN [order details] AS [t8] INNER JOIN [products] AS [t9] ON [t9].[ProductID] = [t8].[ProductID]) WHERE NOT EXISTS(SELECT NULL AS [EMPTY] FROM [orders] AS [t10] CROSS JOIN [order details] AS [t11] INNER JOIN [products] AS [t12] ON [t12].[ProductID] = [t11].[ProductID] WHERE [t9].[ProductID] = [t12].[ProductID] AND [t10].[CustomerID] = [t0].[CustomerID] AND [t11].[OrderID] = [t10].[OrderID]) AND [t6].[CustomerID] <> [t0].[CustomerID] AND [t1].[CustomerID] = [t0].[CustomerID] AND [t2].[OrderID] = [t1].[OrderID] AND [t4].[ProductID] = [t3].[ProductID] AND [t7].[CustomerID] = [t6].[CustomerID] AND [t8].[OrderID] = [t7].[OrderID]) AS [t13] WHERE [t0].[CustomerID] = N'ALFKI' GROUP BY [t13].[ProductName]) AS [t14] ORDER BY [t14].[value] DESC g.V('customerId','ALFKI').as('customer') .out('ordered').out('contains').out('is').as('products') .in('is').in('contains').in('ordered').except('customer') .out('ordered').out('contains').out('is').except('products') .groupCount().cap().orderMap(T.decr)[0..<5].productName VS.
  • 21.