World’s Best Data Modeling Tool
for Apache Cassandra
Artem Chebotko, Andrey Kashlev
© 2015. All Rights Reserved.
1 Cassandra Data Modeling Methodology
2 The KDM Tool
3 Live Demo: IoT
4 Live Demo: Media Cataloguing
5 Future Work
Data Modeling Process
• Data requirements
• Application requirements
• Schema Design
• Optimization
Cassandra Data Modeling Methodology
Conceptual Data Model + Application Workflow → Mapping → Logical Data Model → Optimization → Physical Data Model
Methodology Models
Model | Representation
Conceptual Data Model | ERD
Application Workflow Model | Graph
Logical Data Model | Chebotko Diagram
Physical Data Model | Chebotko Diagram, CQL
Methodology Protocols
• Conceptual-to-logical mapping
– Mapping rules
– Mapping patterns
• Physical optimizations
– Partition size analysis
– Duplication factor analysis
– Keys, aggregation, transactions, …
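Partition size analysis can be illustrated with a small calculation. The slides do not give a formula, so the estimate below is an assumption: it counts the values stored in one partition as rows times regular columns, plus static columns, which are stored once per partition. The function name and the workload numbers are hypothetical.

```python
def partition_values(n_rows, n_cols, n_pk_cols, n_static_cols=0):
    """Estimate the number of values (cells) stored in one partition.

    n_rows        -- rows in the partition
    n_cols        -- total columns in the table
    n_pk_cols     -- primary key columns (partition key + clustering)
    n_static_cols -- static columns, stored once per partition
    """
    regular_cols = n_cols - n_pk_cols - n_static_cols
    return n_rows * regular_cols + n_static_cols


# Hypothetical workload: one reading per minute for a year in a single
# partition of a 5-column table with 4 primary key columns.
rows_per_partition = 60 * 24 * 365
cells = partition_values(rows_per_partition, n_cols=5, n_pk_cols=4)
print(cells)
```

If the estimate grows too large for the expected workload, a typical response is to add a column to the partition key so that each partition covers a smaller slice of the data.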
Example
Access pattern:
SELECT timestamp, value FROM …
WHERE location = ? AND parameter = ? AND timestamp > ?
ORDER BY timestamp DESC
Conceptual data model (ERD): Sensor (id, location) records Measurement (parameter, value, timestamp); one Sensor records n Measurements.
Conceptual-to-logical mapping, in five steps:
Step 1: Mapping Entity and Relationship Types. The table sensor_data holds the attributes of Sensor, Measurement, and the records relationship.
Step 2: Mapping Equality Search Attributes. location and parameter become partition key columns (K).
Step 3: Mapping Inequality Search Attributes. timestamp becomes a clustering column (C↑).
Step 4: Mapping Ordering Attributes. timestamp switches to descending order (C↓) to match ORDER BY timestamp DESC.
Step 5: Mapping Key Attributes. id is added as a clustering column (C↑) so that each row is uniquely identified.
Resulting logical data model (K = partition key column, C↑ / C↓ = clustering column in ascending / descending order):
sensor_data
location K
parameter K
timestamp C↓
id C↑
value
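The five mapping steps of the example can be sketched in a few lines of Python. This is an illustrative sketch, not the KDM implementation: the function name, its parameters, and the choice of id as the only key attribute are assumptions made for the example.

```python
def map_query_to_table(attributes, equality, inequality, ordering, key):
    """Return (column, marker) pairs in Chebotko-style notation.

    Step 1 is implicit: every attribute in `attributes` becomes a column.
    The dict below keeps primary key columns in the order they are added.
    """
    markers = {}
    # Step 2: equality search attributes become partition key columns.
    for col in equality:
        markers[col] = "K"
    # Step 3: inequality search attributes become clustering columns.
    for col in inequality:
        markers.setdefault(col, "C↑")
    # Step 4: ordering attributes set the clustering order.
    for col, direction in ordering:
        if markers.get(col) != "K":
            markers[col] = "C↓" if direction == "DESC" else "C↑"
    # Step 5: key attributes not yet in the primary key are appended
    # as clustering columns so that rows stay unique.
    for col in key:
        markers.setdefault(col, "C↑")
    # Remaining attributes are regular columns.
    regular = [(col, "") for col in attributes if col not in markers]
    return list(markers.items()) + regular


table = map_query_to_table(
    attributes=["id", "location", "parameter", "value", "timestamp"],
    equality=["location", "parameter"],
    inequality=["timestamp"],
    ordering=[("timestamp", "DESC")],
    key=["id"],  # assumed key attribute
)
# table == [("location", "K"), ("parameter", "K"),
#           ("timestamp", "C↓"), ("id", "C↑"), ("value", "")]
```

The order of the steps matters: the inequality step first adds timestamp in ascending order, and the ordering step then flips it to descending without changing its position in the primary key.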
Methodology Pros and Cons
Pros:
• Correctness
• Completeness
Cons:
• Complexity
• Time investment
Human Errors Happen …
Automation
Complexity
Time investment
Human Error
The KDM Tool
• Streamlines the methodology
• Guides the user
• Automates data modeling tasks:
– Conceptual-to-logical mapping
– Physical optimization
– CQL generation
KDM Automation Workflow
Step 1 (solution architect): Design Conceptual Data Model
Step 2 (solution architect): Specify Access Patterns
Automated (KDM): Generate Logical Data Models
Step 3 (solution architect): Select Logical Data Model
Automated (KDM): Generate Physical Data Model
Step 4 (solution architect): Configure Physical Data Model
Automated (KDM): Generate Physical Schema
Step 5 (solution architect): Download CQL Script
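The last automated stage turns a physical data model into a CQL script. A minimal sketch of what such a generator might do is shown below; it is not KDM's code. The function name, the role labels ("K", "C ASC", "C DESC"), and the column data types are assumptions, since the slides do not list types.

```python
def to_cql(table, columns):
    """Render a CREATE TABLE statement.

    columns -- list of (name, cql_type, role), where role is "K" for a
               partition key column, "C ASC" or "C DESC" for a clustering
               column, and "" for a regular column.
    """
    partition = [name for name, _, role in columns if role == "K"]
    clustering = [(name, role.split()[1])
                  for name, _, role in columns if role.startswith("C")]

    lines = [f"  {name} {cql_type}," for name, cql_type, _ in columns]
    primary_key = "(" + ", ".join(partition) + ")"
    if clustering:
        primary_key += ", " + ", ".join(name for name, _ in clustering)
    lines.append(f"  PRIMARY KEY ({primary_key})")

    cql = f"CREATE TABLE {table} (\n" + "\n".join(lines) + "\n)"
    if clustering:
        order = ", ".join(f"{name} {direction}" for name, direction in clustering)
        cql += f" WITH CLUSTERING ORDER BY ({order})"
    return cql + ";"


# sensor_data from the example; the data types are assumed.
cql = to_cql("sensor_data", [
    ("location", "text", "K"),
    ("parameter", "text", "K"),
    ("timestamp", "timestamp", "C DESC"),
    ("id", "uuid", "C ASC"),
    ("value", "double", ""),
])
print(cql)
```

Configuring the physical data model in Step 4 would correspond to choosing the type of each column before the script is rendered.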
Live Demo: IoT
[Slides 28-31: live demo screenshots. The access patterns are built up one at a time as Q1, Q2, and Q3, annotated with "given", "range", "find", and "find and sort DESC".]
Live Demo: Media Cataloguing
Summary
• KDM:
– automates most complex tasks
– eliminates human error
– simplifies data modeling
– guides the user
How Can KDM Help You?
• build new data models
• verify existing data models
• teach/learn data modeling
Future Work
• Materialized views
• User Defined Types
• Analysis and physical optimization
• Support for application workflow design
• Support for Chebotko Diagrams
Acknowledgements
• Andrey Kashlev would like to thank:
– Dr. Shiyong Lu
– Anthony Piazza
• Artem Chebotko would like to thank:
– Anthony Piazza
– Patrick McFadin
– Jonathan Ellis
– Tim Berglund
Thank you
