Patterns for Persistence and Streaming in Cloud Architectures
Jeff Carpenter
Director of Developer Advocacy
community.datastax.com | @jscarp
Photoshopping by Steve Halladay
© DataStax, All Rights Reserved.
Agenda
1 Context – Monolith to Microservices, On-Prem to Cloud
2 Selecting Infrastructure, Then and Now
3 Persistence Patterns – Featuring Cassandra
4 Persistence + Streaming – Featuring Kafka
5 Resources
Old School Enterprise Architecture
[Diagram labels: RDBMS (all tables, ACID transactions, joins, indexes); Monolithic Application; Other Apps; integration by database]
Transitional Architecture
[Diagram labels: RDBMS; Monolithic Application; Services; Other Apps; integration by API; NoSQL, NewSQL, RDBMS?]
Microservices in the Cloud
[Diagram labels: Clients; Applications; Services; On Prem DC; AWS DC A; AWS DC B; GCP DC]
2 Selecting Infrastructure, Then and Now
Tasks of the Architect
Defining Components and Interfaces
Identifying Patterns
Managing the –ilities
Making tradeoffs
Infrastructure Selection – Then
Infrastructure Selection – Now?
Quality Attribute Bingo - Then
• Performance • Scalability • Availability • Reliability
• Extensibility • Modularity • Reusability • Monitorability
• Deployability • Maintainability • Usability • Cost
Data Infrastructure Criteria - Now
DX
Performance
Availability
Security
Flexibility
Cost
Minimizing Cost of Change - Abstraction
[Diagram labels: Service (API, Business Logic, Data Access, Messaging); Database; Queue / Stream]
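The layering on this slide can be sketched in code. The following is a minimal illustration, not the speaker's implementation: the names `VideoStore`, `EventPublisher`, `VideoCatalogService`, and the `video-added` topic are hypothetical, and in-memory classes stand in for a real database driver and a real queue or stream client. The point is that business logic depends only on two narrow interfaces, so swapping the database or the messaging system means writing a new adapter rather than changing the service.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol, Tuple


@dataclass(frozen=True)
class Video:
    video_id: str
    user_id: str
    name: str


class VideoStore(Protocol):
    """Data access boundary: only implementations know which database is used."""

    def save(self, video: Video) -> None: ...

    def find_by_id(self, video_id: str) -> Optional[Video]: ...


class EventPublisher(Protocol):
    """Messaging boundary: hides the queue or stream technology."""

    def publish(self, topic: str, event: dict) -> None: ...


class InMemoryVideoStore:
    """Hypothetical stand-in for an adapter that wraps a real database driver."""

    def __init__(self) -> None:
        self._rows: Dict[str, Video] = {}

    def save(self, video: Video) -> None:
        self._rows[video.video_id] = video

    def find_by_id(self, video_id: str) -> Optional[Video]:
        return self._rows.get(video_id)


class InMemoryPublisher:
    """Hypothetical stand-in for an adapter that wraps a queue or stream client."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, dict]] = []

    def publish(self, topic: str, event: dict) -> None:
        self.events.append((topic, event))


class VideoCatalogService:
    """Business logic: depends on the two protocols, never on a concrete backend."""

    def __init__(self, store: VideoStore, publisher: EventPublisher) -> None:
        self._store = store
        self._publisher = publisher

    def add_video(self, video: Video) -> None:
        self._store.save(video)
        self._publisher.publish("video-added", {"video_id": video.video_id})

    def get_video(self, video_id: str) -> Optional[Video]:
        return self._store.find_by_id(video_id)
```

Constructing the service as `VideoCatalogService(InMemoryVideoStore(), InMemoryPublisher())` is enough for unit tests; a production deployment would pass adapters for the chosen database and stream instead.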
3 Persistence Patterns – Featuring Cassandra
Microservices and Polyglot Persistence
[Diagram: Service A with a tabular store for core application data; Service B with a key-value store (cache) for reference data; Service C with a document store for content; Service D with a graph store for highly networked data; Service E with a relational store for legacy, low volume data]
Apache Cassandra Overview
• First developed by Facebook
• Top-level Apache project since 2010
• Partitioned row store
• Distributed, decentralized
• Elastic scalability / high performance
• High availability / fault tolerant
• Tuneable consistency
• Cassandra Query Language (CQL)
Apache Cassandra ® Apache Software Foundation
KillrVideo – A video sharing application
https://github.com/KillrVideo
https://killrvideo.github.io
KillrVideo High Level Architecture
[Diagram labels: Your Browser; Web Application; KillrVideo Services]
Technology Choices
• Node.js
• Falcor
• Java / C# / Node.js / Python
• gRPC
• etcd
• DataStax Drivers
• DataStax Enterprise including Apache Cassandra & Spark, Graph
Deployment
• Download and run locally via Docker
• Deployed in AWS using DataStax Managed Services: http://killrvideo.com/
Application Workflow in KillrVideo
• User logs into site
• Show basic information about user
• Show videos added by a user
• Show comments posted by a user
• Search for a video by tag
• Show latest videos added to the site
• Show comments for a video
• Show ratings for a video
• Show video and its details
Queries in KillrVideo to Support Workflows
Users
• User logs into site
– Find user by email address
• Show basic information about user
– Find user by id
Comments
• Show comments for a video
– Find comments by video (latest first)
• Show comments posted by a user
– Find comments by user (latest first)
Ratings
• Show ratings for a video
– Find ratings by video
Designing Tables Based on Queries
• Show video and its details
– Find video by id
• Show videos added by a user
– Find videos by user (latest first)

-- Supports "Find video by id": one partition per video.
CREATE TABLE videos (
    videoid uuid,
    userid uuid,
    name text,
    description text,
    location text,
    location_type int,
    preview_image_location text,
    tags set<text>,
    added_date timestamp,
    PRIMARY KEY (videoid)
);

-- Supports "Find videos by user (latest first)": one partition per user,
-- with rows clustered newest first.
CREATE TABLE user_videos (
    userid uuid,
    added_date timestamp,
    videoid uuid,
    name text,
    preview_image_location text,
    PRIMARY KEY (userid, added_date, videoid)
) WITH CLUSTERING ORDER BY (added_date DESC, videoid ASC);
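To make the query-first idea concrete, here is a small in-memory simulation of the two tables. It is only a sketch under stated assumptions: plain Python dictionaries stand in for Cassandra partitions, the function names are hypothetical, and no driver or cluster is involved. It shows the two consequences of the design: every new video is written twice (once per query), and the per-user read comes back newest first because that is how the rows are clustered.

```python
from datetime import datetime
from typing import Dict, List, Optional
from uuid import UUID

# Stand-ins for the two tables (not a real Cassandra client).
videos: Dict[UUID, dict] = {}             # keyed like PRIMARY KEY (videoid)
user_videos: Dict[UUID, List[dict]] = {}  # one list per userid partition


def add_video(videoid: UUID, userid: UUID, name: str, added_date: datetime) -> None:
    """Denormalized write: the same video is stored once per supported query."""
    videos[videoid] = {
        "videoid": videoid,
        "userid": userid,
        "name": name,
        "added_date": added_date,
    }
    user_videos.setdefault(userid, []).append(
        {"userid": userid, "added_date": added_date, "videoid": videoid, "name": name}
    )


def find_video_by_id(videoid: UUID) -> Optional[dict]:
    """Query: find video by id (single-partition lookup)."""
    return videos.get(videoid)


def find_videos_by_user(userid: UUID, limit: int = 10) -> List[dict]:
    """Query: find videos by user, latest first."""
    rows = user_videos.get(userid, [])
    # Emulate CLUSTERING ORDER BY (added_date DESC, videoid ASC) with two
    # stable sorts: secondary key first, then primary key in reverse.
    by_videoid = sorted(rows, key=lambda row: row["videoid"])
    ordered = sorted(by_videoid, key=lambda row: row["added_date"], reverse=True)
    return ordered[:limit]
```

In Cassandra the ordering is maintained on disk at write time rather than computed at read time, which is why the read stays cheap as the partition grows; the sort here only imitates the result.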
Delivery Models for Cloud (Data) Infrastructure
Enterprise Versions
• Pro
– Certification and Support
– Additional features
– Security
• Con
– Licensing cost
– Cost of change
Open Source
• Pro
– Free for dev and prod
– Visibility and modifiability
• Con
– Cost to maintain expertise
– Dependence on community
Managed Services
• Pro
– Ease of adoption
– Lowest time to prod
– Pay as you go
• Con
– Observability obscured
– Cost management
Comparing Scale-out Databases
Microservices and Polyglot Persistence (revisited)
Should a Service be Polyglot?
[Diagram labels: Hotel Service; Cassandra as primary store (tabular); key-value store (Redis, etc.) for name-to-ID mapping?]
Emerging - Multi-model Databases
[Diagram labels: Service A; Service B; Service C; Service D; DSE database accessed through CQL, key-value semantics, and JSON; DSE Graph accessed through Gremlin]
4 Persistence + Streaming – Featuring Kafka
Apache Kafka Overview
• First developed by LinkedIn
• Top-level Apache Project since 2012
• Distributed streaming platform
• Used for real-time data pipelines and streaming applications
• Horizontal scalability / high performance
• High availability / fault tolerance
• Stream persistence and querying (KSQL)
• Connect framework
Apache Kafka ® Apache Software Foundation
Kafka Concepts
• Topics
– Collection of key/value pairs
– Append-only
– Can be partitioned
• Producers
• Consumers
– Separate offsets
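The concepts on this slide can be illustrated with a toy model. This is a sketch, not Kafka: the `Topic` and `Consumer` classes are hypothetical, everything lives in memory, and CRC32 is used as a stand-in for the key hash (Kafka's Java client uses a murmur2 hash by default). It shows three properties from the list: records are only ever appended, a key always lands in the same partition, and each consumer tracks its own offsets.

```python
import zlib
from typing import List, Tuple

Record = Tuple[str, str]  # (key, value)


class Topic:
    """Toy append-only, partitioned log of key/value records."""

    def __init__(self, name: str, num_partitions: int = 3) -> None:
        self.name = name
        self.partitions: List[List[Record]] = [[] for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # Stand-in hash: the same key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key: str, value: str) -> Tuple[int, int]:
        """Append a record and return its (partition, offset)."""
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1


class Consumer:
    """Toy consumer that keeps its own offset per partition."""

    def __init__(self, topic: Topic) -> None:
        self.topic = topic
        self.offsets: List[int] = [0] * len(topic.partitions)

    def poll(self) -> List[Record]:
        """Return every record not yet seen by this consumer."""
        records: List[Record] = []
        for partition, log in enumerate(self.topic.partitions):
            records.extend(log[self.offsets[partition]:])
            self.offsets[partition] = len(log)
        return records
```

Because offsets belong to the consumer rather than the topic, a second consumer created later can still read the log from the beginning, which is what lets several services react independently to the same stream.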
Kafka Concepts
• Streams applications
– Combined Producer/Consumer
• KSQL
– Query language used by stream applications
Kafka Concepts
• Brokers
• Clusters
• Connect Framework
– Sources
– Sinks
Cassandra + Kafka – Similarities and Distinctives
• Concepts in common
– Distributed Systems
– Partitioning / Hashing
– Replication
• Slight differences in implementation
– Multi-DC
– Log-structured
– TTL / retention
• Cassandra excels at…
– High volume, write intensive data storage workloads at scale
– Suitable as a system of record
– High performance searching via DSE
• Kafka excels at…
– Streaming data to/from services and legacy data sources
– Acting upon changes in data from multiple sources (aka pipelines)
Better Together – using the best of both
Pattern 1: Cassandra + Kafka in Microservices
[Diagram: Some Producer; My microservice (consume topic(s), read / write data in DataStax Enterprise, publish to topic(s)); Other consumers]
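The three steps of Pattern 1 (consume, read / write, publish) can be sketched as a single handler. This is an illustration under stated assumptions, not the speaker's code: a `deque` stands in for the inbound topic, a dictionary for the database table, and a list for the outbound topic. The `UserRatedVideo` event name comes from the KillrVideo example; the `VideoRatingUpdated` event and all field names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List


def handle_rating_events(
    inbound: Deque[dict],
    ratings_table: Dict[str, dict],
    outbound: List[dict],
) -> int:
    """Drain the inbound topic, update stored ratings, and publish the results."""
    processed = 0
    while inbound:
        # 1. Consume from the topic.
        event = inbound.popleft()
        if event.get("type") != "UserRatedVideo":
            continue  # ignore events this service does not handle

        # 2. Read / write data in the service's own store.
        row = ratings_table.setdefault(
            event["videoid"], {"rating_count": 0, "rating_total": 0}
        )
        row["rating_count"] += 1
        row["rating_total"] += event["rating"]

        # 3. Publish a follow-up event for other consumers.
        outbound.append(
            {
                "type": "VideoRatingUpdated",
                "videoid": event["videoid"],
                "average": row["rating_total"] / row["rating_count"],
            }
        )
        processed += 1
    return processed
```

The service owns its table and communicates with the rest of the system only through topics, so other consumers can react to `VideoRatingUpdated` without reading this service's database.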
Cassandra + Kafka – KillrVideo Example
[Diagram: KillrVideo Services (User Management, Video Catalog, Ratings) read and write data in DataStax Enterprise; events UserCreated, YouTubeVideoAdded, and UserRatedVideo; Suggested Videos Service populates DSE Graph and runs a graph recommender traversal]
Pattern 2: Kafka into Cassandra
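As a rough illustration of what a sink does in this pattern, the sketch below maps Kafka-style records onto table rows. It is not the DataStax Kafka Connector and does not use its configuration syntax: the mapping format, the function names, and the record layout are all assumptions made for the example, and a dictionary stands in for the Cassandra table. It relies on one real property of Cassandra, that a write to an existing primary key overwrites the row (an upsert), which makes replaying a record harmless.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical mapping: table column -> (record part, field name).
VIDEO_MAPPING: Dict[str, Tuple[str, str]] = {
    "videoid": ("key", "videoid"),
    "userid": ("value", "userid"),
    "name": ("value", "name"),
}


def record_to_row(record: dict, mapping: Dict[str, Tuple[str, str]]) -> dict:
    """Pick fields out of the record's key and value to build one table row."""
    return {column: record[part][field] for column, (part, field) in mapping.items()}


def sink(
    records: Iterable[dict],
    table: Dict[str, dict],
    mapping: Dict[str, Tuple[str, str]],
    primary_key: str = "videoid",
) -> None:
    """Write each record into the table, overwriting any row with the same key."""
    for record in records:
        row = record_to_row(record, mapping)
        table[row[primary_key]] = row  # upsert: replayed records do not duplicate
```

Because the write is keyed on the primary key, delivering the same record twice leaves a single row, so the sink tolerates at-least-once delivery from the topic.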
Takeaways
Flexibility in selection of databases per microservice
Select and deploy infrastructure based on scale
Use queues to coordinate data synchronization
Use abstraction to minimize the cost of change
5 Resources
DataStax Academy
• Free self-paced courses
• DS201: Apache Cassandra™
• DS210: Operations
• DS220: Data Modeling
• DS310: Search
• DS320: Analytics
• DS330: Graph
• Kafka Connector Getting Started
https://academy.datastax.com
Docker and DataStax
• Where
– https://hub.docker.com/u/datastax/
– https://github.com/datastax/docker-images/tree/master/datastax-docker-image-examples
• We provide
– Docker images for DSE, Studio, OpsCenter
– Docker Compose configuration files
– Sample deployments
• We support
– Installation on dev before 6.7
– Installation on prod from 6.7 (December 2018)
Live Coding on Twitch
• Live coding sessions with advocates and guests
• Working through the challenges of building distributed systems
• Join the conversation and ask questions
• Twitch Rewind: Kafka Connector
– https://www.youtube.com/watch?v=2_BidDK5zGE
https://www.twitch.tv/datastaxacademy
Resources – DataStax Kafka Connector
• Blog
– https://www.datastax.com/2018/12/introducing-the-datastax-apache-kafka-connector
• Download
– https://academy.datastax.com/downloads#connectors
• Docs
– https://docs.datastax.com/en/kafka/doc/index.html
• Demonstration
– https://github.com/clun/kafka-dse/tree/driver2
• Examples
– https://github.com/datastax/kafka-examples
Thank You!
Come visit our booth!

Data Con LA 2019 - Patterns for Persistence and Streaming in Cloud Architectures by Jeffrey Carpenter


Editor's Notes

  • #3 Architect – distributed systems, will share mistakes Author advocate
  • #6 As of just a few years ago, most application development used a single primary data store based on a relational database, plus the occasional file-based storage for other data. This seemed great because you could have all your data in one place, and even have transactions with ACID semantics spanning multiple tables. You could add any indexes you wanted and perform complex joins across tables. These databases worked so well that sometimes we were even tempted to use them as the interface between systems. This “integration by database” came to be considered an anti-pattern as we realized how brittle these integrations were, usually when we updated an application database only to find that it broke other apps. This architecture served us well for many years, and is still appropriate for some applications. The problem is that it’s entirely inappropriate for cloud-scale applications.
  • #7 Strangler pattern for getting rid of monoliths
  • #8 Decomposed into microservices More lightweight/flexible front end apps, webapps Independently scalable Deploy across multiple datacenters Data movement becomes an important factor
  • #9 Multi-cloud demo
  • #11 Modified from Eben Components and interfaces – including API definition and defining patterns
  • #12 How I grew up Based on trade studies Based on enterprise license agreements Major contracts, Multi-year agreements Corporate governance, Guidance documents
  • #13 Now, the wild west? DevOps – you build it, you maintain it Not everyone does this
  • #14 Matrixed decision making
  • #15 Now, a progression Developers start the food chain What do we encounter, when
  • #16 The bogeyman We all use it
  • #17 Lock in is lazy Need a better discourse Cost to adopt, cost to operate, cost to change
  • #18 Soapbox – regardless of infrastructure choices Use a modular design within the service to isolate concerns If the API style changes If the database changes If the queue or stream changes Discoverable endpoints, well known names – if the deployment changes
  • #20 So we had to come up with new architectural approaches to deal with this new world of massively scalable, distributed systems. The microservice architecture approach has become very popular for building cloud-based applications, and for good reason. The ability to develop, manage and scale services independently gives us a ton of flexibility. One axis of this flexibility is known by the term “polyglot persistence”, as popularized by Martin Fowler and others. In this approach, services are developed by separate teams, and each team is free to use whatever storage mechanism seems most appropriate to them. So one team developing Service A might choose to use Cassandra because it is managing core application data that really fits that tabular format, while Service B’s data supports very simple semantics of looking up reference values by well-known keys. Another service, Service C, might be primarily concerned with serving up content for a website and use a document store. Yet another, Service D, might be all about navigating complex relationships between data, such as customer data and relationships. We might also have a legacy system or service that uses relational technology, or perhaps a service that manages low-volume data that doesn’t change often, so a relational database might be a good fit for that. Note that I’m not trying to constrain our trade space or specify a particular design; I’m just trying to highlight the strengths of each of these styles of database and why a multi-model approach to cloud architecture can be attractive.
  • #21 We built DSE to address these cloud application characteristics. They help describe why we were major contributors behind Apache Cassandra and other open source technologies. And it’s why this technology has been applied successfully in many companies. Cassandra was first developed at Facebook (explain other details). DataStax Enterprise is our distribution of Apache Cassandra. We say this distribution is the best because of the additional security and performance features we put into the core database and the testing and hardening we do.
  • #22 What is KillrVideo? A great way to learn concepts
  • #24 So what is behind the system We built a microservice application
  • #25 Using this to demonstrate how we identify data models and services
  • #27 Service identification – wrapping a service around video tables
  • #30 Revisiting this – at what levels do we apply polyglot persistence concepts?
  • #31 It’s also possible that we could design a service that actually sits on top of multiple databases, although in that case I’d definitely want to make sure that we’re not over-engineering or building a service whose scope of work is too large. Probably not a good combination
  • #32 Another way to think about this problem is to consider that our database itself could be a multi-model database, that is, a database that supports different models of interaction. For example, DataStax Enterprise is built on top of the most performant, hardened distribution of Apache Cassandra. Services could interact with the core Cassandra directly using CQL. Although DSE does not provide a key-value API, you can interact with it as a key-value store. DSE does provide document-style interaction in terms of JSON documents. DSE Graph is a highly scalable graph database that is built directly on top of DSE core Cassandra and supports the popular Gremlin API. Why would we want something like this? I can think of two primary reasons. First, Cassandra has demonstrated that it can handle the massive scalability and performance required of cloud applications, which is not necessarily true of other databases. DSE provides that hardened, stable, reliable implementation of Cassandra as a solid foundation for your cloud application. Second, using a multi-model database approach with DSE can help us with operational simplicity. Even if different development teams are using different APIs and modes of interaction with the backend database platform, we can gain efficiency by only having a single platform to manage.
  • #38 Replication – but subtle
  • #42 A third pattern is possible – Cassandra as a destination system