SlideShare a Scribd company logo
Five Lessons
in Distributed Databases
Jonathan Ellis
CTO, DataStax
1 © DataStax, All Rights Reserved. Confidential
© DataStax, All Rights Reserved.
1. If it’s not SQL, it’s not a database
© DataStax, All Rights Reserved.
A brief history of NoSQL
● Early 2000s: people hit limits on vertical scaling, start
sharding RDBMSes
● 2006, 2007: BigTable, Dynamo papers
● 2008-2010: Explosion of scale-out systems
○ Voldemort, Riak, Dynomite, FoundationDB, CouchDB
○ Cassandra, HBase, MongoDB
© DataStax, All Rights Reserved.
One small problem
© DataStax, All Rights Reserved.
Cassandra’s experience
● Thrift RPC “drivers” too low level
● Fragmented: Hector, Pelops, Astyanax
● Inconsistent across language ecosystems
© DataStax, All Rights Reserved.
© DataStax, All Rights Reserved.
Solution: CQL
● 2011: Cassandra 0.8 introduces CQL 1.0
● 2012: Cassandra 1.1 introduces CQL 3.0
● 2013: Cassandra 1.2 adds collections
© DataStax, All Rights Reserved.
Today
● Cassandra: CQL
● CosmosDB: “SQL”
● Cloud Spanner: “SQL”
● Couchbase: N1QL
● HBase: Phoenix SQL (Java only)
● DynamoDB: REST/JSON
● MongoDB: BSON
© DataStax, All Rights Reserved.
2. It takes 5+ years to build a database
© DataStax, All Rights Reserved.
Curt Monash
Rule 1: Developing a good DBMS requires 5-7 years and
tens of millions of dollars.
That’s if things go extremely well.
Rule 2: You aren’t an exception to Rule 1.
© DataStax, All Rights Reserved.
Aside: Mistakes I made starting DataStax
● Stayed at Rackspace too long
● Raised a $2.5M series A
● Waited a year to get serious about enterprise sales
● Changed the company name
● Brisk
© DataStax, All Rights Reserved.
Examples (Curt)
● Concurrent workloads benchmarked in the lab are poor
predictors of concurrent performance in real life.
● Mixed workload management is harder than you’re
assuming it is.
● Those minor edge cases in which your Version 1
product works poorly aren’t minor after all.
© DataStax, All Rights Reserved.
Examples (Cassandra)
● Hinted handoff
● Repair
● Counters
● Paxos
● Test suite
© DataStax, All Rights Reserved.
Aside: Fallout (Jepsen at Scale)
● Ensemble - A set of clusters that is brought up/torn
down each test
○ Server Cluster - Cassandra/DSE
○ Client Cluster - Load Generators
○ Observer Cluster - Records live information from clusters (OpsCenter/Graphite)
○ Controller - Fallout
● Workload - The guts of the test
○ Phases - Run sequentially. Contains one or more modules that run in parallel for that
phase
○ Checkers - Run after all phases and verify the data emitted from modules.
○ Artifact Checkers - Runs against collected artifacts to look for correctness/problems
© DataStax, All Rights Reserved.
A simple Fallout workload
ensemble:
server:
node.count: 3
provisioner:
name: local
configuration_manager:
name: ccm
properties:
cassandra.version: 3.0.0
client: server #use server cluster
phases:
- insert_workload:
module: stress
properties:
iterations: 1m
type: write
rf: 3
gossip_updown:
module: nodetool
properties:
command: disablegossip
secondary.command: enablegossip
sleep.seconds: 10
sleep.randomize: 20
- read_workload:
module: stress
properties:
iterations: 1m
type: read
checkers:
verify_success:
checker: nofail
1. Start 3 node ccm cluster.
2. Insert data while bringing
gossip on the nodes up and
down.
3. Read/Check the data.
4. Verify none of the steps
failed.
Note: to move from ccm to ec2
we only need to change the
ensemble section.
© DataStax, All Rights Reserved.
5-7 years?
● Cassandra became Apache TLP in Feb 2010
● 3.0 released Fall 2015
● OSS is about adoption, not saving time/money
© DataStax, All Rights Reserved.
3. The customer is always right
© DataStax, All Rights Reserved.
Example: sequential scans
SELECT * FROM user_purchases
WHERE purchase_date > 2000
© DataStax, All Rights Reserved.
What’s wrong with this query?
For 100,000 purchases, nothing.
For 100,000,000 purchases, you’ll crash the server
(in 2012).
© DataStax, All Rights Reserved.
Solution (2012): ALLOW FILTERING
SELECT * FROM user_purchases
WHERE purchase_date > 2000
ALLOW FILTERING
© DataStax, All Rights Reserved.
Better solution (2013): Paging
● Build resultset incrementally and “page” it to the client
© DataStax, All Rights Reserved.
Example: tombstones
INSERT INTO foo VALUES (1254, …)
DELETE FROM foo WHERE id = 1254
…
SELECT * FROM foo
© DataStax, All Rights Reserved.
Solution (2013)
tombstone_warn_threshold: 1000
tombstone_failure_threshold: 100000
© DataStax, All Rights Reserved.
Better Solution (???): It’s complicated
● Track repair status to get rid of GCGS
● Bring time-to-repair from “days” to “hours”
● Optional: improve time-to-compaction
© DataStax, All Rights Reserved.
Example: joins
● CQL doesn’t support joins
● People still use client-side joins instead of
denormalizing
© DataStax, All Rights Reserved.
Solution (2015-???): MV
● Make it easier to denormalize
© DataStax, All Rights Reserved.
Better solution (???): actually add joins
● Less controversial: shared partition joins
● More controversial: cross-partition
● CosmosDB, Spanner
© DataStax, All Rights Reserved.
A note on configurability
© DataStax, All Rights Reserved.
4. Too much magic is a bad thing
© DataStax, All Rights Reserved.
Not (just) about vendors overpromising
● “Our database isn’t subject to the limits of the CAP
theorem”
● “Our queue can guarantee exactly once delivery”
● “We’ll give you 99.99% uptime*”
© DataStax, All Rights Reserved.
Magic can be bad even when it works
© DataStax, All Rights Reserved.
Cloud Spanner analysis excerpt
Spanner’s architecture implies that writes will be significantly slower
than reads due to the need to coordinate across multiple replicas and
avoid overlapping time bounds, and that is what we see in the original
2012 Spanner paper.
… Besides write performance in isolation, because Spanner uses
pessimistic locking to achieve ACID, reads are locked out of rows
(partitions?) that are in the process of being updated. Thus, write
performance challenges can spread to causing problems with reads as
well.
© DataStax, All Rights Reserved.
Cloud Spanner
© DataStax, All Rights Reserved.
Auto-scaling in DynamoDB
● Request capacity tied to “partitions” [pp]
○ pp count = max (rc / 3000, wc / 1000, st / 10 GB)
● Subtle implication: capacity / pp decreases as storage
volume increases
○ Non-uniform: pp request capacity halved when shard splits
● Subtle implication 2: bulk loads will wreck your planning
© DataStax, All Rights Reserved.
“Best practices for tables”
● Bulk load 200M items = 200 GB
● Target 60 minutes = 55,000 write capacity = 55 pps
● Post bulk load steady state
● 1000 req/s = 2 req/pp = 2 req/(3.6M items)
● No way to reduce partition count
© DataStax, All Rights Reserved.
Ravelin, 2017
You construct a table which uses a customer ID as partition key. You
know your customer ID’s are unique and should be uniformly
distributed across nodes. Your business has millions of customers and
no single customer can do so many actions so quickly that the
individual could create a hot key. Under this key you are storing around
2KB of data.
This sounds reasonable.
This will not work at scale in DynamoDb.
© DataStax, All Rights Reserved.
How much magic is too much?
● Joins: Apparently okay
● Auto-scaling: Apparently also okay
● Automatic partitioning: not okay
● Really slow ACID: not okay (?)
● Why?
● How do we make the system more transparent without
inflicting an unnecessary level of detail on the user?
© DataStax, All Rights Reserved.
5. It’s the cloud, stupid
© DataStax, All Rights Reserved.
September 2011
© DataStax, All Rights Reserved.
March 2012
© DataStax, All Rights Reserved.
March 2012
© DataStax, All Rights Reserved.
March 2012
© DataStax, All Rights Reserved.
The cloud is here. Now what?
© DataStax, All Rights Reserved.
Cloud-first architecture
“The second trend will be the increased
prevalence of shared-disk distributed
DBMS. By “shared-disk” I mean a DBMS
that uses a distributed storage layer as its
primary storage location, such as HDFS or
Amazon’s EBS/S3 services. This
separates the DBMS’s storage layer from
its execution nodes. Contrast this with a
shared-nothing DBMS architecture where
each execution node maintains its own
storage.”
© DataStax, All Rights Reserved.
Cloud-first infrastructure
● What on-premises infrastructure can provide a
cloud-like experience?
● Kubernetes?
● OpenStack?
© DataStax, All Rights Reserved.
Cloud-first development
● Is a yearly (bi-yearly?) release process the right
cadence for companies building cloud services?
© DataStax, All Rights Reserved.
Cloud-first OSS
● What does OSS look like when you don’t work for the
big three clouds?
● “Commons Clause” is an attempt to deal with this
○ (What about AGPL?)
© DataStax, All Rights Reserved.
Summary
1. If it’s not SQL, it’s not a database.
2. It takes 5+ years to build a database.
3. Listen to your users.
4. Too much magic is a bad thing.
5. It’s the cloud, stupid.

More Related Content

What's hot

Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
DataStax
 
Everyday I’m scaling... Cassandra
Everyday I’m scaling... CassandraEveryday I’m scaling... Cassandra
Everyday I’m scaling... Cassandra
Instaclustr
 
Clustering van IT-componenten
Clustering van IT-componentenClustering van IT-componenten
Clustering van IT-componenten
Richard Claassens CIPPE
 
Cassandra Summit 2014: Monitor Everything!
Cassandra Summit 2014: Monitor Everything!Cassandra Summit 2014: Monitor Everything!
Cassandra Summit 2014: Monitor Everything!
DataStax Academy
 
Cassandra Tuning - above and beyond
Cassandra Tuning - above and beyondCassandra Tuning - above and beyond
Cassandra Tuning - above and beyond
Matija Gobec
 
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...
DataStax
 
Empowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with AlternatorEmpowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with Alternator
ScyllaDB
 
Processing 50,000 events per second with Cassandra and Spark
Processing 50,000 events per second with Cassandra and SparkProcessing 50,000 events per second with Cassandra and Spark
Processing 50,000 events per second with Cassandra and Spark
Ben Slater
 
Performance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migrationPerformance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migration
Ramkumar Nottath
 
Understanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsUnderstanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsAcunu
 
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
ScyllaDB
 
Micro-batching: High-performance Writes (Adam Zegelin, Instaclustr) | Cassand...
Micro-batching: High-performance Writes (Adam Zegelin, Instaclustr) | Cassand...Micro-batching: High-performance Writes (Adam Zegelin, Instaclustr) | Cassand...
Micro-batching: High-performance Writes (Adam Zegelin, Instaclustr) | Cassand...
DataStax
 
Cassandra Community Webinar | Data Model on Fire
Cassandra Community Webinar | Data Model on FireCassandra Community Webinar | Data Model on Fire
Cassandra Community Webinar | Data Model on Fire
DataStax
 
Webinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraWebinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache Cassandra
DataStax
 
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
DataStax
 
Performance Testing: Scylla vs. Cassandra vs. Datastax
Performance Testing: Scylla vs. Cassandra vs. DatastaxPerformance Testing: Scylla vs. Cassandra vs. Datastax
Performance Testing: Scylla vs. Cassandra vs. Datastax
ScyllaDB
 
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
DataStax
 
Micro-batching: High-performance writes
Micro-batching: High-performance writesMicro-batching: High-performance writes
Micro-batching: High-performance writes
Instaclustr
 
London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0jbellis
 

What's hot (19)

Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
Productizing a Cassandra-Based Solution (Brij Bhushan Ravat, Ericsson) | C* S...
 
Everyday I’m scaling... Cassandra
Everyday I’m scaling... CassandraEveryday I’m scaling... Cassandra
Everyday I’m scaling... Cassandra
 
Clustering van IT-componenten
Clustering van IT-componentenClustering van IT-componenten
Clustering van IT-componenten
 
Cassandra Summit 2014: Monitor Everything!
Cassandra Summit 2014: Monitor Everything!Cassandra Summit 2014: Monitor Everything!
Cassandra Summit 2014: Monitor Everything!
 
Cassandra Tuning - above and beyond
Cassandra Tuning - above and beyondCassandra Tuning - above and beyond
Cassandra Tuning - above and beyond
 
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...
 
Empowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with AlternatorEmpowering the AWS DynamoDB™ application developer with Alternator
Empowering the AWS DynamoDB™ application developer with Alternator
 
Processing 50,000 events per second with Cassandra and Spark
Processing 50,000 events per second with Cassandra and SparkProcessing 50,000 events per second with Cassandra and Spark
Processing 50,000 events per second with Cassandra and Spark
 
Performance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migrationPerformance tuning - A key to successful cassandra migration
Performance tuning - A key to successful cassandra migration
 
Understanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problemsUnderstanding Cassandra internals to solve real-world problems
Understanding Cassandra internals to solve real-world problems
 
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
 
Micro-batching: High-performance Writes (Adam Zegelin, Instaclustr) | Cassand...
Micro-batching: High-performance Writes (Adam Zegelin, Instaclustr) | Cassand...Micro-batching: High-performance Writes (Adam Zegelin, Instaclustr) | Cassand...
Micro-batching: High-performance Writes (Adam Zegelin, Instaclustr) | Cassand...
 
Cassandra Community Webinar | Data Model on Fire
Cassandra Community Webinar | Data Model on FireCassandra Community Webinar | Data Model on Fire
Cassandra Community Webinar | Data Model on Fire
 
Webinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraWebinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache Cassandra
 
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
 
Performance Testing: Scylla vs. Cassandra vs. Datastax
Performance Testing: Scylla vs. Cassandra vs. DatastaxPerformance Testing: Scylla vs. Cassandra vs. Datastax
Performance Testing: Scylla vs. Cassandra vs. Datastax
 
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
Tales From the Field: The Wrong Way of Using Cassandra (Carlos Rolo, Pythian)...
 
Micro-batching: High-performance writes
Micro-batching: High-performance writesMicro-batching: High-performance writes
Micro-batching: High-performance writes
 
London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0London + Dublin Cassandra 2.0
London + Dublin Cassandra 2.0
 

Similar to Five Lessons in Distributed Databases

Data day texas: Cassandra and the Cloud
Data day texas: Cassandra and the CloudData day texas: Cassandra and the Cloud
Data day texas: Cassandra and the Cloud
jbellis
 
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
DataStax
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
Pollfish
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
Stavros Kontopoulos
 
Galaxy Big Data with MariaDB
Galaxy Big Data with MariaDBGalaxy Big Data with MariaDB
Galaxy Big Data with MariaDB
MariaDB Corporation
 
Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration
Infosys Ltd: Performance Tuning - A Key to Successful Cassandra MigrationInfosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration
Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration
DataStax Academy
 
Extreme Availability using Oracle 12c Features: Your very last system shutdown?
Extreme Availability using Oracle 12c Features: Your very last system shutdown?Extreme Availability using Oracle 12c Features: Your very last system shutdown?
Extreme Availability using Oracle 12c Features: Your very last system shutdown?
Toronto-Oracle-Users-Group
 
Healthcare Claim Reimbursement using Apache Spark
Healthcare Claim Reimbursement using Apache SparkHealthcare Claim Reimbursement using Apache Spark
Healthcare Claim Reimbursement using Apache Spark
Databricks
 
DataStax 6 and Beyond
DataStax 6 and BeyondDataStax 6 and Beyond
DataStax 6 and Beyond
David Jones-Gilardi
 
Safer restarts, faster streaming, and better repair, just a glimpse of cassan...
Safer restarts, faster streaming, and better repair, just a glimpse of cassan...Safer restarts, faster streaming, and better repair, just a glimpse of cassan...
Safer restarts, faster streaming, and better repair, just a glimpse of cassan...
Vinay Kumar Chella
 
MySQL Cluster (NDB) - Best Practices Percona Live 2017
MySQL Cluster (NDB) - Best Practices Percona Live 2017MySQL Cluster (NDB) - Best Practices Percona Live 2017
MySQL Cluster (NDB) - Best Practices Percona Live 2017
Severalnines
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
DataStax
 
Srimanta_Maji_Oracle_DBA
Srimanta_Maji_Oracle_DBASrimanta_Maji_Oracle_DBA
Srimanta_Maji_Oracle_DBASRIMANTA MAJI
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
DataStax
 
BigData as a Platform: Cassandra and Current Trends
BigData as a Platform: Cassandra and Current TrendsBigData as a Platform: Cassandra and Current Trends
BigData as a Platform: Cassandra and Current Trends
Matthew Dennis
 
AquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks PresentationAquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks PresentationAquaQ Analytics
 
Scaling DataStax in Docker
Scaling DataStax in DockerScaling DataStax in Docker
Scaling DataStax in Docker
DataStax
 
MinneBar 2013 - Scaling with Cassandra
MinneBar 2013 - Scaling with CassandraMinneBar 2013 - Scaling with Cassandra
MinneBar 2013 - Scaling with Cassandra
Jeff Smoley
 
Managing 100s of PetaBytes of data in Cloud
Managing 100s of PetaBytes of data in CloudManaging 100s of PetaBytes of data in Cloud
Managing 100s of PetaBytes of data in Cloud
lohitvijayarenu
 

Similar to Five Lessons in Distributed Databases (20)

Data day texas: Cassandra and the Cloud
Data day texas: Cassandra and the CloudData day texas: Cassandra and the Cloud
Data day texas: Cassandra and the Cloud
 
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
 
Galaxy Big Data with MariaDB
Galaxy Big Data with MariaDBGalaxy Big Data with MariaDB
Galaxy Big Data with MariaDB
 
Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration
Infosys Ltd: Performance Tuning - A Key to Successful Cassandra MigrationInfosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration
Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration
 
Extreme Availability using Oracle 12c Features: Your very last system shutdown?
Extreme Availability using Oracle 12c Features: Your very last system shutdown?Extreme Availability using Oracle 12c Features: Your very last system shutdown?
Extreme Availability using Oracle 12c Features: Your very last system shutdown?
 
Healthcare Claim Reimbursement using Apache Spark
Healthcare Claim Reimbursement using Apache SparkHealthcare Claim Reimbursement using Apache Spark
Healthcare Claim Reimbursement using Apache Spark
 
DataStax 6 and Beyond
DataStax 6 and BeyondDataStax 6 and Beyond
DataStax 6 and Beyond
 
Safer restarts, faster streaming, and better repair, just a glimpse of cassan...
Safer restarts, faster streaming, and better repair, just a glimpse of cassan...Safer restarts, faster streaming, and better repair, just a glimpse of cassan...
Safer restarts, faster streaming, and better repair, just a glimpse of cassan...
 
MySQL Cluster (NDB) - Best Practices Percona Live 2017
MySQL Cluster (NDB) - Best Practices Percona Live 2017MySQL Cluster (NDB) - Best Practices Percona Live 2017
MySQL Cluster (NDB) - Best Practices Percona Live 2017
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
 
Srimanta_Maji_Oracle_DBA
Srimanta_Maji_Oracle_DBASrimanta_Maji_Oracle_DBA
Srimanta_Maji_Oracle_DBA
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
 
BigData as a Platform: Cassandra and Current Trends
BigData as a Platform: Cassandra and Current TrendsBigData as a Platform: Cassandra and Current Trends
BigData as a Platform: Cassandra and Current Trends
 
AquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks PresentationAquaQ Analytics Kx Event - Data Direct Networks Presentation
AquaQ Analytics Kx Event - Data Direct Networks Presentation
 
Running MySQL in AWS
Running MySQL in AWSRunning MySQL in AWS
Running MySQL in AWS
 
Scaling DataStax in Docker
Scaling DataStax in DockerScaling DataStax in Docker
Scaling DataStax in Docker
 
MinneBar 2013 - Scaling with Cassandra
MinneBar 2013 - Scaling with CassandraMinneBar 2013 - Scaling with Cassandra
MinneBar 2013 - Scaling with Cassandra
 
Managing 100s of PetaBytes of data in Cloud
Managing 100s of PetaBytes of data in CloudManaging 100s of PetaBytes of data in Cloud
Managing 100s of PetaBytes of data in Cloud
 

More from jbellis

Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptx
jbellis
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015
jbellis
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014
jbellis
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1jbellis
 
Tokyo cassandra conference 2014
Tokyo cassandra conference 2014Tokyo cassandra conference 2014
Tokyo cassandra conference 2014jbellis
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013jbellis
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynotejbellis
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012jbellis
 
Top five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionTop five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionjbellis
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012jbellis
 
Massively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache CassandraMassively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache Cassandrajbellis
 
Cassandra 1.1
Cassandra 1.1Cassandra 1.1
Cassandra 1.1jbellis
 
Pycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from JavaPycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from Javajbellis
 
Apache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterpriseApache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterprisejbellis
 
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)jbellis
 
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011jbellis
 
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)jbellis
 
What python can learn from java
What python can learn from javaWhat python can learn from java
What python can learn from javajbellis
 
State of Cassandra, 2011
State of Cassandra, 2011State of Cassandra, 2011
State of Cassandra, 2011jbellis
 
Brisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by CassandraBrisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by Cassandrajbellis
 

More from jbellis (20)

Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptx
 
Cassandra Summit 2015
Cassandra Summit 2015Cassandra Summit 2015
Cassandra Summit 2015
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1
 
Tokyo cassandra conference 2014
Tokyo cassandra conference 2014Tokyo cassandra conference 2014
Tokyo cassandra conference 2014
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynote
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012
 
Top five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solutionTop five questions to ask when choosing a big data solution
Top five questions to ask when choosing a big data solution
 
State of Cassandra 2012
State of Cassandra 2012State of Cassandra 2012
State of Cassandra 2012
 
Massively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache CassandraMassively Scalable NoSQL with Apache Cassandra
Massively Scalable NoSQL with Apache Cassandra
 
Cassandra 1.1
Cassandra 1.1Cassandra 1.1
Cassandra 1.1
 
Pycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from JavaPycon 2012 What Python can learn from Java
Pycon 2012 What Python can learn from Java
 
Apache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterpriseApache Cassandra: NoSQL in the enterprise
Apache Cassandra: NoSQL in the enterprise
 
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)
Dealing with JVM limitations in Apache Cassandra (Fosdem 2012)
 
Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011Cassandra at High Performance Transaction Systems 2011
Cassandra at High Performance Transaction Systems 2011
 
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
Cassandra 1.0 and the future of big data (Cassandra Tokyo 2011)
 
What python can learn from java
What python can learn from javaWhat python can learn from java
What python can learn from java
 
State of Cassandra, 2011
State of Cassandra, 2011State of Cassandra, 2011
State of Cassandra, 2011
 
Brisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by CassandraBrisk: more powerful Hadoop powered by Cassandra
Brisk: more powerful Hadoop powered by Cassandra
 

Recently uploaded

Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
ThomasParaiso2
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 

Recently uploaded (20)

Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 

Five Lessons in Distributed Databases

  • 1. Five Lessons in Distributed Databases Jonathan Ellis CTO, DataStax 1 © DataStax, All Rights Reserved. Confidential
  • 2. © DataStax, All Rights Reserved. 1. If it’s not SQL, it’s not a database
  • 3. © DataStax, All Rights Reserved. A brief history of NoSQL ● Early 2000s: people hit limits on vertical scaling, start sharding RDBMSes ● 2006, 2007: BigTable, Dynamo papers ● 2008-2010: Explosion of scale-out systems ○ Voldemort, Riak, Dynomite, FoundationDB, CouchDB ○ Cassandra, HBase, MongoDB
  • 4. © DataStax, All Rights Reserved. One small problem
  • 5. © DataStax, All Rights Reserved. Cassandra’s experience ● Thrift RPC “drivers” too low level ● Fragmented: Hector, Pelops, Astyanax ● Inconsistent across language ecosystems
  • 6. © DataStax, All Rights Reserved.
  • 7. © DataStax, All Rights Reserved. Solution: CQL ● 2011: Cassandra 0.8 introduces CQL 1.0 ● 2012: Cassandra 1.1 introduces CQL 3.0 ● 2013: Cassandra 1.2 adds collections
  • 8. © DataStax, All Rights Reserved. Today ● Cassandra: CQL ● CosmosDB: “SQL” ● Cloud Spanner: “SQL” ● Couchbase: N1QL ● HBase: Phoenix SQL (Java only) ● DynamoDB: REST/JSON ● MongoDB: BSON
  • 9. © DataStax, All Rights Reserved. 2. It takes 5+ years to build a database
  • 10. © DataStax, All Rights Reserved. Curt Monash Rule 1: Developing a good DBMS requires 5-7 years and tens of millions of dollars. That’s if things go extremely well. Rule 2: You aren’t an exception to Rule 1.
  • 11. © DataStax, All Rights Reserved. Aside: Mistakes I made starting DataStax ● Stayed at Rackspace too long ● Raised a $2.5M series A ● Waited a year to get serious about enterprise sales ● Changed the company name ● Brisk
  • 12. © DataStax, All Rights Reserved. Examples (Curt) ● Concurrent workloads benchmarked in the lab are poor predictors of concurrent performance in real life. ● Mixed workload management is harder than you’re assuming it is. ● Those minor edge cases in which your Version 1 product works poorly aren’t minor after all.
  • 13. © DataStax, All Rights Reserved. Examples (Cassandra) ● Hinted handoff ● Repair ● Counters ● Paxos ● Test suite
  • 14. © DataStax, All Rights Reserved. Aside: Fallout (Jepsen at Scale) ● Ensemble - A set of clusters that is brought up/torn down each test ○ Server Cluster - Cassandra/DSE ○ Client Cluster - Load Generators ○ Observer Cluster - Records live information from clusters (OpsCenter/Graphite) ○ Controller - Fallout ● Workload - The guts of the test ○ Phases - Run sequentially. Contains one or more modules that run in parallel for that phase ○ Checkers - Run after all phases and verify the data emitted from modules. ○ Artifact Checkers - Runs against collected artifacts to look for correctness/problems
  • 15. © DataStax, All Rights Reserved. A simple Fallout workload ensemble: server: node.count: 3 provisioner: name: local configuration_manager: name: ccm properties: cassandra.version: 3.0.0 client: server #use server cluster phases: - insert_workload: module: stress properties: iterations: 1m type: write rf: 3 gossip_updown: module: nodetool properties: command: disablegossip secondary.command: enablegossip sleep.seconds: 10 sleep.randomize: 20 - read_workload: module: stress properties: iterations: 1m type: read checkers: verify_success: checker: nofail 1. Start 3 node ccm cluster. 2. Insert data while bringing gossip on the nodes up and down. 3. Read/Check the data. 4. Verify none of the steps failed. Note: to move from ccm to ec2 we only need to change the ensemble section.
  • 16. © DataStax, All Rights Reserved. 5-7 years? ● Cassandra became Apache TLP in Feb 2010 ● 3.0 released Fall 2015 ● OSS is about adoption, not saving time/money
  • 17. © DataStax, All Rights Reserved. 3. The customer is always right
  • 18. © DataStax, All Rights Reserved. Example: sequential scans SELECT * FROM user_purchases WHERE purchase_date > 2000
  • 19. © DataStax, All Rights Reserved. What’s wrong with this query? For 100,000 purchases, nothing. For 100,000,000 purchases, you’ll crash the server (in 2012).
  • 20. © DataStax, All Rights Reserved. Solution (2012): ALLOW FILTERING SELECT * FROM user_purchases WHERE purchase_date > 2000 ALLOW FILTERING
  • 21. © DataStax, All Rights Reserved. Better solution (2013): Paging ● Build resultset incrementally and “page” it to the client
  • 22. © DataStax, All Rights Reserved. Example: tombstones INSERT INTO foo VALUES (1254, …) DELETE FROM foo WHERE id = 1254 … SELECT * FROM foo
  • 23. © DataStax, All Rights Reserved. Solution (2013) tombstone_warn_threshold: 1000 tombstone_failure_threshold: 100000
  • 24. © DataStax, All Rights Reserved. Better Solution (???): It’s complicated ● Track repair status to get rid of GCGS ● Bring time-to-repair from “days” to “hours” ● Optional: improve time-to-compaction
  • 25. © DataStax, All Rights Reserved. Example: joins ● CQL doesn’t support joins ● People still use client-side joins instead of denormalizing
  • 26. © DataStax, All Rights Reserved. Solution (2015-???): MV ● Make it easier to denormalize
  • 27. © DataStax, All Rights Reserved. Better solution (???): actually add joins ● Less controversial: shared partition joins ● More controversial: cross-partition ● CosmosDB, Spanner
  • 28. © DataStax, All Rights Reserved. A note on configurability
  • 29. © DataStax, All Rights Reserved. 4. Too much magic is a bad thing
  • 30. © DataStax, All Rights Reserved. Not (just) about vendors overpromising ● “Our database isn’t subject to the limits of the CAP theorem” ● “Our queue can guarantee exactly once delivery” ● “We’ll give you 99.99% uptime*”
  • 31. © DataStax, All Rights Reserved. Magic can be bad even when it works
  • 32. © DataStax, All Rights Reserved. Cloud Spanner analysis excerpt Spanner’s architecture implies that writes will be significantly slower than reads due to the need to coordinate across multiple replicas and avoid overlapping time bounds, and that is what we see in the original 2012 Spanner paper. … Besides write performance in isolation, because Spanner uses pessimistic locking to achieve ACID, reads are locked out of rows (partitions?) that are in the process of being updated. Thus, write performance challenges can spread to causing problems with reads as well.
  • 33. © DataStax, All Rights Reserved. Cloud Spanner
  • 34. © DataStax, All Rights Reserved. Auto-scaling in DynamoDB ● Request capacity tied to “partitions” [pp] ○ pp count = max (rc / 3000, wc / 1000, st / 10 GB) ● Subtle implication: capacity / pp decreases as storage volume increases ○ Non-uniform: pp request capacity halved when shard splits ● Subtle implication 2: bulk loads will wreck your planning
  • 35. © DataStax, All Rights Reserved. “Best practices for tables” ● Bulk load 200M items = 200 GB ● Target 60 minutes = 55,000 write capacity = 55 pps ● Post bulk load steady state ● 1000 req/s = 2 req/pp = 2 req/(3.6M items) ● No way to reduce partition count
  • 36. © DataStax, All Rights Reserved. Ravelin, 2017 You construct a table which uses a customer ID as partition key. You know your customer ID’s are unique and should be uniformly distributed across nodes. Your business has millions of customers and no single customer can do so many actions so quickly that the individual could create a hot key. Under this key you are storing around 2KB of data. This sounds reasonable. This will not work at scale in DynamoDb.
  • 37. © DataStax, All Rights Reserved. How much magic is too much? ● Joins: Apparently okay ● Auto-scaling: Apparently also okay ● Automatic partitioning: not okay ● Really slow ACID: not okay (?) ● Why? ● How do we make the system more transparent without inflicting an unnecessary level of detail on the user?
  • 38. © DataStax, All Rights Reserved. 5. It’s the cloud, stupid
  • 39. © DataStax, All Rights Reserved. September 2011
  • 40. © DataStax, All Rights Reserved. March 2012
  • 41. © DataStax, All Rights Reserved. March 2012
  • 42. © DataStax, All Rights Reserved. March 2012
  • 43. © DataStax, All Rights Reserved. The cloud is here. Now what?
  • 44. © DataStax, All Rights Reserved. Cloud-first architecture “The second trend will be the increased prevalence of shared-disk distributed DBMS. By “shared-disk” I mean a DBMS that uses a distributed storage layer as its primary storage location, such as HDFS or Amazon’s EBS/S3 services. This separates the DBMS’s storage layer from its execution nodes. Contrast this with a shared-nothing DBMS architecture where each execution node maintains its own storage.”
  • 45. © DataStax, All Rights Reserved. Cloud-first infrastructure ● What on-premises infrastructure can provide a cloud-like experience? ● Kubernetes? ● OpenStack?
  • 46. © DataStax, All Rights Reserved. Cloud-first development ● Is a yearly (bi-yearly?) release process the right cadence for companies building cloud services?
  • 47. © DataStax, All Rights Reserved. Cloud-first OSS ● What does OSS look like when you don’t work for the big three clouds? ● “Commons Clause” is an attempt to deal with this ○ (What about AGPL?)
  • 48. © DataStax, All Rights Reserved. Summary 1. If it’s not SQL, it’s not a database. 2. It takes 5+ years to build a database. 3. Listen to your users. 4. Too much magic is a bad thing. 5. It’s the cloud, stupid.