Addressing the High Cost of Apache Cassandra
Eyal Gutkind, VP of Solutions Engineering
Presenter
Eyal Gutkind
Eyal Gutkind is VP of Solutions at ScyllaDB. Prior to joining
ScyllaDB, Eyal held product management roles at Mirantis
and DataStax, and spent 12 years with Mellanox
Technologies in various engineering management and
product marketing roles. Eyal holds a BSc degree in
Electrical and Computer Engineering from Ben Gurion
University in Israel and an MBA from the Fuqua School of
Business at Duke University.
Agenda
+ Brief Scylla overview
+ Detailed benchmark comparisons with Cassandra and cost implications
+ The cost reduction of using large nodes
+ Storage cost benefits of Incremental Compaction Strategy
+ Using Workload Prioritization to support multiple workloads in a single cluster
About ScyllaDB
+ The Real-Time Big Data Database
+ Drop-in replacement for Apache Cassandra and Amazon DynamoDB
+ 10X the performance & low tail latency
+ Open Source, Enterprise and Cloud options
+ Founded by the creators of KVM hypervisor
+ HQs: Palo Alto, CA, USA; Herzelia, Israel; Warsaw, Poland
 
Don’t Take Our Word for It...

Ring                          Cassandra Node Count   Scylla Node Count   Datacenter Savings
Recordings Ring               432                    18                  65%
Reminders Ring                96                     18                  41%
Recordings Secondary Ring     70                     18                  65%
History Ring                  96                     6                   61%
Instruction and Lookup Ring   268                    18                  45%
Total                         962                    78                  53%

Pre Scylla Node Count: 962 (m4.2xlarge)
Post Scylla Node Count: 78 (i3.4xlarge & i3.8xlarge)
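
As a quick check on the node counts in this slide, the per-ring figures can be summed and compared. This is only a sketch of the arithmetic: the numbers are copied from the slide, while the names `rings`, `node_totals` and `reduction_factor` are illustrative. The Datacenter Savings percentages are not recomputed here, since they presumably depend on instance pricing that the slide does not list.

```python
# Node counts per ring, copied from the slide: (Cassandra nodes, Scylla nodes).
rings = {
    "Recordings Ring": (432, 18),
    "Reminders Ring": (96, 18),
    "Recordings Secondary Ring": (70, 18),
    "History Ring": (96, 6),
    "Instruction and Lookup Ring": (268, 18),
}


def node_totals(ring_counts):
    """Return (total Cassandra nodes, total Scylla nodes) across all rings."""
    cassandra_total = sum(cassandra for cassandra, _ in ring_counts.values())
    scylla_total = sum(scylla for _, scylla in ring_counts.values())
    return cassandra_total, scylla_total


cassandra_total, scylla_total = node_totals(rings)
reduction_factor = cassandra_total / scylla_total

print(f"{cassandra_total} -> {scylla_total} nodes "
      f"({reduction_factor:.1f}x fewer)")
```

The totals match the 962 and 78 shown on the slide, which works out to roughly a 12x reduction in node count.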
Poll Questions
The High Cost of Node Sprawl
+ Heavy administration
+ More things to fail
+ Expensive
+ Complex
“Just Add More Nodes”
…..
$900k/yr Datacenter Savings
Rules:
+ Need to meet an SLA of 100k/200k/300k ops/sec at P99 latency < 10 ms
+ Use as little hardware as possible
+ Hardware chosen to suit each database
+ More details at https://www.scylladb.com/product/benchmarks/aws-i3-metal-benchmark/
+ 4 x i3.metal cost: $112,100
+ 40 x i3.4xlarge cost: $278,560
[Benchmark charts: 4 i3.metal Scylla nodes vs. 40 i3.4xl Cassandra 3.11 nodes]
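
The two hardware costs listed in the rules can be compared directly. The sketch below uses only the two dollar figures and node counts from the slide; the variable names are illustrative, and the billing period behind the figures is not stated in the slide, so none is assumed.

```python
# Hardware costs copied from the slide (billing period not stated there).
scylla_cost = 112_100      # 4 x i3.metal
cassandra_cost = 278_560   # 40 x i3.4xlarge

# How many times more the Cassandra cluster costs than the Scylla cluster.
cost_ratio = cassandra_cost / scylla_cost

# Absolute difference and the saving as a share of the Cassandra cost.
savings = cassandra_cost - scylla_cost
savings_pct = 100 * savings / cassandra_cost

# Ratio of node counts (40 Cassandra nodes vs. 4 Scylla nodes).
node_ratio = 40 / 4

print(f"Cost ratio: {cost_ratio:.2f}x")
print(f"Savings: ${savings:,} ({savings_pct:.0f}%)")
print(f"Node ratio: {node_ratio:.0f}x")
```

The cost ratio comes out just under 2.5, which is consistent with the "2.5X" figure quoted in the benchmark summary.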
100% CPU != alert
Compaction != problem
Knowledge == Power
Summary:
+ Scylla's cost is 2.5X lower
+ 10X reduction in administration overhead
+ Scylla’s P99 latency was up to 45X better
+ Cassandra could not meet the SLA in the 200k and 300k cases
+ Scylla provides 10X higher uptime (MTTF)
+ Scylla is automatically tuned
Are availability and real-time performance important to your business?
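
One possible reading of the 10X MTTF figure, which the slide does not spell out, is that it follows from the node counts alone. Assuming every node fails independently with the same exponentially distributed lifetime (an assumption made here, not a claim from the benchmark), the expected time until the first node failure in a cluster is the per-node MTTF divided by the number of nodes. The per-node value below is hypothetical and cancels out of the ratio.

```python
def time_to_first_failure(node_mttf_hours, node_count):
    """Expected time until the first node failure in a cluster.

    Assumes independent nodes with identical, exponentially distributed
    lifetimes, so the cluster-wide failure rate is node_count / node_mttf.
    """
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return node_mttf_hours / node_count


NODE_MTTF_HOURS = 50_000  # hypothetical per-node MTTF, same for both clusters

scylla_hours = time_to_first_failure(NODE_MTTF_HOURS, 4)      # 4 i3.metal nodes
cassandra_hours = time_to_first_failure(NODE_MTTF_HOURS, 40)  # 40 i3.4xl nodes
mttf_ratio = scylla_hours / cassandra_hours

print(f"Scylla cluster: first failure expected after {scylla_hours:,.0f} h")
print(f"Cassandra cluster: first failure expected after {cassandra_hours:,.0f} h")
print(f"Ratio: {mttf_ratio:.0f}x")
```

Under these assumptions the ratio depends only on the node counts, so a cluster with one tenth of the nodes sees a node failure one tenth as often.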
I3en 60TB Node
+ Reduce the number of nodes by using large nodes
+ Increase storage size when needed, not the node count
+ Node replacement is not an issue when your system has the resources to handle the streaming
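
To illustrate why larger nodes reduce node count, here is a rough sizing sketch. Only the 60 TB node capacity comes from the slide; the dataset size, replication factor, utilization ceiling, the 4 TB comparison node and the function name `nodes_needed` are all hypothetical values chosen for the example. Real sizing also has to account for throughput, not just storage.

```python
import math


def nodes_needed(dataset_tb, replication_factor, node_capacity_tb, max_utilization):
    """Estimate the node count needed to store a dataset, by storage alone."""
    if not 0 < max_utilization <= 1:
        raise ValueError("max_utilization must be in (0, 1]")
    # Total bytes on disk across the cluster, including all replicas.
    raw_tb = dataset_tb * replication_factor
    # Disk space per node that we allow ourselves to fill.
    usable_per_node_tb = node_capacity_tb * max_utilization
    # Never go below one node per replica.
    return max(replication_factor, math.ceil(raw_tb / usable_per_node_tb))


# Hypothetical workload: 100 TB of data, replication factor 3, disks at most 50% full.
large_nodes = nodes_needed(100, 3, node_capacity_tb=60, max_utilization=0.5)
small_nodes = nodes_needed(100, 3, node_capacity_tb=4, max_utilization=0.5)

print(f"60 TB nodes: {large_nodes}")
print(f" 4 TB nodes: {small_nodes}")
```

With these example numbers the same dataset fits on 10 large nodes instead of 150 small ones.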
ICS - 37% Savings and More
[Chart annotation: major compaction starts at this point]
+ Available in Scylla Enterprise
+ Helps increase storage utilization
+ Reduces compaction workload
Workload Prioritization
+ Combine workloads into one cluster
+ Reduce latency penalties
+ Increase CPU and cluster utilization
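
One common way to let several workloads share a cluster is proportional sharing: each workload gets a weight, and under contention it receives resources in proportion to that weight. The sketch below illustrates that general idea only; it is not Scylla's scheduler, and the workload names, share values and the function `cpu_fractions` are hypothetical.

```python
def cpu_fractions(shares):
    """Split CPU time between workloads in proportion to their shares.

    Models the fully contended case only: when a workload is idle, a real
    scheduler would let the others use its unused capacity.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one workload needs a positive share")
    return {name: value / total for name, value in shares.items()}


# Hypothetical weights: a latency-sensitive workload and a batch workload.
workloads = {"oltp": 1000, "analytics": 250}
fractions = cpu_fractions(workloads)

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.0%} of CPU under contention")
```

In this example the latency-sensitive workload keeps 80% of the CPU when both are busy, which is how a single cluster can host a batch workload without a large latency penalty.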
We Will Use Cloud Vendors’ DBaaS
Scylla Cloud vs. C* DBaaS Solutions

                               AWS CMS       Azure Cosmos   Scylla Cloud
Storage Cost [$/month/GB]      0.3           0.25           -
Unit Read Cost [$]             0.1095        5.84           -
Unit Write Cost [$]            0.5475        5.84           -
Total Storage Cost [$/month]   $614.40       $343.04        -
Total Write Cost [$/month]     $32,850.00    $34,432.64     -
Total Read Cost [$/month]      $54,750.00    $34,041.36     -
Total [$/month]                $88,214       $68,817        $9,450

Scylla Cloud: hassle free, 3 x i3.8xlarge instances
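
The monthly totals in the comparison can be re-derived from the storage, write and read lines. This sketch uses only the dollar figures from the slide; the dictionary layout and variable names are illustrative.

```python
# Monthly cost components copied from the slide, in US dollars.
monthly_costs = {
    "AWS CMS": {"storage": 614.40, "write": 32_850.00, "read": 54_750.00},
    "Azure Cosmos": {"storage": 343.04, "write": 34_432.64, "read": 34_041.36},
}
SCYLLA_CLOUD_TOTAL = 9_450.00  # 3 x i3.8xlarge instances, from the slide

# Sum the components for each service.
totals = {name: sum(parts.values()) for name, parts in monthly_costs.items()}

# Express each total as a multiple of the Scylla Cloud figure.
ratios = {name: total / SCYLLA_CLOUD_TOTAL for name, total in totals.items()}

for name in totals:
    print(f"{name}: ${totals[name]:,.0f}/month "
          f"({ratios[name]:.1f}x Scylla Cloud)")
```

The sums reproduce the $88,214 and $68,817 totals on the slide, roughly 9x and 7x the Scylla Cloud figure.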
Summary
+ We have shown how users like Comcast and Kiwi are able to save big
+ Benchmarks show time after time that we provide 10X reductions in the number of nodes
+ Using large nodes is not an issue when the system has the right resources
+ Compactions are smoother when using ICS
+ Using advanced features such as workload prioritization helps users combine workloads into fewer clusters, saving them money and operational resources
+ With Scylla you are able to reduce TCO
Poll Question
Q&A
Eyal Gutkind
VP of Solutions
Book a session with me
+ If you are interested in evaluating your current workloads to learn how you can save more, you can sign up for a Technical Evaluation session with me.
Link: https://www.scylladb.com/product/technical-consultation/
+ Or email me directly at eyal@scylladb.com if you have any questions.
We will send you these links via email along with the session recording.
eyal@scylladb.com
Stay in touch
United States
545 Faber Place
Palo Alto, CA 94303
Israel
11 Galgalei Haplada
Herzelia, Israel
www.scylladb.com
@scylladb
Thank you
Poll Question
Poll Question #1
If you are using C* today, what is your biggest challenge?
A. Operational complexity
B. Finding talent/consultants to manage the cluster
C. Cost of support contracts/consultants
D. I have no issues with my C* deployment
Poll Question #2
What is your next database deployment platform?
A. On-premises, self-managed
B. Cloud, self-managed
C. Database as a service
D. No plans for new database deployment
Poll Question #3
What was your expectation from this session?
A. It was technical enough
B. Expected it to be more technical
C. Was expecting more of a business perspective
