© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Jim Plush Sr Director of Engineering, CrowdStrike
Dennis Opacki, Sr Cloud Systems Architect, CrowdStrike
October 2015
BDT323
Amazon EBS and Cassandra
1 Million Writes Per Second on 60 Nodes
An Introduction
to CrowdStrike
We Are a Cybersecurity Technology Company
We Detect, Prevent And Respond To All Attack Types In Real Time,
Protecting Organizations From Catastrophic Breaches
We Provide Next Generation Endpoint Protection, Threat Intelligence & Pre & Post IR Services
http://www.crowdstrike.com/introduction-to-crowdstrike-falcon-host/
CrowdStrike Scale
• Cloud-based endpoint protection
• Single customer can generate > 2 TB daily
• 500K+ events per second
• Multi-petabytes of managed data
© 2015. All Rights Reserved.
Truisms???
• HTTPS is too slow to run everywhere
• All you need is anti-virus
• Never run Cassandra on Amazon EBS
What is Amazon EBS?
[Diagram: an EC2 instance with two EBS data volumes mounted at /mnt/foo and /mnt/bar]
• Network mounted hard drive
• Ability to snapshot data
• Data encryption at rest & in flight
Existing Amazon EBS Assumptions
• Jittery I/O, a.k.a. noisy neighbors
• Single point of failure in a region
• Cost is too damn high
• Bad volumes (dd and destroy)
A Recent Project: Initial Requirements
• 1PB of incoming event data from millions of devices
• Modeled as a graph
• 1 million writes per second (burst)
• Age data out after x days
• 95% write 5% read
We Tried
• Cassandra + Titan
• Sharding?
• Neo4J
• PostgreSQL, MySQL, SQLite
• LevelDB/RocksDB
We Have to Make This Work
Cassandra had the properties we needed
Time for a new approach?
http://techblog.netflix.com/2014/07/revisiting-1-million-writes-per-second.html
Number of Machines for 1PB
[Bar chart: number of machines needed for 1PB (y-axis 0 to 2,250), i2.xlarge vs. c4.2XL with EBS]
Yearly Cost for 1PB Cluster
[Bar chart: yearly cost in millions of $ (y-axis 0 to 16) for i2.xlarge on demand, i2.xlarge reserved, c4.2xl on demand, and c4.2xl reserved; chart annotation: "With Amazon EBS"]
Initial Launch
Date Tiered Compaction
…more details by Jeff Jirsa, CrowdStrike
Cassandra Summit 2015 - DTCS
http://www.slideshare.net/JeffJirsa1/cassandra-summit-2015-real-world-dtcs-for-operators
Initial Launch
• Cassandra 2.0.12 (DSE)
• m3.2xlarge 8 core
• Single 4TB EBS GP2 ~10,000 IOPS
• Default tunings
Performance Was Terrible
• 12 node cluster
• ~60K writes per second RF2
• ~10K writes per 8 core box
• We went to the experts
Cassandra Summit 2014
Family Search asked the same question: Where’s the bottleneck?
https://www.youtube.com/watch?v=Qfzg7gcSK-g
IOPS Available
[Bar chart: IOPS available (y-axis 0 to 50,000), i2.xlarge vs. c4.2xlarge]
1.3K IOPS?
IOPS: I see you there, but I can’t reach you!
The magic gates opened…
We hit 1 million writes per second at RF3 on 60 nodes
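The per-node figures in this deck follow from the replication factor: each client write is applied on RF replicas. Below is a minimal sketch of that arithmetic, assuming writes spread evenly across nodes. The function name is illustrative and does not come from the talk.

```python
def local_writes_per_node(client_writes_per_sec, replication_factor, nodes):
    """Replica-level writes each node absorbs, assuming an even spread."""
    total_local = client_writes_per_sec * replication_factor
    return total_local / nodes

# Initial launch: ~60K client writes/s at RF2 on 12 nodes
initial = local_writes_per_node(60_000, 2, 12)
# Tuned cluster: 1 million client writes/s at RF3 on 60 nodes
tuned = local_writes_per_node(1_000_000, 3, 60)

print(f"initial: {initial:,.0f} local writes/s per node")
print(f"tuned:   {tuned:,.0f} local writes/s per node")
```

Under that assumption, the initial cluster absorbed roughly 10K local writes per node (in line with the "~10K writes per 8 core box" figure), and the tuned cluster roughly 50K per node.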
Testing Setup
Testing Methodology
• Each test run: clean C* instances, old test keyspaces dropped
• 13+ TB of data loaded during read testing
• 20 C4.4XL stress writers, each with its own 1-billion-key sequence
Cluster Topology
[Diagram of the test cluster:]
• Stress nodes: 10 instances in AZ 1A, 10 instances in AZ 1B
• C* nodes: 20 in AZ 1A, 20 in AZ 1B, 20 in AZ 1C
• OpsCenter
• Amazon EBS
Cassandra Stress 2.1.x
bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml "ops(insert=1)" no-warmup \
  -pop seq=1..1000000000 -mode native cql3 -node 10.10.10.XX -rate threads=1000 -errors ignore
PCSTAT - Al Tobey
http://www.datastax.com/dev/blog/compaction-improvements-in-cassandra-21
https://github.com/tobert/pcstat
Netflix Test - What is C* capable of?
• 1+ million writes per second at RF3 (3+ million local writes per second). NICE!
• No dropped mutations, system healthy at 1.1M after 50 mins
• I/O util is not pegged; commit disk = steady!
• Low I/O wait
• 95th latency = reasonable
Netflix Test - Read Fail
compression={'chunk_length_kb': '64', 'sstable_compression': 'LZ4Compressor'}
https://issues.apache.org/jira/browse/CASSANDRA-10249
https://issues.apache.org/jira/browse/CASSANDRA-8894
Data drive pegged
Reading Data
• 24-hour read test
• Over 10 TB of data in the CF
• Sustained > 350K reads per second over 24 hours
• 1M reads per second peak
• CL ONE
• 12 C4.4XL stress boxes
• Not pegged
• 7.2ms 95th latency
VS Netflix Blog Post
180 fewer cores (45 fewer i2.xlarge instances)
• C4.4XL vs. i2.xlarge
24-hour test (sans data transfer cost)
• Netflix cluster/stress: ~$6300 (285 i2.xlarge at $0.85 per hour)
• CrowdStrike cluster/stress with Amazon EBS cost: ~$2600 (60 C4.4XL at $0.88 per hour)
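The comparison above can be partly reproduced from the listed instance counts and hourly rates. The sketch below covers only the Cassandra cluster instances for 24 hours. It assumes 4 vCPUs per i2.xlarge and 16 per C4.4XL, and it leaves out stress nodes, EBS volumes, and data transfer, so its totals come out lower than the ~$6300 and ~$2600 quoted on the slide, which presumably include those extras.

```python
HOURS = 24

def instance_cost(count, hourly_rate, hours=HOURS):
    """On-demand instance cost only; excludes stress nodes, EBS, and data transfer."""
    return count * hourly_rate * hours

netflix = instance_cost(285, 0.85)       # 285 i2.xlarge
crowdstrike = instance_cost(60, 0.88)    # 60 C4.4XL

# Assumed core counts: 4 vCPUs per i2.xlarge, 16 vCPUs per C4.4XL
core_gap = 285 * 4 - 60 * 16

print(f"Netflix cluster instances:     ${netflix:,.0f}")
print(f"CrowdStrike cluster instances: ${crowdstrike:,.0f}")
print(f"Core difference: {core_gap} ({core_gap // 4} i2.xlarge equivalents)")
```

The core difference works out to 180 cores, or 45 i2.xlarge instances, matching the headline on the slide.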
Read Notes with Amazon EBS
• Our test was a single 10K IOPS volume
• More/bigger reads?
• PIOPS gives you as much throughput as you need
• RAID0 multiple Amazon EBS volumes
[Diagram: an EC2 instance with two EBS data volumes mounted at /mnt/foo and /mnt/bar]
What Unlocked Performance
Major Tweaks
• Ubuntu HVM types
• Enhanced networking
• Now faster than PV
• Ubuntu distro tuned for cloud workloads
• XFS Filesystem
Major Tweaks
• Cassandra 2.1
• Java 8
• G1 Garbage Collector
https://issues.apache.org/jira/browse/CASSANDRA-7486
Major Tweaks
• C4.4XL 16 core, EBS Optimized
• 4TB, 10,000 IOPS EBS GP2 Encrypted Data Drive
• 160MB/s throughput
• 1TB 3000 IOPS EBS GP2 Encrypted Commit Log Drive
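The IOPS figures for the two volumes are consistent with the GP2 baseline rule of 3 IOPS per provisioned GB, with a floor of 100. The sketch below applies that rule with the 10,000 IOPS per-volume cap cited in this deck. It treats sizes as round GB values, which is a simplifying assumption, and AWS limits have changed since 2015, so current documentation should be checked before reusing the constants.

```python
GP2_IOPS_PER_GB = 3          # gp2 baseline: 3 IOPS per provisioned GB
GP2_MIN_IOPS = 100           # floor for small gp2 volumes
GP2_MAX_IOPS_2015 = 10_000   # per-volume cap as cited in this 2015 deck

def gp2_baseline_iops(size_gb):
    """Baseline IOPS for a gp2 volume of the given size, using the 2015 cap."""
    return max(GP2_MIN_IOPS, min(size_gb * GP2_IOPS_PER_GB, GP2_MAX_IOPS_2015))

data_drive = gp2_baseline_iops(4000)    # 4TB data drive
commit_log = gp2_baseline_iops(1000)    # 1TB commit log drive

print(data_drive, commit_log)
```

The 4TB volume hits the cap, which is why it is listed at 10,000 IOPS rather than 12,000.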
Major Tweaks
cassandra-env.sh
• MAX_HEAP_SIZE=8G
• JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
• Lots of other minor tweaks in crowdstrike-tools
cassandra-env.sh
Put PID in batch mode
Mask CPU0 from the process to reduce context switching
Magic From Al Tobey
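The script shown on this slide is not in the transcript. As a stand-in, here is a hedged Python sketch of the two ideas named above, using Linux-only calls from the standard `os` module. The function names and the example PID are hypothetical, and this is not the code from cassandra-env.sh or from Al Tobey's guide.

```python
import os

def cpus_without_cpu0(available):
    """Return the allowed CPU set minus CPU 0; keep CPU 0 if it is the only one."""
    remaining = set(available) - {0}
    return remaining or set(available)

def tune_process(pid):
    """Linux only: put every thread of pid in SCHED_BATCH and keep it off CPU 0.

    Scheduling policy and affinity are per-thread on Linux, so this walks
    /proc/<pid>/task. Threads that exit mid-loop would raise; a real script
    should handle that.
    """
    for tid in map(int, os.listdir(f"/proc/{pid}/task")):
        os.sched_setscheduler(tid, os.SCHED_BATCH, os.sched_param(0))
        os.sched_setaffinity(tid, cpus_without_cpu0(os.sched_getaffinity(tid)))

# Example (hypothetical PID of the Cassandra JVM; may require root):
# tune_process(12345)
```

Batch scheduling tells the kernel the process is CPU-bound and non-interactive, and steering it away from CPU 0 keeps it off a core that often services interrupts.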
YAML Settings
cassandra.yaml (based on 16 core)
• concurrent_reads: 32
• concurrent_writes: 64
• memtable_flush_writers: 8
• trickle_fsync: true
• trickle_fsync_interval_in_kb: 1000
• native_transport_max_threads: 256
• concurrent_compactors: 4
cassandra.yaml
We found that a good portion of the CPU load was going to internode compression, which reduced write throughput
internode_compression: none
Lessons Learned
• Amazon EBS was never the bottleneck during testing, GP2 is legit
• Built-in types like list and map come at a performance penalty
• 30% hit on our writes using Map type
• DTCS is very young (see Jeff Jirsa’s talk)
• 2.1 Stress Tool is tricky but great for modeling workloads
• How will compression affect your read path?
Test Your Own!
https://github.com/CrowdStrike/cassandra-tools
It’s Just Python
launch 20 nodes in us-east-1
• python launch.py launch --nodes=20 --config=c4-ebs-hvm --az=us-east-1a
bootstrap the new nodes with C*, RAID/Format disks, etc…
• fab -u ubuntu bootstrapcass21:config=c4-highperf
run arbitrary commands
• fab -u ubuntu cmd:config=c4-highperf,cmd="sudo rm -rf /mnt/cassandra/data/summit_stress"
Run Custom Stress Profiles… Multi-Node Support
On the first stress node (export NODENUM=1):
ubuntu@ip-10-10-10.XX:~$ python runstress.py --profile=stress10 --seednode=10.10.10.XX --threads=50
Going to run: /home/ubuntu/apache-cassandra-2.1.5/tools/bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops(insert=1,simple=9) no-warmup -pop seq=1..1000000000 -mode native cql3 -node 10.10.10.XX -rate threads=50 -errors ignore
On the second stress node (export NODENUM=2):
ubuntu@ip-10-10-10.XX:~$ python runstress.py --profile=stress10 --seednode=10.10.10.XX --threads=50
Going to run: /home/ubuntu/apache-cassandra-2.1.5/tools/bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops(insert=1,simple=9) no-warmup -pop seq=1000000001..2000000000 -mode native cql3 -node 10.10.10.XX -rate threads=50 -errors ignore
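The two runs differ only in the `-pop seq=` range, which lines up with the NODENUM value on each stress node. How runstress.py derives the range is not shown in the slides; the sketch below is one plausible mapping, with a hypothetical function name, that reproduces the two ranges above.

```python
KEYS_PER_NODE = 1_000_000_000  # each stress writer gets its own 1-billion-key range

def seq_range(node_num, keys_per_node=KEYS_PER_NODE):
    """Return the cassandra-stress population range for a 1-based node number."""
    if node_num < 1:
        raise ValueError("node_num must be >= 1")
    start = (node_num - 1) * keys_per_node + 1
    end = node_num * keys_per_node
    return f"seq={start}..{end}"

for node_num in (1, 2):
    print(f"NODENUM={node_num}: -pop {seq_range(node_num)}")
```

Giving each writer a disjoint range keeps the stress nodes from overwriting each other's partitions, so every insert adds new data.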
Where Are We Today?
• ~3 months on our Amazon EBS-based cluster
• Hundreds of TBs of graph data and growing in C*
• Billions of vertices/edges
• Changing perceptions?
• DataStax - Planning an Amazon EC2 cluster
Resources
Al Tobey’s Tuning Guide for Cassandra 2.1
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html
Special Thanks To
• Leif Jackson
• Marcus King
• Alan Hannan
• Jeff Jirsa
• Al Tobey
• Nick Panahi
• J.B. Langston
• Marcus Eriksson
• Iian Finlayson
• Dani Traphagen
Amazon EBS Heading Into 2016
4TB (10k IOPS) GP2
I/O hit? Not enough to faze C*
So why the hate for Amazon EBS?
Following the Crowd – Trust Issues
• Used instance-store image and ephemeral drives
• Painful to stop/start instances, resize
• Couldn’t avoid scheduled maintenance (i.e., Reboot-a-palooza)
• Encryption required shenanigans
Guess What
• We still had failures
• Now we get to rebuild from scratch
Amazon EBS’s Troubled Childhood
What do you mean my volume is “stuck”?
• April 2011 – Netflix, Reddit, and Quora
• October 2012 – Reddit, Imgur, Heroku
• August 2013 – Vine, Airbnb
Kiss of Death
http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.html
• Spread services across multiple regions
• Test failure scenarios regularly (Chaos Monkey)
• Make Cassandra databases more resilient by avoiding Amazon EBS
Redemption
Amazon moves quickly and quietly:
• March 2011 – New Amazon EBS GM
• July 2012 – Provisioned IOPS
• May 2014 – Native encryption
• Jun 2014 – GP2 (game changer)
• Mar 2015 – 16TB / 10K GP2 / 20K PIOPS
Redemption
• Prioritized Amazon EBS availability and consistency beyond features and functionality
• Compartmentalized the control plane – removed cross-AZ dependencies for running volumes
• Simplified workflows to favor sustained operation
• Tested and simulated via TLA+/PlusCal - better understood corner cases
• Dedicated a large fraction of engineering resources to reliability and performance
Reliability
Amazon EBS team targets 99.999% availability
Exceeding expectations
CrowdStrike Today
• In past 12 months, zero Amazon EBS-related failures
• Thousands of GP2 data volumes (~2PB data)
• Transitioning all systems to Amazon EBS root drives
• Moved all data stores to Amazon EBS (C*, Kafka, Elasticsearch, Postgres, etc.)
Staying Safe - Architecture
• Select a region with >2 AZs (e.g., us-east-1 or us-west-2)
• Use Amazon EBS GP2 or PIOPS storage
• Separate volumes for data and commit logs
Staying Safe - Ops
• Use Amazon EBS volume monitoring
• Pre-warm Amazon EBS volumes?
• Schedule snapshots for consistent backups
Most Importantly
• Challenge assumptions
• Stay current on AWS blog
• Talk with your peers
http://aws.amazon.com/ebs/nosql/
Remember to complete
your evaluations!
BDT323
Thank you!
@jimplush
@opacki

More Related Content

What's hot

debugging openstack neutron /w openvswitch
debugging openstack neutron /w openvswitchdebugging openstack neutron /w openvswitch
debugging openstack neutron /w openvswitch어형 이
 
Using the KVMhypervisor in CloudStack
Using the KVMhypervisor in CloudStackUsing the KVMhypervisor in CloudStack
Using the KVMhypervisor in CloudStackShapeBlue
 
Pulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platformPulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platformMatteo Merli
 
OpenStack Quantum Intro (OS Meetup 3-26-12)
OpenStack Quantum Intro (OS Meetup 3-26-12)OpenStack Quantum Intro (OS Meetup 3-26-12)
OpenStack Quantum Intro (OS Meetup 3-26-12)Dan Wendlandt
 
Windows MSCS 운영 및 기타 설치 가이드
Windows MSCS 운영 및 기타 설치 가이드Windows MSCS 운영 및 기타 설치 가이드
Windows MSCS 운영 및 기타 설치 가이드CheolHee Han
 
Building a redundant CloudStack management cluster - Vladimir Melnik
Building a redundant CloudStack management cluster - Vladimir MelnikBuilding a redundant CloudStack management cluster - Vladimir Melnik
Building a redundant CloudStack management cluster - Vladimir MelnikShapeBlue
 
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...confluent
 
Disk health prediction for Ceph
Disk health prediction for CephDisk health prediction for Ceph
Disk health prediction for CephCeph Community
 
Ceph Tech Talk -- Ceph Benchmarking Tool
Ceph Tech Talk -- Ceph Benchmarking ToolCeph Tech Talk -- Ceph Benchmarking Tool
Ceph Tech Talk -- Ceph Benchmarking ToolCeph Community
 
MariaDB ColumnStore
MariaDB ColumnStoreMariaDB ColumnStore
MariaDB ColumnStoreMariaDB plc
 
VM Autoscaling With CloudStack VR As Network Provider
VM Autoscaling With CloudStack VR As Network ProviderVM Autoscaling With CloudStack VR As Network Provider
VM Autoscaling With CloudStack VR As Network ProviderShapeBlue
 
Domino Server Health - Monitoring and Managing
 Domino Server Health - Monitoring and Managing Domino Server Health - Monitoring and Managing
Domino Server Health - Monitoring and ManagingGabriella Davis
 
Storage Capacity Management on Multi-tenant Kafka Cluster with Nurettin Omeroglu
Storage Capacity Management on Multi-tenant Kafka Cluster with Nurettin OmerogluStorage Capacity Management on Multi-tenant Kafka Cluster with Nurettin Omeroglu
Storage Capacity Management on Multi-tenant Kafka Cluster with Nurettin OmerogluHostedbyConfluent
 
Visualizing Kafka Security
Visualizing Kafka SecurityVisualizing Kafka Security
Visualizing Kafka SecurityDataWorks Summit
 
Namespaces and cgroups - the basis of Linux containers
Namespaces and cgroups - the basis of Linux containersNamespaces and cgroups - the basis of Linux containers
Namespaces and cgroups - the basis of Linux containersKernel TLV
 
Cloudstack for beginners
Cloudstack for beginnersCloudstack for beginners
Cloudstack for beginnersJoseph Amirani
 
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...Umair Shahid
 
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...DataStax
 
Cloud computing and OpenStack
Cloud computing and OpenStackCloud computing and OpenStack
Cloud computing and OpenStackEdgar Magana
 

What's hot (20)

debugging openstack neutron /w openvswitch
debugging openstack neutron /w openvswitchdebugging openstack neutron /w openvswitch
debugging openstack neutron /w openvswitch
 
Using the KVMhypervisor in CloudStack
Using the KVMhypervisor in CloudStackUsing the KVMhypervisor in CloudStack
Using the KVMhypervisor in CloudStack
 
Pulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platformPulsar - Distributed pub/sub platform
Pulsar - Distributed pub/sub platform
 
OpenStack Quantum Intro (OS Meetup 3-26-12)
OpenStack Quantum Intro (OS Meetup 3-26-12)OpenStack Quantum Intro (OS Meetup 3-26-12)
OpenStack Quantum Intro (OS Meetup 3-26-12)
 
Windows MSCS 운영 및 기타 설치 가이드
Windows MSCS 운영 및 기타 설치 가이드Windows MSCS 운영 및 기타 설치 가이드
Windows MSCS 운영 및 기타 설치 가이드
 
Building a redundant CloudStack management cluster - Vladimir Melnik
Building a redundant CloudStack management cluster - Vladimir MelnikBuilding a redundant CloudStack management cluster - Vladimir Melnik
Building a redundant CloudStack management cluster - Vladimir Melnik
 
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...
Achieving a 50% Reduction in Cross-AZ Network Costs from Kafka (Uday Sagar Si...
 
Disk health prediction for Ceph
Disk health prediction for CephDisk health prediction for Ceph
Disk health prediction for Ceph
 
Ceph Tech Talk -- Ceph Benchmarking Tool
Ceph Tech Talk -- Ceph Benchmarking ToolCeph Tech Talk -- Ceph Benchmarking Tool
Ceph Tech Talk -- Ceph Benchmarking Tool
 
MariaDB ColumnStore
MariaDB ColumnStoreMariaDB ColumnStore
MariaDB ColumnStore
 
VM Autoscaling With CloudStack VR As Network Provider
VM Autoscaling With CloudStack VR As Network ProviderVM Autoscaling With CloudStack VR As Network Provider
VM Autoscaling With CloudStack VR As Network Provider
 
Domino Server Health - Monitoring and Managing
 Domino Server Health - Monitoring and Managing Domino Server Health - Monitoring and Managing
Domino Server Health - Monitoring and Managing
 
Storage Capacity Management on Multi-tenant Kafka Cluster with Nurettin Omeroglu
Storage Capacity Management on Multi-tenant Kafka Cluster with Nurettin OmerogluStorage Capacity Management on Multi-tenant Kafka Cluster with Nurettin Omeroglu
Storage Capacity Management on Multi-tenant Kafka Cluster with Nurettin Omeroglu
 
Visualizing Kafka Security
Visualizing Kafka SecurityVisualizing Kafka Security
Visualizing Kafka Security
 
Namespaces and cgroups - the basis of Linux containers
Namespaces and cgroups - the basis of Linux containersNamespaces and cgroups - the basis of Linux containers
Namespaces and cgroups - the basis of Linux containers
 
Cloudstack for beginners
Cloudstack for beginnersCloudstack for beginners
Cloudstack for beginners
 
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...
20230511 - PGConf Nepal - Clustering in PostgreSQL_ Because one database serv...
 
Introduce to Terraform
Introduce to TerraformIntroduce to Terraform
Introduce to Terraform
 
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
 
Cloud computing and OpenStack
Cloud computing and OpenStackCloud computing and OpenStack
Cloud computing and OpenStack
 

Viewers also liked

Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Dave Gardner
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compactionMIJIN AN
 
Securing Cassandra for Compliance
Securing Cassandra for ComplianceSecuring Cassandra for Compliance
Securing Cassandra for ComplianceDataStax
 
PostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLPostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLCockroachDB
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detailMIJIN AN
 
Maximizing Amazon EC2 and Amazon EBS performance
Maximizing Amazon EC2 and Amazon EBS performanceMaximizing Amazon EC2 and Amazon EBS performance
Maximizing Amazon EC2 and Amazon EBS performanceAmazon Web Services
 
Securing Cassandra The Right Way
Securing Cassandra The Right WaySecuring Cassandra The Right Way
Securing Cassandra The Right WayDataStax Academy
 
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...Amazon Web Services
 
Deep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS PerformanceDeep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS PerformanceAmazon Web Services
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basicsnickmbailey
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookThe Hive
 
Kerberos, Token and Hadoop
Kerberos, Token and HadoopKerberos, Token and Hadoop
Kerberos, Token and HadoopKai Zheng
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax Academy
 
(SEC201) How Should We All Think About Security?
(SEC201) How Should We All Think About Security?(SEC201) How Should We All Think About Security?
(SEC201) How Should We All Think About Security?Amazon Web Services
 
(STG403) Amazon EBS: Designing for Performance
(STG403) Amazon EBS: Designing for Performance(STG403) Amazon EBS: Designing for Performance
(STG403) Amazon EBS: Designing for PerformanceAmazon Web Services
 
Amazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBS
Amazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBSAmazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBS
Amazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBSJean-Paul Azar
 

Viewers also liked (18)

Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2
 
Running Cassandra in AWS
Running Cassandra in AWSRunning Cassandra in AWS
Running Cassandra in AWS
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compaction
 
Securing Cassandra for Compliance
Securing Cassandra for ComplianceSecuring Cassandra for Compliance
Securing Cassandra for Compliance
 
PostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQLPostgreSQL and CockroachDB SQL
PostgreSQL and CockroachDB SQL
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
Maximizing Amazon EC2 and Amazon EBS performance
Maximizing Amazon EC2 and Amazon EBS performanceMaximizing Amazon EC2 and Amazon EBS performance
Maximizing Amazon EC2 and Amazon EBS performance
 
Securing Cassandra The Right Way
Securing Cassandra The Right WaySecuring Cassandra The Right Way
Securing Cassandra The Right Way
 
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...
AWS re:Invent 2016: Case Study: Librato's Experience Running Cassandra Using ...
 
Deep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS PerformanceDeep Dive: Maximizing EC2 and EBS Performance
Deep Dive: Maximizing EC2 and EBS Performance
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
 
Kerberos, Token and Hadoop
Kerberos, Token and HadoopKerberos, Token and Hadoop
Kerberos, Token and Hadoop
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The Sequel
 
(SEC201) How Should We All Think About Security?
(SEC201) How Should We All Think About Security?(SEC201) How Should We All Think About Security?
(SEC201) How Should We All Think About Security?
 
(STG403) Amazon EBS: Designing for Performance
(STG403) Amazon EBS: Designing for Performance(STG403) Amazon EBS: Designing for Performance
(STG403) Amazon EBS: Designing for Performance
 
Amazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBS
Amazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBSAmazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBS
Amazon Cassandra Basics & Guidelines for AWS/EC2/VPC/EBS
 

Similar to (BDT323) Amazon EBS & Cassandra: 1 Million Writes Per Second

Approaching hyperconvergedopenstack
Approaching hyperconvergedopenstackApproaching hyperconvergedopenstack
Approaching hyperconvergedopenstackIkuo Kumagai
 
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Amazon Web Services
 
Trying and evaluating the new features of GlusterFS 3.5
Trying and evaluating the new features of GlusterFS 3.5Trying and evaluating the new features of GlusterFS 3.5
Trying and evaluating the new features of GlusterFS 3.5Keisuke Takahashi
 
Using Databases and Containers From Development to Deployment
Using Databases and Containers  From Development to DeploymentUsing Databases and Containers  From Development to Deployment
Using Databases and Containers From Development to DeploymentAerospike, Inc.
 
CPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCoburn Watson
 
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017
OpenStack and NetApp - Chen Reuven - OpenStack Day Israel 2017Cloud Native Day Tel Aviv
 
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28Amazon Web Services
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...Amazon Web Services
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...Amazon Web Services
 
To Build My Own Cloud with Blackjack…
To Build My Own Cloud with Blackjack…To Build My Own Cloud with Blackjack…
To Build My Own Cloud with Blackjack…Sergey Dzyuban
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2aspyker
 
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Community
 
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA ArchitectureCeph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA ArchitectureDanielle Womboldt
 
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014Amazon Web Services
 
Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...Anton Moldovan "Building an efficient replication system for thousands of ter...
Anton Moldovan "Building an efficient replication system for thousands of ter...Fwdays
 
CMP301_Deep Dive on Amazon EC2 Instances
CMP301_Deep Dive on Amazon EC2 InstancesCMP301_Deep Dive on Amazon EC2 Instances
CMP301_Deep Dive on Amazon EC2 InstancesAmazon Web Services
 
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User StoreAzure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User StoreDataStax Academy
 
Support of containerized workloads in ONAP
Support of containerized workloads in ONAPSupport of containerized workloads in ONAP
Support of containerized workloads in ONAPVictor Morales
 
Beaming flink to the cloud @ netflix ff 2016-monal-daxini
Beaming flink to the cloud @ netflix   ff 2016-monal-daxiniBeaming flink to the cloud @ netflix   ff 2016-monal-daxini
Beaming flink to the cloud @ netflix ff 2016-monal-daxiniMonal Daxini
 

(BDT323) Amazon EBS & Cassandra: 1 Million Writes Per Second

  • 1. BDT323: Amazon EBS and Cassandra, 1 Million Writes Per Second on 60 Nodes. Jim Plush, Sr Director of Engineering, CrowdStrike; Dennis Opacki, Sr Cloud Systems Architect, CrowdStrike. October 2015. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 2. An Introduction to CrowdStrike • We are a cybersecurity technology company • We detect, prevent, and respond to all attack types in real time, protecting organizations from catastrophic breaches • We provide next-generation endpoint protection, threat intelligence, and pre- and post-IR services http://www.crowdstrike.com/introduction-to-crowdstrike-falcon-host/
  • 3. CrowdStrike Scale • Cloud-based endpoint protection • Single customer can generate > 2 TB daily • 500K+ events per second • Multi-petabytes of managed data © 2015. All Rights Reserved.
  • 4. Truisms??? • HTTPS is too slow to run everywhere • All you need is anti-virus • Never run Cassandra on Amazon EBS
  • 5. What is Amazon EBS? • Network-mounted hard drive • Ability to snapshot data • Data encryption at rest & in flight [Diagram: an EC2 instance with two EBS data volumes mounted at /mnt/foo and /mnt/bar]
  • 6. Existing Amazon EBS Assumptions • Jittery I/O, a.k.a. noisy neighbors • Single point of failure in a region • Cost is too damn high • Bad volumes (dd and destroy)
  • 7. A Recent Project: Initial Requirements • 1PB of incoming event data from millions of devices • Modeled as a graph • 1 million writes per second (burst) • Age data out after x days • 95% write, 5% read
  • 8. We Tried • Cassandra + Titan • Sharding? • Neo4J • PostgreSQL, MySQL, SQLite • LevelDB/RocksDB
  • 9. We Have to Make This Work • Cassandra had the properties we needed • Time for a new approach? http://techblog.netflix.com/2014/07/revisiting-1-million-writes-per-second.html
  • 10. Number of Machines for 1PB [Bar chart: machine count, axis 0 to 2250; categories: I2.xlarge, c4.2XL EBS]
  • 11. Yearly Cost for 1PB Cluster [Bar chart: millions of $, axis 0 to 16; categories: I2.xlarge on demand, I2.xlarge reserved, c4.2xl on demand, c4.2xl reserved; annotation: "With Amazon EBS"]
  • 12. Initial Launch • Date Tiered Compaction (DTCS) • More details by Jeff Jirsa, CrowdStrike: Cassandra Summit 2015 - DTCS http://www.slideshare.net/JeffJirsa1/cassandra-summit-2015-real-world-dtcs-for-operators
  • 13. Initial Launch • Cassandra 2.0.12 (DSE) • m3.2xlarge 8 core • Single 4TB EBS GP2 ~10,000 IOPS • Default tunings
  • 14. Performance Was Terrible • 12 node cluster • ~60K writes per second RF2 • ~10K writes per 8 core box • We went to the experts
  • 15. Cassandra Summit 2014 • Family Search asked the same question: Where’s the bottleneck? https://www.youtube.com/watch?v=Qfzg7gcSK-g
  • 16. IOPS Available [Bar chart: IOPS, axis 0 to 50000; categories: I2.xlarge, c4.2xlarge]
  • 17. 1.3K IOPS?
  • 18. IOPS I see you there, but I can’t reach you!
  • 19.
  • 20. The magic gates opened… We hit 1 million writes per second RF3 on 60 nodes
  • 21. Testing Setup
  • 22. Testing Methodology • Each test run: clean C* instances, old test keyspaces dropped • 13+ TB of data loaded during read testing • 20 C4.4XL stress writers, each with its own 1-billion (1BB) key sequence
  • 23. Cluster Topology [Diagram] • Stress nodes: 10 instances in AZ 1A, 10 instances in AZ 1B • C* nodes: 20 in AZ 1A, 20 in AZ 1B, 20 in AZ 1C • OpsCenter
  • 24. Amazon EBS
  • 25. Cassandra Stress 2.1.x • bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops(insert=1) no-warmup -pop seq=1..1000000000 -mode native cql3 -node 10.10.10.XX -rate threads=1000 -errors ignore
  • 26. PCSTAT - Al Tobey http://www.datastax.com/dev/blog/compaction-improvements-in-cassandra-21 https://github.com/tobert/pcstat
  • 27. Netflix Test - What is C* capable of?
  • 28. Netflix Test • 1+ million writes per second at RF:3 • 3+ million local writes per second • NICE!
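The step from client writes to local writes on this slide is multiplication by the replication factor. The sketch below works that arithmetic through for the two cluster sizes quoted in the talk. The function name and the assumption that writes spread evenly across nodes are illustrative, not taken from the slides.

```python
def local_writes_per_node(client_writes_per_sec, replication_factor, node_count):
    """Replica (local) writes each node absorbs per second.

    Assumes replicas are spread evenly across the cluster, which is an
    idealization; real token ownership is never perfectly balanced.
    """
    total_local_writes = client_writes_per_sec * replication_factor
    return total_local_writes / node_count


# Initial launch figures from the talk: ~60K writes/s, RF2, 12 nodes.
initial = local_writes_per_node(60_000, 2, 12)

# Tuned cluster figures from the talk: 1 million writes/s, RF3, 60 nodes.
tuned = local_writes_per_node(1_000_000, 3, 60)

print(f"initial launch: {initial:,.0f} local writes/s per node")
print(f"tuned cluster:  {tuned:,.0f} local writes/s per node")
```

Under these assumptions the initial cluster works out to roughly 10K local writes per node, consistent with the "~10K writes per 8 core box" figure, and the tuned cluster to roughly 50K per node.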
  • 29. Netflix Test
  • 30. Netflix Test • No dropped mutations, system healthy at 1.1M after 50 mins
  • 31. Netflix Test • I/O util is not pegged • Commit disk = steady!
  • 32. Netflix Test • Low I/O wait
  • 33. Netflix Test • 95th percentile latency = reasonable
  • 34. Netflix Test - Read Fail • compression={'chunk_length_kb': '64', 'sstable_compression': 'LZ4Compressor'} • Data drive pegged https://issues.apache.org/jira/browse/CASSANDRA-10249 https://issues.apache.org/jira/browse/CASSANDRA-8894
  • 35. Reading Data • 24-hour read test • Over 10 TB of data in the CF • Sustained > 350K reads per second over 24 hours • 1M reads per second peak • CL ONE • 12 C4.4XL stress boxes
  • 36. Reading Data
  • 37. Reading Data
  • 38. Reading Data • Not pegged
  • 39. Reading Data • 7.2 ms 95th percentile latency
  • 40. VS Netflix Blog Post • C4.4XL vs. i2.xlarge 24-hour test (sans data transfer cost) • Netflix cluster/stress: cost ~$6300 (285 i2.xlarge at $0.85 per hour) • CrowdStrike cluster/stress with Amazon EBS: cost ~$2600 (60 C4.4XL at $0.88 per hour) • 180 fewer cores (45 fewer i2.xlarge instances)
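The comparison on this slide can be partly checked with simple arithmetic. The sketch below computes only the database-instance portion of each 24-hour bill and the core difference. The vCPU counts (4 for i2.xlarge, 16 for c4.4xlarge) are the published instance sizes rather than figures from the slide, and the helper name is illustrative. The slide's totals (~$6300 and ~$2600) are higher than the instance-only figures, presumably because they also cover stress nodes and, for CrowdStrike, the EBS volumes; that breakdown is an assumption.

```python
TEST_HOURS = 24


def instance_cost(count, hourly_rate, hours=TEST_HOURS):
    """Cost in dollars of running `count` instances for `hours` at `hourly_rate`."""
    return round(count * hourly_rate * hours, 2)


# Database nodes only, using the counts and hourly prices quoted on the slide.
netflix_nodes = instance_cost(285, 0.85)
crowdstrike_nodes = instance_cost(60, 0.88)

# Core counts: vCPUs per instance are assumed from the published instance sizes.
I2_XLARGE_VCPUS = 4
C4_4XLARGE_VCPUS = 16
core_gap = 285 * I2_XLARGE_VCPUS - 60 * C4_4XLARGE_VCPUS
equivalent_i2_instances = core_gap // I2_XLARGE_VCPUS

print(f"Netflix nodes, 24h:     ${netflix_nodes:,.2f}")
print(f"CrowdStrike nodes, 24h: ${crowdstrike_nodes:,.2f}")
print(f"core gap: {core_gap} cores (= {equivalent_i2_instances} i2.xlarge instances)")
```

The core gap reproduces the slide's "180 cores, 45 instances" figure exactly, which suggests the headline was derived the same way.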
  • 41. Read Notes with Amazon EBS • Our test was a single 10K IOPS volume • More/bigger reads? • PIOPS gives you as much throughput as you need • RAID0 multiple Amazon EBS volumes [Diagram: an EC2 instance with two EBS data volumes mounted at /mnt/foo and /mnt/bar]
  • 42. What Unlocked Performance
  • 43. Major Tweaks • Ubuntu HVM types • Enhanced networking • Now faster than PV • Ubuntu distro tuned for cloud workloads • XFS Filesystem
  • 44. Major Tweaks • Cassandra 2.1 • Java 8 • G1 Garbage Collector https://issues.apache.org/jira/browse/CASSANDRA-7486
  • 45. Major Tweaks • C4.4XL 16 core, EBS Optimized • 4TB, 10,000 IOPS EBS GP2 Encrypted Data Drive • 160 MB/s throughput • 1TB 3000 IOPS EBS GP2 Encrypted Commit Log Drive
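The two IOPS figures on this slide line up with how GP2 volumes were sized at the time: baseline IOPS scaled with volume size up to a per-volume ceiling. The sketch below encodes that rule as I understand it for 2015 (3 IOPS per GB, a 100 IOPS floor, a 10,000 IOPS cap). Those constants come from AWS documentation of that era, not from the slides, so treat them as assumptions and check current limits before sizing anything.

```python
def gp2_baseline_iops(size_gb, iops_per_gb=3, floor=100, cap=10_000):
    """Approximate GP2 baseline IOPS for a volume of `size_gb`.

    The default constants are assumptions reflecting 2015-era GP2 limits.
    """
    return max(floor, min(size_gb * iops_per_gb, cap))


commit_log_volume = gp2_baseline_iops(1_000)  # 1TB commit log drive
data_volume = gp2_baseline_iops(4_000)        # 4TB data drive

print(commit_log_volume, data_volume)
```

Read this way, the 4TB data drive is sized to reach the per-volume IOPS ceiling, while the 1TB commit log drive gets its 3000 IOPS directly from its size.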
  • 46. Major Tweaks: cassandra-env.sh • MAX_HEAP_SIZE=8G • JVM_OPTS="$JVM_OPTS -XX:+UseG1GC" • Lots of other minor tweaks in crowdstrike-tools
  • 47. cassandra-env.sh • Put PID in batch mode • Mask CPU0 from the process to reduce context switching • Magic from Al Tobey
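The slide shows the actual commands as an image, which the transcript does not capture. As a rough illustration of the same two ideas (batch scheduling and keeping the process off CPU0), here is a Linux-only Python sketch. It is not the code from the talk; the function names are hypothetical, and `tune_process` is defined but deliberately not called.

```python
import os


def cpus_excluding_cpu0(cpu_ids):
    """Return the given CPU set without CPU 0 (unchanged if CPU 0 is the only CPU)."""
    remaining = set(cpu_ids) - {0}
    return remaining or set(cpu_ids)


def tune_process(pid):
    """Put `pid` in SCHED_BATCH and mask CPU 0 from its affinity (Linux only).

    Caveat: these calls act on the single thread whose id is `pid`. A JVM has
    many threads, so a real setup would apply this to every thread or do it
    before the JVM spawns them.
    """
    os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
    allowed = cpus_excluding_cpu0(os.sched_getaffinity(pid))
    os.sched_setaffinity(pid, allowed)
    return allowed


# Pure helper only; no scheduler state is changed by running this file.
print(sorted(cpus_excluding_cpu0({0, 1, 2, 3})))
```

Calling `tune_process` requires Linux and enough privileges over the target process.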
  • 48. YAML Settings: cassandra.yaml (based on 16 core) • concurrent_reads: 32 • concurrent_writes: 64 • memtable_flush_writers: 8 • trickle_fsync: true • trickle_fsync_interval_in_kb: 1000 • native_transport_max_threads: 256 • concurrent_compactors: 4
  • 49. cassandra.yaml • We found that a good portion of the CPU load was being used for internode compression, which reduced write throughput • internode_compression: none
  • 50. Lessons Learned • Amazon EBS was never the bottleneck during testing; GP2 is legit • Built-in types like list and map come at a performance penalty • 30% hit on our writes using Map type • DTCS is very young (see Jeff Jirsa’s talk) • 2.1 Stress Tool is tricky but great for modeling workloads • How will compression affect your read path?
  • 51. Test Your Own! https://github.com/CrowdStrike/cassandra-tools
  • 52. It’s Just Python • Launch 20 nodes in us-east-1: python launch.py launch --nodes=20 --config=c4-ebs-hvm --az=us-east-1a • Bootstrap the new nodes with C*, RAID/format disks, etc.: fab -u ubuntu bootstrapcass21:config=c4-highperf • Run arbitrary commands: fab -u ubuntu cmd:config=c4-highperf,cmd="sudo rm -rf /mnt/cassandra/data/summit_stress"
  • 53. Run Custom Stress Profiles… Multi-Node Support • With export NODENUM=1: ubuntu@ip-10-10-10.XX:~$ python runstress.py --profile=stress10 --seednode=10.10.10.XX --threads=50 Going to run: /home/ubuntu/apache-cassandra-2.1.5/tools/bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops(insert=1,simple=9) no-warmup -pop seq=1..1000000000 -mode native cql3 -node 10.10.10.XX -rate threads=50 -errors ignore • With export NODENUM=2: ubuntu@ip-10-10-10.XX:~$ python runstress.py --profile=stress10 --seednode=10.10.10.XX --threads=50 Going to run: /home/ubuntu/apache-cassandra-2.1.5/tools/bin/cassandra-stress user duration=100000m cl=ONE profile=/home/ubuntu/summit_stress.yaml ops(insert=1,simple=9) no-warmup -pop seq=1000000001..2000000000 -mode native cql3 -node 10.10.10.XX -rate threads=50 -errors ignore
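The two runs on this slide differ only in the `-pop seq=` range, which appears to follow from `NODENUM`. The sketch below shows one way to derive a non-overlapping range per stress node. How `runstress.py` actually does this is not shown in the slides, so the function name and the fixed one-billion-key width are assumptions chosen to reproduce the two ranges in the transcript.

```python
def seq_range(node_num, keys_per_node=1_000_000_000):
    """Return the cassandra-stress `seq=` argument for a 1-based stress node number.

    Each node gets its own contiguous block of keys so that writers do not
    overlap. The block width is an assumption matching the slide's ranges.
    """
    if node_num < 1:
        raise ValueError("node_num is 1-based")
    start = (node_num - 1) * keys_per_node + 1
    end = node_num * keys_per_node
    return f"seq={start}..{end}"


for node in (1, 2):
    print(f"NODENUM={node}: -pop {seq_range(node)}")
```

Giving each writer a disjoint block means every stress node inserts distinct partitions, so 20 writers cover 20 billion keys without overwriting each other.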
  • 54. • ~3 months on our Amazon EBS–based cluster • Hundreds of TBs of graph data and growing in C* • Billions of vertices/edges • Changing perceptions? • DataStax - Planning an Amazon EC2 cluster Where Are We Today?
  • 55. Al Tobey’s Tuning Guide for Cassandra 2.1 https://tobert.github.io/pages/als-cassandra-21-tuning- guide.html Resources
  • 56. Special Thanks To • Leif Jackson • Marcus King • Alan Hannan • Jeff Jirsa • Al Tobey • Nick Panahi • J.B. Langston • Marcus Eriksson • Iian Finlayson • Dani Traphagen
  • 57. Amazon EBS Heading Into 2016
  • 58. 4TB (10k IOPS) GP2 I/O Hit? Not enough to faze C*
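For context on the 10k figure: under the GP2 rules AWS published around the time of this talk (a baseline of 3 IOPS per GiB, a floor of 100 and a cap of 10,000), a 4 TB volume sits at the cap. A rough sketch of that arithmetic; the constants are assumptions taken from the 2015-era limits, and the limits have changed since:

```python
def gp2_baseline_iops(size_gib: int, per_gib: int = 3,
                      floor: int = 100, cap: int = 10_000) -> int:
    """Estimate GP2 baseline IOPS for a volume size in GiB.

    Assumption: 2015-era GP2 rules (3 IOPS/GiB, minimum 100, maximum 10,000).
    """
    return max(floor, min(cap, size_gib * per_gib))


if __name__ == "__main__":
    for size in (10, 1024, 4096, 16384):
        print(f"{size:>6} GiB -> {gp2_baseline_iops(size)} IOPS")
```

Under these assumptions, any volume above roughly 3.3 TB gets the same baseline, so sizing beyond that point buys capacity rather than IOPS.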
  • 59. So why the hate for Amazon EBS?
  • 60. Following the Crowd – Trust Issues • Used instance-store image and ephemeral drives • Painful to stop/start instances, resize • Couldn’t avoid scheduled maintenance (i.e., Reboot-a-palooza) • Encryption required shenanigans
  • 61. Guess What • We still had failures • Now we get to rebuild from scratch
  • 62. Amazon EBS’s Troubled Childhood • What do you mean my volume is “stuck”? • April 2011 – Netflix, Reddit, and Quora • October 2012 – Reddit, Imgur, Heroku • August 2013 – Vine, Airbnb
  • 63. Kiss of Death • http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.html • Spread services across multiple regions • Test failure scenarios regularly (Chaos Monkey) • Make Cassandra databases more resilient by avoiding Amazon EBS
  • 64. Redemption • Amazon moves quickly and quietly: • March 2011 – New Amazon EBS GM • July 2012 – Provisioned IOPS • May 2014 – Native encryption • Jun 2014 – GP2 (game changer) • Mar 2015 – 16TB / 10K GP2 / 20K PIOPS
  • 65. Redemption • Prioritized Amazon EBS availability and consistency beyond features and functionality • Compartmentalized the control plane – removed cross-AZ dependencies for running volumes • Simplified workflows to favor sustained operation • Tested and simulated via TLA+/PlusCal - better understood corner cases • Dedicated a large fraction of engineering resources to reliability and performance
  • 66. Reliability • Amazon EBS team targets 99.999% availability • Exceeding expectations
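To put the 99.999% target in concrete terms, it allows a little over five minutes of unavailability per year. A small sketch of that arithmetic, assuming a 365-day year; the function name is illustrative only:

```python
def allowed_downtime_minutes_per_year(availability_pct: float) -> float:
    """Convert an availability percentage into allowed downtime per year.

    Assumption: a 365-day year (525,600 minutes).
    """
    minutes_per_year = 365 * 24 * 60
    return (100.0 - availability_pct) / 100.0 * minutes_per_year


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes_per_year(pct)
        print(f"{pct}% -> {minutes:.2f} minutes/year")
```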
  • 67. CrowdStrike Today • In past 12 months, zero Amazon EBS–related failures • Thousands of GP2 data volumes (~2PB data) • Transitioning all systems to Amazon EBS root drives • Moved all data stores to Amazon EBS (C*, Kafka, Elasticsearch, Postgres, etc.)
  • 69. Staying Safe - Architecture • Select a region with >2 AZs (e.g., us-east-1 or us-west-2) • Use Amazon EBS GP2 or PIOPS storage • Separate volumes for data and commit logs
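One way to sanity-check the last point on a running node is to confirm that the data and commit log directories are not on the same mounted filesystem. The sketch below compares device IDs from `os.stat`; the data path is the one used earlier in the deck, while the commit log path is a hypothetical placeholder. A differing device ID shows separate filesystems, which is only a proxy for separate EBS volumes:

```python
import os


def on_separate_devices(path_a: str, path_b: str) -> bool:
    """Return True if two existing paths live on different mounted filesystems."""
    return os.stat(path_a).st_dev != os.stat(path_b).st_dev


if __name__ == "__main__":
    data_dir = "/mnt/cassandra/data"      # path seen earlier in the deck
    commitlog_dir = "/mnt/commitlog"      # hypothetical; adjust to your layout
    if os.path.exists(data_dir) and os.path.exists(commitlog_dir):
        print("separate devices:", on_separate_devices(data_dir, commitlog_dir))
    else:
        print("paths not present on this machine; nothing to check")
```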
  • 70. Staying Safe - Ops • Use Amazon EBS volume monitoring • Pre-warm Amazon EBS volumes? • Schedule snapshots for consistent backups
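As an illustration of the snapshot step, the sketch below requests one snapshot per volume through an EC2 client object that is passed in, such as the one returned by `boto3.client("ec2")`. The function name, label, and volume IDs are hypothetical, and this is not the tooling described in the talk. It does not quiesce writes, so volumes striped together in RAID would need I/O flushed or paused first to get a consistent set:

```python
from datetime import datetime, timezone


def snapshot_volumes(ec2_client, volume_ids, label="cassandra-data"):
    """Request an EBS snapshot for each volume ID and return the snapshot IDs.

    ec2_client is assumed to expose create_snapshot(VolumeId=..., Description=...)
    and return a dict containing 'SnapshotId', as the boto3 EC2 client does.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")
    snapshot_ids = []
    for volume_id in volume_ids:
        response = ec2_client.create_snapshot(
            VolumeId=volume_id,
            Description=f"{label} {volume_id} {stamp}",
        )
        snapshot_ids.append(response["SnapshotId"])
    return snapshot_ids
```

Running a function like this from cron or another scheduler covers the "schedule" part; passing the client in keeps the function easy to exercise with a stub before pointing it at a real account.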
  • 71. Most Importantly • Challenge assumptions • Stay current on AWS blog • Talk with your peers • http://aws.amazon.com/ebs/nosql/
  • 73. Remember to complete your evaluations! BDT323