COUNTERS AT SCALE
A Cautionary Tale
Eric Lubow @elubow #CassandraSummit
PERSONAL VANITY
๏ CTO of SimpleReach
๏ Co-Author of Practical Cassandra
๏ Skydiver, Mixed Martial Artist, Motorcyclist, Dog Dad (IG: @charliedognyc), NY Giants fan
SIMPLEREACH
๏ Identify the best content
๏ Use engagement metrics
๏ Stream processing ingest
๏ Many metrics, time sliced
๏ Lots of counting
SIMPLEREACH CONTEXT
๏ 100 million URLs
๏ 350 million Tweets
๏ 50k - 100k events per second (billions of events per day)
๏ Average 250k-300k counter writes per second
๏ 225 GB of new data per hour
๏ 800 TB of total compressed data (10 TB per month)
๏ 10 TB of hot data
๏ 72-node Cassandra cluster
๏ 52 i2.2xlarge Realtime Nodes
๏ 9 i2.xlarge Search Nodes
๏ 11 i2.2xlarge Spark Nodes
LET’S LOOK AT THE USE-CASE
[Diagram labels: Solr, Vertica + Cassandra, Vertica, Mongo]
EARLY CHALLENGES
๏ Startup with a serious budget
๏ Had to think through how we would scale (can’t throw money at it)
๏ Told not to use counters, but there was nothing better
๏ Knew nothing about Cassandra, knew more about Mongo and Redis
๏ Didn’t want to write our own (support)
๏ There were no drivers for our languages
๏ We had no idea about counter failure scenarios
๏ Neither did DataStax
HOW DID SIMPLEREACH GET FROM …
๏ Server/DB level locking
๏ Shards/Replica sets
๏ Mongostat
๏ Leader election headaches
๏ Locking counters
๏ Read before write
๏ All in one (no master/slave)
๏ JMX instrumentation/monitoring
๏ Better fault tolerance
๏ Better counters
WHAT DO COUNTERS LOOK LIKE?
๏ Originally one large table
๏ Each row is a URL
๏ 0-200 cells per hour per row that saw activity
๏ Counter tables are now broken down by month (avoid wide rows)
๏ Counters are primarily CPU bound operations
๏ Requires SSD nodes and many core machines (memory not a factor)
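As a rough illustration of the month-by-month breakdown, the sketch below routes a counter write to a per-month table based on the event timestamp. The table prefix `url_counters`, the naming scheme, and the helpers `counter_table_for` and `hour_bucket` are hypothetical; the slides do not show the actual schema.

```python
from datetime import datetime, timezone

# Hypothetical table prefix; the real schema is not shown in the slides.
TABLE_PREFIX = "url_counters"


def counter_table_for(ts: datetime) -> str:
    """Return the name of the per-month counter table for a timestamp."""
    ts = ts.astimezone(timezone.utc)
    return f"{TABLE_PREFIX}_{ts.year:04d}_{ts.month:02d}"


def hour_bucket(ts: datetime) -> str:
    """Return the hourly time slice a cell would be keyed on."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H")


event_time = datetime(2015, 9, 24, 13, 5, tzinfo=timezone.utc)
table = counter_table_for(event_time)
bucket = hour_bucket(event_time)
```

Splitting by month bounds how many hourly cells a single URL row can accumulate in any one table, which is the wide-row concern named on the slide.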
HOW DID WE MAKE IT WORK?
BEAT COUNTERS WITH OUR 7 STEP PROCESS
1. All things possible through monitoring
2. Pre-aggregate writes (saved us 10x the writes)
3. Use counter batches
4. Trying to defeat the counter time bomb
5. Breaking the rules with CASSANDRA-8150, much JVM tuning
6. Upgraded every node in the cluster by hand one at a time
7. Upgrading to 2.1 definitely sealed the deal
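Step 2, pre-aggregating writes, can be sketched as follows: instead of sending one increment per event, collapse the events for a short window into one delta per counter before writing. This is a minimal sketch under assumptions; the key layout (URL, metric, hour bucket) and the function `pre_aggregate` are illustrative and not taken from the talk.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (url, metric, hour bucket) identifies one counter cell in this sketch.
CounterKey = Tuple[str, str, str]


def pre_aggregate(events: Iterable[CounterKey]) -> Dict[CounterKey, int]:
    """Collapse a stream of single increments into one delta per counter."""
    totals: Counter = Counter()
    for key in events:
        totals[key] += 1
    return dict(totals)


events = [
    ("example.com/a", "pageviews", "2015-09-24T13"),
    ("example.com/a", "pageviews", "2015-09-24T13"),
    ("example.com/a", "shares", "2015-09-24T13"),
    ("example.com/b", "pageviews", "2015-09-24T13"),
    ("example.com/a", "pageviews", "2015-09-24T13"),
]
deltas = pre_aggregate(events)
# Five raw events collapse into three counter writes.
```

The saving grows with how often the same counter is hit inside one window, so hot URLs benefit most.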
1. MONITOR EVERYTHING
2. HELPERS FOR AN AFFORDABLE CLUSTER
[Diagram labels: NSQ, Aggregator, Broadcast, Calculator, Mongo Writer, Redis Writer, Cassandra Writer, Solr Writer, Vertica Writer; annotation: 10x Improvement]
3. WE DON’T NEED NO STINKIN’ BATCHES
๏ Roughly 1k batches per second
๏ Each batch contains approximately 100 statements
๏ Totals 100k/sec
๏ With an RF=3, that’s 300k counter writes per second
๏ Without batches (using async writes), this would be 4x the load
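The batch numbers above can be checked with a few lines, alongside a sketch of what a counter batch looks like in CQL. `BEGIN COUNTER BATCH ... APPLY BATCH` is standard CQL; the table name `url_counters_2015_09` and the columns `hits`, `url`, and `bucket` are hypothetical placeholders, not the schema from the talk.

```python
def build_counter_batch(table: str, n_statements: int) -> str:
    """Build CQL text for a counter batch of n parameterized increments."""
    update = (
        f"UPDATE {table} SET hits = hits + ? "
        "WHERE url = ? AND bucket = ?;"
    )
    lines = ["BEGIN COUNTER BATCH"] + [update] * n_statements + ["APPLY BATCH;"]
    return "\n".join(lines)


# Figures from the slide.
batches_per_sec = 1_000
statements_per_batch = 100
replication_factor = 3

statements_per_sec = batches_per_sec * statements_per_batch       # 100,000
replica_writes_per_sec = statements_per_sec * replication_factor  # 300,000

batch_cql = build_counter_batch("url_counters_2015_09", 2)
```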
ASYNC WRITES
BATCH WRITES
4. FAILURE IS INEVITABLE
๏ A slow node might make the entire cluster unusable
๏ A poorly gossiping node might overwork itself out of the cluster
๏ ReplicateOnWrite (and others) shared thread with gossip
๏ Occasional problematic GC pause durations
๏ Potential accuracy issues due to timeouts, retries, and non-idempotent writes
๏ Counter time bomb
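The accuracy issue with timeouts, retries, and non-idempotent writes can be shown with a toy model. The classes below are purely illustrative and not driver or Cassandra code: the "server" applies an increment but the acknowledgement is lost once, and a naive client retries.

```python
class FlakyCounter:
    """Toy counter: the first increment is applied but its ack is lost."""

    def __init__(self) -> None:
        self.value = 0
        self._drop_next_ack = True

    def increment(self, delta: int) -> None:
        self.value += delta  # the write is applied on the server side
        if self._drop_next_ack:
            self._drop_next_ack = False
            raise TimeoutError("ack lost")  # the client never hears back


def increment_with_retry(counter: FlakyCounter, delta: int, retries: int = 1) -> None:
    """Naive client: retry on timeout, unaware the first attempt landed."""
    for attempt in range(retries + 1):
        try:
            counter.increment(delta)
            return
        except TimeoutError:
            if attempt == retries:
                raise


counter = FlakyCounter()
increment_with_retry(counter, 5)
# counter.value is now 10, not 5: the retried increment was applied twice.
```

Because an increment is a delta rather than an absolute value, the client cannot tell whether a timed-out write landed, so retrying risks over-counting and not retrying risks under-counting.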
4. COUNTER TIME BOMB
๏ Normally writes return to client and background jobs occur (SEDA)
๏ Counters write, return to client, and then reconcile
๏ Similar in work to every write doing a read repair
๏ Overloaded ReplicateOnWrite thread (now called Counter Mutation)
๏ Also (sometimes) overloaded the event thread backing up gossip
๏ Could happen at any time (seemingly randomly)
5. BREAKING THE RULES WITH 8150 1/2
๏ internode_compression: turned off for additional CPU cycles to be dedicated to counters
๏ heap_new_size: 3G: created a larger young gen to handle the objects allocated for counters
๏ -UseBiasedLocking: Biased locking limits an object's lock to a single thread. If many threads are using that object, then removing biased locking will likely increase performance.
๏ +UseGCTaskAffinity: Allocates GC tasks to threads using an affinity parameter
๏ +BindGCTaskThreadsToCPUs: Binds GC threads to individual CPUs.
๏ ParGCCardsPerStrideChunk=32768: Greatly increases the size of the chunks of work that ParNew GC hands out. This allows GC threads to be more efficient with the work to be done and steal yet to be completed work from other GC threads.
5. BREAKING THE RULES WITH 8150 2/2
๏ +CMSScavengeBeforeRemark: Forces a minor collection to occur before the remark, thus shortening the remark phase.
๏ CMSMaxAbortablePrecleanTime=60000: The fixed amount of time before starting the remark during the precleaning phase where the top of the young gen is sampled.
๏ CMSWaitDuration=30000: Sets a cap on the amount of time a CMS cycle should work. CMS can slow down with very large objects.
๏ MaxGCPauseMillis=5: Prevents long GC pauses from making the machine become unavailable
๏ +PerfDisableSharedMem: Prevents the writing of JVM stats to an MMAP’d file (hsperfdata) from blocking I/O
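To make the flag syntax concrete, the sketch below renders the HotSpot settings from the two slides above into `-XX:` command-line options. On/off switches take a `+` or `-` prefix, while flags that carry a value are written as `-XX:Name=value` with no prefix. The dictionary and the helper `render_jvm_flags` are illustrative; `internode_compression` and the young-gen size are left out because they are set elsewhere in Cassandra's configuration. Treat the values as the ones reported in the talk, not as general recommendations.

```python
from typing import Dict, List, Union

# Settings as listed on the slides: bool means an on/off switch,
# int means a flag that takes a value.
JVM_SETTINGS: Dict[str, Union[bool, int]] = {
    "UseBiasedLocking": False,
    "UseGCTaskAffinity": True,
    "BindGCTaskThreadsToCPUs": True,
    "ParGCCardsPerStrideChunk": 32768,
    "CMSScavengeBeforeRemark": True,
    "CMSMaxAbortablePrecleanTime": 60000,
    "CMSWaitDuration": 30000,
    "MaxGCPauseMillis": 5,
    "PerfDisableSharedMem": True,
}


def render_jvm_flags(settings: Dict[str, Union[bool, int]]) -> List[str]:
    """Render settings as HotSpot -XX options."""
    flags = []
    for name, value in settings.items():
        # bool must be checked first because bool is a subclass of int.
        if isinstance(value, bool):
            flags.append(f"-XX:{'+' if value else '-'}{name}")
        else:
            flags.append(f"-XX:{name}={value}")
    return flags


flags = render_jvm_flags(JVM_SETTINGS)
```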
5. CAN YOU SPOT THE CHANGE?
New JVM Settings
๏ Each message has roughly 100 counter operations
๏ 100 operations * 52 million messages = 5.2 billion operations per hour
๏ 1.5 million counter operations per second
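The arithmetic behind these figures, as a quick check. It assumes, as the slide implies, that the 52 million messages are per hour.

```python
ops_per_message = 100
messages_per_hour = 52_000_000

ops_per_hour = ops_per_message * messages_per_hour
ops_per_second = ops_per_hour / 3600

print(f"{ops_per_hour:,} operations per hour")         # 5,200,000,000
print(f"{ops_per_second:,.0f} operations per second")  # 1,444,444
```

That is about 1.44 million operations per second, which the slide rounds to 1.5 million.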
6. MAKE IT BIGGER
๏ 50+ nodes w/ 500+ gigs per node
๏ i2.xlarge => i2.2xlarge
๏ Additional cores and CPU
๏ Additional space on each node
๏ Better networking throughput
๏ Upgrading 10+ nodes at a time
๏ Would have been much easier with static internal IP addresses
7. COUNTERS THEN AND NOW
๏ < 2.1: Counter deltas written directly to the commit logs
๏ Contentious counters created large problems for GC
๏ >= 2.1: Reads the current value for the counter, applies delta and adds final value to the MemTable
๏ Better garbage collection of outstanding objects (fewer objects)
๏ Created CounterCache for hot counters used for conflict resolution only (for better performance on contentious counters)
๏ For full details, read this post: http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters
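The difference between the two approaches can be pictured with a toy model. This is a loose sketch and not Cassandra's implementation: both classes are hypothetical, and the single lock in the second one is an assumption made here to keep the read-modify-write step safe.

```python
import threading
from typing import Dict, List, Tuple


class DeltaLogCounter:
    """Toy "before 2.1" model: every increment is kept as its own delta."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, int]] = []

    def increment(self, key: str, delta: int) -> None:
        self.log.append((key, delta))

    def read(self, key: str) -> int:
        # Reads must sum every outstanding delta for the key.
        return sum(d for k, d in self.log if k == key)


class ReadBeforeWriteCounter:
    """Toy "2.1 and later" model: read current value, apply delta, store total."""

    def __init__(self) -> None:
        self.memtable: Dict[str, int] = {}
        self._lock = threading.Lock()  # assumption: guards read-modify-write

    def increment(self, key: str, delta: int) -> None:
        with self._lock:
            current = self.memtable.get(key, 0)
            self.memtable[key] = current + delta

    def read(self, key: str) -> int:
        return self.memtable.get(key, 0)


old, new = DeltaLogCounter(), ReadBeforeWriteCounter()
for _ in range(1000):
    old.increment("example.com/a", 1)
    new.increment("example.com/a", 1)
# Same total either way, but the delta log holds 1000 entries
# while the read-before-write model holds a single value.
```

In this model a hot counter leaves far fewer live objects behind under the second approach, which is the garbage collection benefit the slide points to.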
WHAT SHOULD YOU WALK AWAY WITH?
๏ Incredibly important to have a deep understanding of your use cases
๏ Sometimes database tuning has nothing to do with database settings
๏ Understand failure scenarios
๏ #monitoring / #instrumentation
๏ Ignoring best practices is ALMOST never a good idea
THANKS FOR LISTENING
QUESTIONS IN LIFE ARE GUARANTEED,
ANSWERS AREN’T.
Eric Lubow
@elubow
#CassandraSummit

 
E-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet DynamicsE-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet Dynamics
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
 
Hand Rolled Applicative User Validation Code Kata
Hand Rolled Applicative User ValidationCode KataHand Rolled Applicative User ValidationCode Kata
Hand Rolled Applicative User Validation Code Kata
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
 
What is Augmented Reality Image Tracking
What is Augmented Reality Image TrackingWhat is Augmented Reality Image Tracking
What is Augmented Reality Image Tracking
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
 

Counters At Scale - A Cautionary Tale

  • 1. COUNTERS AT SCALE A Cautionary Tale
  • 2. Eric Lubow @elubow #CassandraSummit PERSONAL VANITY ๏ CTO of SimpleReach ๏ Co-Author of Practical Cassandra ๏ Skydiver, Mixed Martial Artist, Motorcyclist, Dog Dad (IG: @charliedognyc), NY Giants fan
  • 3. SIMPLEREACH ๏ Identify the best content ๏ Use engagement metrics ๏ Stream processing ingest ๏ Many metrics, time sliced ๏ Lots of counting
  • 4. SIMPLEREACH CONTEXT ๏ 100 million URLs ๏ 350 million Tweets ๏ 50k - 100k events per second (tens of billions of events per day) ๏ Average 250k-300k counter writes per second ๏ 225G new per hour ๏ 800T of total compressed data (10T per month) ๏ 10T of hot data ๏ 72 nodes Cassandra cluster ๏ 52 i2.2xlarge Realtime Nodes ๏ 9 i2.xlarge Search Nodes ๏ 11 i2.2xlarge Spark Nodes
  • 5. LET’S LOOK AT THE USE-CASE
  • 6. [Figure labels: Solr, Solr, Vertica + Cassandra, Vertica + Cassandra, Vertica, Mongo]
  • 7. EARLY CHALLENGES ๏ Startup with a serious budget ๏ Had to think through how we would scale (can’t throw money at it) ๏ Told not to use counters, but there was nothing better ๏ Knew nothing about Cassandra, knew more about Mongo and Redis ๏ Didn’t want to write our own (support) ๏ There were no drivers for our languages ๏ Had no idea about counter failure scenarios ๏ Neither did DataStax
  • 8. HOW DID SIMPLEREACH GET FROM … ๏ Server/DB level locking ๏ Shards/Replica sets ๏ Mongostat ๏ Leader election headaches ๏ Locking counters ๏ Read before write ๏ All in one (no master/slave) ๏ JMX instrumentation/monitoring ๏ Better fault tolerance ๏ Better counters
  • 9. WHAT DO COUNTERS LOOK LIKE? ๏ Originally one large table ๏ Each row is a URL ๏ 0-200 cells per hour per row that saw activity ๏ Counter tables are now broken down by month (avoid wide rows) ๏ Counters are primarily CPU bound operations ๏ Requires SSD nodes and many core machines (memory not a factor)
  • 10. HOW DID WE MAKE IT WORK?
  • 11. BEAT COUNTERS WITH OUR 7 STEP PROCESS 1. All things possible through monitoring 2. Pre-aggregate writes (saved us 10x the writes) 3. Use counter batches 4. Trying to defeat the counter time bomb 5. Breaking the rules with CASSANDRA-8150, much JVM tuning 6. Upgraded every node in the cluster by hand one at a time 7. Upgrading to 2.1 definitely sealed the deal
  • 12. 1. MONITOR EVERYTHING
  • 13. 2. HELPERS FOR AN AFFORDABLE CLUSTER [Diagram labels: Aggregator, Mongo Writer, Broadcast, Redis Writer, Cassandra Writer, Solr Writer, Calculator, NSQ, Vertica Writer] 10x Improvement
  • 14. 3. WE DON’T NEED NO STINKIN’ BATCHES ๏ Roughly 1k batches per second ๏ Each batch contains approximately 100 statements ๏ Totals 100k/sec ๏ With RF=3, that’s 300k counter writes per second ๏ Without batches (using async writes), this would be 4x the load
  • 15. ASYNC WRITES
  • 16. BATCH WRITES
  • 17. 4. FAILURE IS INEVITABLE ๏ A slow node might make the entire cluster unusable ๏ A poorly gossiping node might overwork itself out of the cluster ๏ ReplicateOnWrite (and others) shared thread with gossip ๏ Occasional problematic GC pause durations ๏ Potential accuracy issues due to timeouts, retries, and non-idempotent writes ๏ Counter time bomb
  • 18. 4. COUNTER TIME BOMB ๏ Normally writes return to client and background jobs occur (SEDA) ๏ Counters write, return to client, and then reconcile ๏ Similar in work to every write doing a read repair ๏ Overloaded ReplicateOnWrite thread (now called Counter Mutation) ๏ Also (sometimes) overloaded the event thread backing up gossip ๏ Could happen at any time (seemingly randomly)
  • 19. 5. BREAKING THE RULES WITH 8150 1/2 ๏ internode_compression: turned off so that additional CPU cycles can be dedicated to counters ๏ heap_new_size: 3G: created a larger young gen to handle the objects allocated for counters ๏ -UseBiasedLocking: Biased locking limits an object’s lock to a single thread. If many threads are using that object, then removing biased locking will likely increase performance. ๏ +UseGCTaskAffinity: Allocates GC tasks to threads using an affinity parameter ๏ +BindGCTaskThreadsToCPUs: Binds GC threads to individual CPUs. ๏ ParGCCardsPerStrideChunk=32768: Greatly increases the number of chunks that ParNew GC is doing. This allows GC threads to be more efficient with the work to be done and steal yet to be completed work from other GC threads.
  • 20. 5. BREAKING THE RULES WITH 8150 2/2 ๏ +CMSScavengeBeforeRemark: Forces a minor collection to occur before the remark, thus shortening the remark phase. ๏ CMSMaxAbortablePrecleanTime=60000: The fixed amount of time before starting the remark during the precleaning phase where the top of the young gen is sampled. ๏ CMSWaitDuration=30000: Sets a cap on the amount of time a CMS cycle should work. CMS can slow down with very large objects. ๏ MaxGCPauseMillis=5: Prevents long GC pauses from making the machine become unavailable ๏ +PerfDisableSharedMem: Prevents the writing of JVM stats to an MMAP’d file (hsperfdata) from blocking I/O
  • 21. 5. CAN YOU SPOT THE CHANGE? [Chart annotation: New JVM Settings] ๏ Each message has roughly 100 counter operations ๏ 100 operations * 52 million messages = 5.2 billion operations per hour ๏ 1.5 million counter operations per second
  • 22. 6. MAKE IT BIGGER ๏ 50+ nodes w/ 500+ gigs per node ๏ i2.xlarge => i2.2xlarge ๏ Additional cores and CPU ๏ Additional space on each node ๏ Better networking throughput ๏ Upgrading 10+ nodes at a time ๏ Would have been much easier with static internal IP addresses
  • 23. 7. COUNTERS THEN AND NOW ๏ < 2.1: Counter deltas written directly to the commit logs ๏ Contentious counters created large problems for GC ๏ >= 2.1: Reads the current value for the counter, applies delta and adds final value to the MemTable ๏ Better garbage collection of outstanding objects (fewer objects) ๏ Created CounterCache for hot counters used for conflict resolution only (for better performance on contentious counters) ๏ For full details, read this post: http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters
  • 24. WHAT SHOULD YOU WALK AWAY WITH? ๏ Incredibly important to have a deep understanding around your cases ๏ Sometimes database tuning has nothing to do with database settings ๏ Understand failure scenarios ๏ #monitoring / #instrumentation ๏ Ignoring best practices is ALMOST never a good idea
  • 25. THANKS FOR LISTENING
  • 26. QUESTIONS IN LIFE ARE GUARANTEED, ANSWERS AREN’T.
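
Slides 11 and 13 credit pre-aggregation with roughly a 10x reduction in writes. The talk does not show the Aggregator's code, so the following is only a minimal sketch of the general idea: sum the raw events per counter key in memory, then issue one increment per key. The key shape `(url, metric, hour)` and the function name `pre_aggregate` are assumptions made for illustration, not SimpleReach's schema or API.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical key shape (url, metric, hour bucket); not taken from the talk.
CounterKey = Tuple[str, str, str]


def pre_aggregate(events: Iterable[CounterKey]) -> Dict[CounterKey, int]:
    """Collapse raw events into one summed delta per counter key."""
    totals: Counter = Counter()
    for key in events:
        totals[key] += 1
    return dict(totals)


events = [
    ("example.com/a", "pageviews", "2015-09-22T14"),
    ("example.com/a", "pageviews", "2015-09-22T14"),
    ("example.com/a", "pageviews", "2015-09-22T14"),
    ("example.com/b", "pageviews", "2015-09-22T14"),
]
deltas = pre_aggregate(events)
# Four raw events collapse into two counter increments (one per key).
```

The actual saving depends on how often the same key repeats within an aggregation window; the 10x figure is the talk's own measurement, not something this sketch reproduces.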
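
The throughput figures on slides 14 and 21 follow from simple multiplication. The snippet below only reproduces that arithmetic with the numbers stated on the slides; the variable names are illustrative.

```python
# Slide 14: batched counter writes.
batches_per_sec = 1_000
statements_per_batch = 100
replication_factor = 3

statements_per_sec = batches_per_sec * statements_per_batch
replica_writes_per_sec = statements_per_sec * replication_factor

# Slide 21: counter operations after the new JVM settings.
ops_per_message = 100
messages_per_hour = 52_000_000

ops_per_hour = ops_per_message * messages_per_hour
ops_per_sec = ops_per_hour / 3600  # seconds in an hour

print(statements_per_sec)      # 100000
print(replica_writes_per_sec)  # 300000
print(ops_per_hour)            # 5200000000
print(round(ops_per_sec))      # 1444444
```

The last figure, about 1.44 million per second, is what slide 21 rounds to 1.5 million.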
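
Slide 17 lists accuracy issues caused by timeouts, retries, and non-idempotent writes. A toy model, not Cassandra code, shows why: if an increment is applied but its acknowledgement is lost, a client that retries applies the delta a second time. The class `FlakyCounterStore` and the function `increment_with_retry` are hypothetical names used only for this illustration.

```python
class FlakyCounterStore:
    """Toy in-memory counter that loses the acknowledgement on chosen calls."""

    def __init__(self, lose_ack_on_calls):
        self.value = 0
        self.calls = 0
        self.lose_ack_on_calls = set(lose_ack_on_calls)

    def increment(self, delta):
        self.calls += 1
        self.value += delta  # the write is applied...
        if self.calls in self.lose_ack_on_calls:
            # ...but the client never hears back.
            raise TimeoutError("acknowledgement lost")


def increment_with_retry(store, delta, max_attempts=3):
    """Retry on timeout; the client cannot tell whether the write landed."""
    for _ in range(max_attempts):
        try:
            store.increment(delta)
            return
        except TimeoutError:
            continue
    raise TimeoutError("gave up after retries")


store = FlakyCounterStore(lose_ack_on_calls=[1])
increment_with_retry(store, 5)
print(store.value)  # 10, although the client intended a single +5
```

Dropping the retry trades overcounting for possible undercounting when the write truly failed, which is why the slide frames this as an accuracy problem rather than one with a clean fix.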
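
Slides 19 and 20 list the JVM options by bare name. On a HotSpot command line, boolean options are written `-XX:+Name` or `-XX:-Name`, while valued options are written `-XX:Name=value`. The sketch below renders the slide values in that form; the helper `render_jvm_flag` and the `SETTINGS` dictionary are illustrative, and `internode_compression` and `heap_new_size` are left out because they are not `-XX` options. Depending on the JVM version, some of these options may need to be unlocked or may no longer exist, so treat the output as a starting point to verify rather than a drop-in configuration.

```python
def render_jvm_flag(name, value):
    """Render a HotSpot -XX option: booleans use +/-, other values use '='."""
    if isinstance(value, bool):
        return f"-XX:{'+' if value else '-'}{name}"
    return f"-XX:{name}={value}"


# Values as listed on slides 19 and 20.
SETTINGS = {
    "UseBiasedLocking": False,
    "UseGCTaskAffinity": True,
    "BindGCTaskThreadsToCPUs": True,
    "ParGCCardsPerStrideChunk": 32768,
    "CMSScavengeBeforeRemark": True,
    "CMSMaxAbortablePrecleanTime": 60000,
    "CMSWaitDuration": 30000,
    "MaxGCPauseMillis": 5,
    "PerfDisableSharedMem": True,
}

jvm_opts = [render_jvm_flag(name, value) for name, value in SETTINGS.items()]
print("\n".join(jvm_opts))
```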