SlideShare a Scribd company logo
©2015 DataStax Confidential. Do not distribute without consent.
Joel Jacobson, Solutions Engineer
@joeljacobson
Diagnosing Problems in Production
1
©2014 DataStax Confidential. Do not distribute without consent.
Apache Cassandra
•  Distributed NoSQL Database
•  Google Big Table
•  Amazon Dynamo
•  Continuous Availability
•  Disaster Avoidance
•  Linear Scale Performance
•  Add nodes to scale
•  Runs on Commodity Hardware
•  Cloud or on Premise
San
Francisco
New York
Dubai
First Step: Preparation
DataStax OpsCenter
• Will help with 90% of problems you
encounter
• Should be first place you look when
there's an issue
• Community version is free
• Enterprise version has additional
features
Server Monitoring & Alerts
• Monit
•  monitor processes
•  monitor disk usage
•  send alerts
• Munin / collectd
•  system perf statistics
• Nagios / Icinga
• Various 3rd party services
• Use whatever works for
you
Application Metrics
• Statsd / Graphite
• Grafana
• Gather constant metrics from
your application
• Measure anything & everything
• Microtimers, counters
• Graph events
•  user signup
•  error rates
• Cassandra Metrics Integration
• jmxtrans
Log Aggregation
• Hosted - Splunk, Loggly
• OSS - Logstash + Kibana, Greylog
• Many more…
• For best results all logs should be
aggregated here
• Oh yeah, and log your errors.
Gotchas
Incorrect Server Times
• Everything is written with a timestamp
• Last write wins
• Usually supplied by coordinator
• Can also be supplied by client
• What if your timestamps are wrong
because your clocks are off?
• Always install ntpd!
Tombstones
• Tombstones are a marker that data
no longer exists
• Tombstones have a timestamp just
like normal data
• They say "at time X, this no longer
exists"
Don’t use C* as a queue
• Queries on partitions with a lot of tombstones require a lot of filtering
• This can be reaaaaaaally slow
• Consider:
•  100,000 rows in a partition
•  99,999 are tombstones
•  How long to get a single row?
•  Run repairs frequently!
Not using a Snitch
• Snitch lets us distribute data in a fault tolerant way
• Changing this with a large cluster is time
consuming
• Dynamic Snitching
•  use the fastest replica for reads
• RackInferring (uses IP to pick replicas)
• DC aware
• PropertyFileSnitch (cassandra-
topology.properties)
• EC2Snitch & EC2MultiRegion
• GoogleCloudSnitch
• GossipingPropertyFileSnitch (recommended)
Version Mismatch
• SSTable format changed between
versions, making streaming
incompatible
• Version mismatch can break bootstrap,
repair, and decommission
• Introducing new nodes? Stick w/ the
same version
• Upgrade nodes in place
•  One at a time
•  One rack / AZ at a time (requires proper
snitch)
Disk Space not Reclaimed
• When you add new nodes, data is
streamed from existing nodes
• … but it's not deleted from them after
• You need to run a nodetool cleanup
• Otherwise you'll run out of space just by
adding nodes
Using Shared Storage
• Single point of failure
• High latency
• Expensive
• Performance is about latency
• Can increase throughput with more
disks
• In general avoid EBS, SAN, NAS
Diagnostic Tools
iostat
• Disk stats
•  Queue size, wait times
• Ignore %util
vmstat
• virtual memory statistics
• Am I swapping?
• Reports at an interval, to an optional count
dstat
• Flexible look at network, CPU, memory, disk
strace
• What is my process doing?
• See all system calls
• Filterable with -e
• Can attach to running
processes
tcpdump
• Watch network traffic
nodetool tpstats
• What's blocked?
• MemtableFlushWriter? -
Slow disks!
•  also leads to GC issues
• Dropped mutations?
•  need repair!
Histograms
• proxyhistograms
•  High level read and write times
•  Includes network latency
• cfhistograms <keyspace> <table>
•  reports stats for single table on a single
node
•  Used to identify tables with
performance problems
Query Tracing
GC Profiling
• Opscenter gc stats
•  Look for correlations between gc spikes
and read/write latency
• Cassandra GC Logging
•  Can be activated in cassandra-env.sh
• jstat
•  prints gc activity
How much does it matter?
Narrow Down the Problem
• Is it even Cassandra? Check your
metrics!
• Nodes flapping / failing
•  Check ops center
•  Dig into system metrics
• Slow queries
•  Find your bottleneck
•  Check system stats
•  JVM GC
•  Histograms
•  Tracing
©2015 DataStax Confidential. Do not distribute without consent. 30

More Related Content

What's hot

Taboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache SparkTaboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache Sparktsliwowicz
 
Python & Cassandra - Best Friends
Python & Cassandra - Best FriendsPython & Cassandra - Best Friends
Python & Cassandra - Best Friends
Jon Haddad
 
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...
iguazio
 
Migrating to Cassandra
Migrating to CassandraMigrating to Cassandra
Migrating to Cassandra
Instaclustr
 
Netflix Cloud Platform and Open Source
Netflix Cloud Platform and Open SourceNetflix Cloud Platform and Open Source
Netflix Cloud Platform and Open Source
aspyker
 
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
NAVER D2
 
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015
Datadog
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
Instaclustr
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloud
Datadog
 
Automate your development environment with Jira and Saltstack
Automate your development environment with Jira and SaltstackAutomate your development environment with Jira and Saltstack
Automate your development environment with Jira and Saltstack
NetworkedAssets
 
Introduction to AWS Outposts
Introduction to AWS OutpostsIntroduction to AWS Outposts
Introduction to AWS Outposts
ScyllaDB
 
How to Enable Industrial Decarbonization with Node-RED and InfluxDB
How to Enable Industrial Decarbonization with Node-RED and InfluxDBHow to Enable Industrial Decarbonization with Node-RED and InfluxDB
How to Enable Industrial Decarbonization with Node-RED and InfluxDB
InfluxData
 
Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014
Jon Haddad
 
Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...
Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...
Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...
DataStax
 
Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7
DataStax
 
Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019
Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019
Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019
Icinga
 
Datadog- Monitoring In Motion
Datadog- Monitoring In Motion Datadog- Monitoring In Motion
Datadog- Monitoring In Motion
Cloud Native Apps SF
 
Big Data Day LA 2015 - Lessons learned from scaling Big Data in the Cloud by...
Big Data Day LA 2015 -  Lessons learned from scaling Big Data in the Cloud by...Big Data Day LA 2015 -  Lessons learned from scaling Big Data in the Cloud by...
Big Data Day LA 2015 - Lessons learned from scaling Big Data in the Cloud by...
Data Con LA
 
Graph Databases at Netflix
Graph Databases at NetflixGraph Databases at Netflix
Graph Databases at Netflix
Ioannis Papapanagiotou
 
Kubernetes
KubernetesKubernetes
Kubernetes
Sang-Min Park
 

What's hot (20)

Taboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache SparkTaboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache Spark
 
Python & Cassandra - Best Friends
Python & Cassandra - Best FriendsPython & Cassandra - Best Friends
Python & Cassandra - Best Friends
 
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...
Building the Serverless Container Experience: Kevin McGrath, Spotinst, Server...
 
Migrating to Cassandra
Migrating to CassandraMigrating to Cassandra
Migrating to Cassandra
 
Netflix Cloud Platform and Open Source
Netflix Cloud Platform and Open SourceNetflix Cloud Platform and Open Source
Netflix Cloud Platform and Open Source
 
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
Elastic v5.0.0 Update uptoalpha3 v0.2 - 김종민
 
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015
Docker Usage Patterns - Meetup Docker Paris - November, 10th 2015
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloud
 
Automate your development environment with Jira and Saltstack
Automate your development environment with Jira and SaltstackAutomate your development environment with Jira and Saltstack
Automate your development environment with Jira and Saltstack
 
Introduction to AWS Outposts
Introduction to AWS OutpostsIntroduction to AWS Outposts
Introduction to AWS Outposts
 
How to Enable Industrial Decarbonization with Node-RED and InfluxDB
How to Enable Industrial Decarbonization with Node-RED and InfluxDBHow to Enable Industrial Decarbonization with Node-RED and InfluxDB
How to Enable Industrial Decarbonization with Node-RED and InfluxDB
 
Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014Diagnosing Problems in Production: Cassandra Summit 2014
Diagnosing Problems in Production: Cassandra Summit 2014
 
Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...
Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...
Streaming Customer Insights with DataStax Cassandra & Apache Kafta at British...
 
Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7
 
Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019
Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019
Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019
 
Datadog- Monitoring In Motion
Datadog- Monitoring In Motion Datadog- Monitoring In Motion
Datadog- Monitoring In Motion
 
Big Data Day LA 2015 - Lessons learned from scaling Big Data in the Cloud by...
Big Data Day LA 2015 -  Lessons learned from scaling Big Data in the Cloud by...Big Data Day LA 2015 -  Lessons learned from scaling Big Data in the Cloud by...
Big Data Day LA 2015 - Lessons learned from scaling Big Data in the Cloud by...
 
Graph Databases at Netflix
Graph Databases at NetflixGraph Databases at Netflix
Graph Databases at Netflix
 
Kubernetes
KubernetesKubernetes
Kubernetes
 

Viewers also liked

90 munazrae ahlesunnatwithahlebidat_text
90 munazrae ahlesunnatwithahlebidat_text90 munazrae ahlesunnatwithahlebidat_text
90 munazrae ahlesunnatwithahlebidat_text
idara-e-dosti
 
Ritmoson latino
Ritmoson latinoRitmoson latino
Chapter 5 class-vi- what books and burial tell us bkb
Chapter   5 class-vi- what books and burial tell us bkbChapter   5 class-vi- what books and burial tell us bkb
Chapter 5 class-vi- what books and burial tell us bkb
binoda007
 
Capítulo iii. competencia de arquimed
Capítulo iii. competencia de arquimedCapítulo iii. competencia de arquimed
Capítulo iii. competencia de arquimednicolasmunozvera
 
Magic Media Overview
Magic Media OverviewMagic Media Overview
Magic Media Overview
FootfallCam
 
Self Esteem issues in coaching
Self Esteem issues in coachingSelf Esteem issues in coaching
Self Esteem issues in coaching
Alison Maxwell Coaching Ltd
 
Conociendo a Bart Simpson
Conociendo a Bart SimpsonConociendo a Bart Simpson
Conociendo a Bart Simpson
newage89
 
Bazar del Conocimiento
Bazar del ConocimientoBazar del Conocimiento
Bazar del Conocimiento
Luis Casas Luengo
 
Group Presentation March 2014
Group Presentation March 2014Group Presentation March 2014
Group Presentation March 2014
Company Spotlight
 
Reproductor De Windows
Reproductor De WindowsReproductor De Windows
Reproductor De Windows
mdsjca
 
Módulo 4. curaduría y compartición de contenidos
Módulo 4.  curaduría y compartición de contenidosMódulo 4.  curaduría y compartición de contenidos
Módulo 4. curaduría y compartición de contenidos
MDCH
 
Alimentos precocinados e industriales
Alimentos precocinados e industrialesAlimentos precocinados e industriales
Alimentos precocinados e industriales
aulasaludable
 
Bua marketing social media explained mii nw
Bua marketing social media explained mii nwBua marketing social media explained mii nw
Bua marketing social media explained mii nwBua Marketing
 
Webinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionWebinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in Production
DataStax Academy
 
[DE] Langzeitarchivierung von Daten und Datenbanken: Aufbewahrung zwischen Co...
[DE] Langzeitarchivierung von Daten und Datenbanken: Aufbewahrung zwischen Co...[DE] Langzeitarchivierung von Daten und Datenbanken: Aufbewahrung zwischen Co...
[DE] Langzeitarchivierung von Daten und Datenbanken: Aufbewahrung zwischen Co...
PROJECT CONSULT Unternehmensberatung Dr. Ulrich Kampffmeyer GmbH
 
Capacidad de orientación david agudo trabajo informatica
Capacidad de orientación david agudo  trabajo informaticaCapacidad de orientación david agudo  trabajo informatica
Capacidad de orientación david agudo trabajo informatica
davidagudo1997
 
Jeep Brochure and Specs from Ancira Chrysler Jeep Dodge
Jeep Brochure and Specs from Ancira Chrysler Jeep DodgeJeep Brochure and Specs from Ancira Chrysler Jeep Dodge
Jeep Brochure and Specs from Ancira Chrysler Jeep Dodge
Ancira Auto Group
 
Carteles cinematográficos en la era de los 50´s en México
Carteles cinematográficos en la era de los 50´s en MéxicoCarteles cinematográficos en la era de los 50´s en México
Carteles cinematográficos en la era de los 50´s en México
Ivan Valenzuela
 

Viewers also liked (20)

90 munazrae ahlesunnatwithahlebidat_text
90 munazrae ahlesunnatwithahlebidat_text90 munazrae ahlesunnatwithahlebidat_text
90 munazrae ahlesunnatwithahlebidat_text
 
Ritmoson latino
Ritmoson latinoRitmoson latino
Ritmoson latino
 
Chapter 5 class-vi- what books and burial tell us bkb
Chapter   5 class-vi- what books and burial tell us bkbChapter   5 class-vi- what books and burial tell us bkb
Chapter 5 class-vi- what books and burial tell us bkb
 
Capítulo iii. competencia de arquimed
Capítulo iii. competencia de arquimedCapítulo iii. competencia de arquimed
Capítulo iii. competencia de arquimed
 
Magic Media Overview
Magic Media OverviewMagic Media Overview
Magic Media Overview
 
Edge002
Edge002Edge002
Edge002
 
Self Esteem issues in coaching
Self Esteem issues in coachingSelf Esteem issues in coaching
Self Esteem issues in coaching
 
Conociendo a Bart Simpson
Conociendo a Bart SimpsonConociendo a Bart Simpson
Conociendo a Bart Simpson
 
Bazar del Conocimiento
Bazar del ConocimientoBazar del Conocimiento
Bazar del Conocimiento
 
Group Presentation March 2014
Group Presentation March 2014Group Presentation March 2014
Group Presentation March 2014
 
Reproductor De Windows
Reproductor De WindowsReproductor De Windows
Reproductor De Windows
 
Módulo 4. curaduría y compartición de contenidos
Módulo 4.  curaduría y compartición de contenidosMódulo 4.  curaduría y compartición de contenidos
Módulo 4. curaduría y compartición de contenidos
 
Alimentos precocinados e industriales
Alimentos precocinados e industrialesAlimentos precocinados e industriales
Alimentos precocinados e industriales
 
Bua marketing social media explained mii nw
Bua marketing social media explained mii nwBua marketing social media explained mii nw
Bua marketing social media explained mii nw
 
Webinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionWebinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in Production
 
[DE] Langzeitarchivierung von Daten und Datenbanken: Aufbewahrung zwischen Co...
[DE] Langzeitarchivierung von Daten und Datenbanken: Aufbewahrung zwischen Co...[DE] Langzeitarchivierung von Daten und Datenbanken: Aufbewahrung zwischen Co...
[DE] Langzeitarchivierung von Daten und Datenbanken: Aufbewahrung zwischen Co...
 
Capacidad de orientación david agudo trabajo informatica
Capacidad de orientación david agudo  trabajo informaticaCapacidad de orientación david agudo  trabajo informatica
Capacidad de orientación david agudo trabajo informatica
 
Jeep Brochure and Specs from Ancira Chrysler Jeep Dodge
Jeep Brochure and Specs from Ancira Chrysler Jeep DodgeJeep Brochure and Specs from Ancira Chrysler Jeep Dodge
Jeep Brochure and Specs from Ancira Chrysler Jeep Dodge
 
17021551 acertijos
17021551 acertijos17021551 acertijos
17021551 acertijos
 
Carteles cinematográficos en la era de los 50´s en México
Carteles cinematográficos en la era de los 50´s en MéxicoCarteles cinematográficos en la era de los 50´s en México
Carteles cinematográficos en la era de los 50´s en México
 

Similar to Joel Jacobson (Datastax) - Diagnosing Cassandra Problems in Production

Diagnosing Problems in Production (Nov 2015)
Diagnosing Problems in Production (Nov 2015)Diagnosing Problems in Production (Nov 2015)
Diagnosing Problems in Production (Nov 2015)
Jon Haddad
 
Advanced Operations
Advanced OperationsAdvanced Operations
Advanced Operations
DataStax Academy
 
Cassandra Day Atlanta 2015: Diagnosing Problems in Production
Cassandra Day Atlanta 2015: Diagnosing Problems in ProductionCassandra Day Atlanta 2015: Diagnosing Problems in Production
Cassandra Day Atlanta 2015: Diagnosing Problems in Production
DataStax Academy
 
Cassandra Day Chicago 2015: Diagnosing Problems in Production
Cassandra Day Chicago 2015: Diagnosing Problems in ProductionCassandra Day Chicago 2015: Diagnosing Problems in Production
Cassandra Day Chicago 2015: Diagnosing Problems in Production
DataStax Academy
 
Cassandra Day London 2015: Diagnosing Problems in Production
Cassandra Day London 2015: Diagnosing Problems in ProductionCassandra Day London 2015: Diagnosing Problems in Production
Cassandra Day London 2015: Diagnosing Problems in Production
DataStax Academy
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - Cassandra
Jon Haddad
 
Webinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionWebinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in Production
DataStax Academy
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
John Adams
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
DataStax Academy
 
Flexible compute
Flexible computeFlexible compute
Flexible compute
Peter Clapham
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticians
Peter Clapham
 
Introducing Cloudian HyperStore 6.0
Introducing Cloudian HyperStore 6.0Introducing Cloudian HyperStore 6.0
Introducing Cloudian HyperStore 6.0
Cloudian
 
Mtc learnings from isv & enterprise interaction
Mtc learnings from isv & enterprise  interactionMtc learnings from isv & enterprise  interaction
Mtc learnings from isv & enterprise interaction
Govind Kanshi
 
Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)
Govind Kanshi
 
Netflix oss season 2 episode 1 - meetup Lightning talks
Netflix oss   season 2 episode 1 - meetup Lightning talksNetflix oss   season 2 episode 1 - meetup Lightning talks
Netflix oss season 2 episode 1 - meetup Lightning talksRuslan Meshenberg
 
AWS re:invent 2013 recap
AWS re:invent 2013 recapAWS re:invent 2013 recap
AWS re:invent 2013 recap
Peter Sankauskas
 
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
Hiram Fleitas León
 
Devoxx PL 2018 - Microservices in action at the Dutch National Police
Devoxx PL 2018 - Microservices in action at the Dutch National PoliceDevoxx PL 2018 - Microservices in action at the Dutch National Police
Devoxx PL 2018 - Microservices in action at the Dutch National Police
Bert Jan Schrijver
 
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National PoliceCodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
Bert Jan Schrijver
 
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Codemotion
 

Similar to Joel Jacobson (Datastax) - Diagnosing Cassandra Problems in Production (20)

Diagnosing Problems in Production (Nov 2015)
Diagnosing Problems in Production (Nov 2015)Diagnosing Problems in Production (Nov 2015)
Diagnosing Problems in Production (Nov 2015)
 
Advanced Operations
Advanced OperationsAdvanced Operations
Advanced Operations
 
Cassandra Day Atlanta 2015: Diagnosing Problems in Production
Cassandra Day Atlanta 2015: Diagnosing Problems in ProductionCassandra Day Atlanta 2015: Diagnosing Problems in Production
Cassandra Day Atlanta 2015: Diagnosing Problems in Production
 
Cassandra Day Chicago 2015: Diagnosing Problems in Production
Cassandra Day Chicago 2015: Diagnosing Problems in ProductionCassandra Day Chicago 2015: Diagnosing Problems in Production
Cassandra Day Chicago 2015: Diagnosing Problems in Production
 
Cassandra Day London 2015: Diagnosing Problems in Production
Cassandra Day London 2015: Diagnosing Problems in ProductionCassandra Day London 2015: Diagnosing Problems in Production
Cassandra Day London 2015: Diagnosing Problems in Production
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - Cassandra
 
Webinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionWebinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in Production
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Flexible compute
Flexible computeFlexible compute
Flexible compute
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticians
 
Introducing Cloudian HyperStore 6.0
Introducing Cloudian HyperStore 6.0Introducing Cloudian HyperStore 6.0
Introducing Cloudian HyperStore 6.0
 
Mtc learnings from isv & enterprise interaction
Mtc learnings from isv & enterprise  interactionMtc learnings from isv & enterprise  interaction
Mtc learnings from isv & enterprise interaction
 
Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)
 
Netflix oss season 2 episode 1 - meetup Lightning talks
Netflix oss   season 2 episode 1 - meetup Lightning talksNetflix oss   season 2 episode 1 - meetup Lightning talks
Netflix oss season 2 episode 1 - meetup Lightning talks
 
AWS re:invent 2013 recap
AWS re:invent 2013 recapAWS re:invent 2013 recap
AWS re:invent 2013 recap
 
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
 
Devoxx PL 2018 - Microservices in action at the Dutch National Police
Devoxx PL 2018 - Microservices in action at the Dutch National PoliceDevoxx PL 2018 - Microservices in action at the Dutch National Police
Devoxx PL 2018 - Microservices in action at the Dutch National Police
 
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National PoliceCodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
CodeMotion Amsterdam 2018 - Microservices in action at the Dutch National Police
 
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
Microservices in action at the Dutch National Police - Bert Jan Schrijver - C...
 

More from Outlyer

Murat Karslioglu, VP Solutions @ OpenEBS - Containerized storage for containe...
Murat Karslioglu, VP Solutions @ OpenEBS - Containerized storage for containe...Murat Karslioglu, VP Solutions @ OpenEBS - Containerized storage for containe...
Murat Karslioglu, VP Solutions @ OpenEBS - Containerized storage for containe...
Outlyer
 
How & When to Feature Flag
How & When to Feature FlagHow & When to Feature Flag
How & When to Feature Flag
Outlyer
 
Why You Need to Stop Using "The" Staging Server
Why You Need to Stop Using "The" Staging ServerWhy You Need to Stop Using "The" Staging Server
Why You Need to Stop Using "The" Staging Server
Outlyer
 
How GitHub combined with CI empowers rapid product delivery at Credit Karma
How GitHub combined with CI empowers rapid product delivery at Credit Karma How GitHub combined with CI empowers rapid product delivery at Credit Karma
How GitHub combined with CI empowers rapid product delivery at Credit Karma
Outlyer
 
Packaging Services with Nix
Packaging Services with NixPackaging Services with Nix
Packaging Services with Nix
Outlyer
 
Minimum Viable Docker: our journey towards orchestration
Minimum Viable Docker: our journey towards orchestrationMinimum Viable Docker: our journey towards orchestration
Minimum Viable Docker: our journey towards orchestration
Outlyer
 
Ops is dead. long live ops.
Ops is dead. long live ops.Ops is dead. long live ops.
Ops is dead. long live ops.
Outlyer
 
The service mesh: resilient communication for microservice applications
The service mesh: resilient communication for microservice applicationsThe service mesh: resilient communication for microservice applications
The service mesh: resilient communication for microservice applications
Outlyer
 
Microservices: Why We Did It (and should you?)
Microservices: Why We Did It (and should you?) Microservices: Why We Did It (and should you?)
Microservices: Why We Did It (and should you?)
Outlyer
 
Renan Dias: Using Alexa to deploy applications to Kubernetes
Renan Dias: Using Alexa to deploy applications to KubernetesRenan Dias: Using Alexa to deploy applications to Kubernetes
Renan Dias: Using Alexa to deploy applications to Kubernetes
Outlyer
 
Alex Dias: how to build a docker monitoring solution
Alex Dias: how to build a docker monitoring solution Alex Dias: how to build a docker monitoring solution
Alex Dias: how to build a docker monitoring solution
Outlyer
 
How to build a container monitoring solution - David Gildeh, CEO and Co-Found...
How to build a container monitoring solution - David Gildeh, CEO and Co-Found...How to build a container monitoring solution - David Gildeh, CEO and Co-Found...
How to build a container monitoring solution - David Gildeh, CEO and Co-Found...
Outlyer
 
Heresy in the church of - Corey Quinn, Principal at The Quinn Advisory Group
Heresy in the church of - Corey Quinn, Principal at The Quinn Advisory Group Heresy in the church of - Corey Quinn, Principal at The Quinn Advisory Group
Heresy in the church of - Corey Quinn, Principal at The Quinn Advisory Group
Outlyer
 
Anatomy of a real-life incident -Alex Solomon, CTO and Co-Founder of PagerDuty
Anatomy of a real-life incident -Alex Solomon, CTO and Co-Founder of PagerDutyAnatomy of a real-life incident -Alex Solomon, CTO and Co-Founder of PagerDuty
Anatomy of a real-life incident -Alex Solomon, CTO and Co-Founder of PagerDuty
Outlyer
 
A Holistic View of Operational Capabilities—Roy Rapoport, Insight Engineering...
A Holistic View of Operational Capabilities—Roy Rapoport, Insight Engineering...A Holistic View of Operational Capabilities—Roy Rapoport, Insight Engineering...
A Holistic View of Operational Capabilities—Roy Rapoport, Insight Engineering...
Outlyer
 
The Network Knows—Avi Freedman, CEO & Co-Founder of Kentik
The Network Knows—Avi Freedman, CEO & Co-Founder of Kentik The Network Knows—Avi Freedman, CEO & Co-Founder of Kentik
The Network Knows—Avi Freedman, CEO & Co-Founder of Kentik
Outlyer
 
Building a production-ready, fully-scalable Docker Swarm using Terraform & Pa...
Building a production-ready, fully-scalable Docker Swarm using Terraform & Pa...Building a production-ready, fully-scalable Docker Swarm using Terraform & Pa...
Building a production-ready, fully-scalable Docker Swarm using Terraform & Pa...
Outlyer
 
Zero Downtime Postgres Upgrades
Zero Downtime Postgres UpgradesZero Downtime Postgres Upgrades
Zero Downtime Postgres Upgrades
Outlyer
 
DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2
Outlyer
 
DOXLON November 2016 - ELK Stack and Beats
DOXLON November 2016 - ELK Stack and Beats DOXLON November 2016 - ELK Stack and Beats
DOXLON November 2016 - ELK Stack and Beats
Outlyer
 

More from Outlyer (20)

Murat Karslioglu, VP Solutions @ OpenEBS - Containerized storage for containe...
Murat Karslioglu, VP Solutions @ OpenEBS - Containerized storage for containe...Murat Karslioglu, VP Solutions @ OpenEBS - Containerized storage for containe...
Murat Karslioglu, VP Solutions @ OpenEBS - Containerized storage for containe...
 
How & When to Feature Flag
How & When to Feature FlagHow & When to Feature Flag
How & When to Feature Flag
 
Why You Need to Stop Using "The" Staging Server
Why You Need to Stop Using "The" Staging ServerWhy You Need to Stop Using "The" Staging Server
Why You Need to Stop Using "The" Staging Server
 
How GitHub combined with CI empowers rapid product delivery at Credit Karma
How GitHub combined with CI empowers rapid product delivery at Credit Karma How GitHub combined with CI empowers rapid product delivery at Credit Karma
How GitHub combined with CI empowers rapid product delivery at Credit Karma
 
Packaging Services with Nix
Packaging Services with NixPackaging Services with Nix
Packaging Services with Nix
 
Minimum Viable Docker: our journey towards orchestration
Minimum Viable Docker: our journey towards orchestrationMinimum Viable Docker: our journey towards orchestration
Minimum Viable Docker: our journey towards orchestration
 
Ops is dead. long live ops.
Ops is dead. long live ops.Ops is dead. long live ops.
Ops is dead. long live ops.
 
The service mesh: resilient communication for microservice applications
The service mesh: resilient communication for microservice applicationsThe service mesh: resilient communication for microservice applications
The service mesh: resilient communication for microservice applications
 
Microservices: Why We Did It (and should you?)
Microservices: Why We Did It (and should you?) Microservices: Why We Did It (and should you?)
Microservices: Why We Did It (and should you?)
 
Renan Dias: Using Alexa to deploy applications to Kubernetes
Renan Dias: Using Alexa to deploy applications to KubernetesRenan Dias: Using Alexa to deploy applications to Kubernetes
Renan Dias: Using Alexa to deploy applications to Kubernetes
 
Alex Dias: how to build a docker monitoring solution
Alex Dias: how to build a docker monitoring solution Alex Dias: how to build a docker monitoring solution
Alex Dias: how to build a docker monitoring solution
 
How to build a container monitoring solution - David Gildeh, CEO and Co-Found...
How to build a container monitoring solution - David Gildeh, CEO and Co-Found...How to build a container monitoring solution - David Gildeh, CEO and Co-Found...
How to build a container monitoring solution - David Gildeh, CEO and Co-Found...
 
Heresy in the church of - Corey Quinn, Principal at The Quinn Advisory Group
Heresy in the church of - Corey Quinn, Principal at The Quinn Advisory Group Heresy in the church of - Corey Quinn, Principal at The Quinn Advisory Group
Heresy in the church of - Corey Quinn, Principal at The Quinn Advisory Group
 
Anatomy of a real-life incident -Alex Solomon, CTO and Co-Founder of PagerDuty
Anatomy of a real-life incident -Alex Solomon, CTO and Co-Founder of PagerDutyAnatomy of a real-life incident -Alex Solomon, CTO and Co-Founder of PagerDuty
Anatomy of a real-life incident -Alex Solomon, CTO and Co-Founder of PagerDuty
 
A Holistic View of Operational Capabilities—Roy Rapoport, Insight Engineering...
A Holistic View of Operational Capabilities—Roy Rapoport, Insight Engineering...A Holistic View of Operational Capabilities—Roy Rapoport, Insight Engineering...
A Holistic View of Operational Capabilities—Roy Rapoport, Insight Engineering...
 
The Network Knows—Avi Freedman, CEO & Co-Founder of Kentik
The Network Knows—Avi Freedman, CEO & Co-Founder of Kentik The Network Knows—Avi Freedman, CEO & Co-Founder of Kentik
The Network Knows—Avi Freedman, CEO & Co-Founder of Kentik
 
Building a production-ready, fully-scalable Docker Swarm using Terraform & Pa...
Building a production-ready, fully-scalable Docker Swarm using Terraform & Pa...Building a production-ready, fully-scalable Docker Swarm using Terraform & Pa...
Building a production-ready, fully-scalable Docker Swarm using Terraform & Pa...
 
Zero Downtime Postgres Upgrades
Zero Downtime Postgres UpgradesZero Downtime Postgres Upgrades
Zero Downtime Postgres Upgrades
 
DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2
 
DOXLON November 2016 - ELK Stack and Beats
DOXLON November 2016 - ELK Stack and Beats DOXLON November 2016 - ELK Stack and Beats
DOXLON November 2016 - ELK Stack and Beats
 

Recently uploaded

To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 

Recently uploaded (20)

To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 

Joel Jacobson (Datastax) - Diagnosing Cassandra Problems in Production

  • 1. ©2015 DataStax Confidential. Do not distribute without consent. Joel Jacobson, Solutions Engineer @joeljacobson Diagnosing Problems in Production 1
  • 2. ©2014 DataStax Confidential. Do not distribute without consent. Apache Cassandra •  Distributed NoSQL Database •  Google Big Table •  Amazon Dynamo •  Continuous Availability •  Disaster Avoidance •  Linear Scale Performance •  Add nodes to scale •  Runs on Commodity Hardware •  Cloud or on Premise San Francisco New York Dubai
  • 3.
  • 4.
  • 6. DataStax OpsCenter • Will help with 90% of problems you encounter • Should be first place you look when there's an issue • Community version is free • Enterprise version has additional features
  • 7. Server Monitoring & Alerts • Monit •  monitor processes •  monitor disk usage •  send alerts • Munin / collectd •  system perf statistics • Nagios / Icinga • Various 3rd party services • Use whatever works for you
  • 8. Application Metrics • Statsd / Graphite • Grafana • Gather constant metrics from your application • Measure anything & everything • Microtimers, counters • Graph events •  user signup •  error rates • Cassandra Metrics Integration • jmxtrans
  • 9. Log Aggregation • Hosted - Splunk, Loggly • OSS - Logstash + Kibana, Greylog • Many more… • For best results all logs should be aggregated here • Oh yeah, and log your errors.
  • 11. Incorrect Server Times • Everything is written with a timestamp • Last write wins • Usually supplied by coordinator • Can also be supplied by client • What if your timestamps are wrong because your clocks are off? • Always install ntpd!
  • 12. Tombstones • Tombstones are a marker that data no longer exists • Tombstones have a timestamp just like normal data • They say "at time X, this no longer exists"
  • 13. Don’t use C* as a queue • Queries on partitions with a lot of tombstones require a lot of filtering • This can be reaaaaaaally slow • Consider: •  100,000 rows in a partition •  99,999 are tombstones •  How long to get a single row? •  Run repairs frequently!
  • 14. Not using a Snitch • Snitch lets us distribute data in a fault tolerant way • Changing this with a large cluster is time consuming • Dynamic Snitching •  use the fastest replica for reads • RackInferring (uses IP to pick replicas) • DC aware • PropertyFileSnitch (cassandra- topology.properties) • EC2Snitch & EC2MultiRegion • GoogleCloudSnitch • GossipingPropertyFileSnitch (recommended)
  • 15. Version Mismatch • SSTable format changed between versions, making streaming incompatible • Version mismatch can break bootstrap, repair, and decommission • Introducing new nodes? Stick w/ the same version • Upgrade nodes in place •  One at a time •  One rack / AZ at a time (requires proper snitch)
  • 16. Disk Space not Reclaimed • When you add new nodes, data is streamed from existing nodes • … but it's not deleted from them after • You need to run a nodetool cleanup • Otherwise you'll run out of space just by adding nodes
  • 17. Using Shared Storage • Single point of failure • High latency • Expensive • Performance is about latency • Can increase throughput with more disks • In general avoid EBS, SAN, NAS
  • 19. iostat • Disk stats •  Queue size, wait times • Ignore %util
  • 20. vmstat • virtual memory statistics • Am I swapping? • Reports at an interval, to an optional count
  • 21. dstat • Flexible look at network, CPU, memory, disk
  • 22. strace • What is my process doing? • See all system calls • Filterable with -e • Can attach to running processes
  • 24. nodetool tpstats • What's blocked? • MemtableFlushWriter? - Slow disks! •  also leads to GC issues • Dropped mutations? •  need repair!
  • 25. Histograms • proxyhistograms •  High level read and write times •  Includes network latency • cfhistograms <keyspace> <table> •  reports stats for single table on a single node •  Used to identify tables with performance problems
  • 27. GC Profiling • Opscenter gc stats •  Look for correlations between gc spikes and read/write latency • Cassandra GC Logging •  Can be activated in cassandra-env.sh • jstat •  prints gc activity
  • 28. How much does it matter?
  • 29. Narrow Down the Problem • Is it even Cassandra? Check your metrics! • Nodes flapping / failing •  Check ops center •  Dig into system metrics • Slow queries •  Find your bottleneck •  Check system stats •  JVM GC •  Histograms •  Tracing
  • 30. ©2015 DataStax Confidential. Do not distribute without consent. 30