SlideShare a Scribd company logo
1 of 36
Performance Monitoring
Understanding your Scylla Cluster
Glauber Costa & Tomasz Grabiec
Our Agenda for today
• Basics of Monitoring Scylla
• Monitoring Infrastructure
• Understanding Scylla metrics
Linux tools
• Linux tools are familiar, widely available, no setup needed
▪iostat, top, sar, netstat, etc.
•Good for tier-1 analysis and overviews
▪but often don’t tell the whole story,
▪and are limited to a node only.
The top example
• Scylla uses a polling architecture
▪Scylla running at < 100 % CPU -> definitely underloaded.
▪Scylla running at = 100 % CPU -> impossible to determine.
CPU in use CPU idle
request
poll
period
The top example
• Scylla uses a polling architecture
▪Scylla running at < 100 % CPU -> definitely underloaded.
▪Scylla running at = 100 % CPU -> impossible to determine.
CPU in use
poll
period
poll
period
poll
period
iostat
• iostat: useful to find disk bottlenecks
$ iostat -x -m 1
[...]
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.50 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 1291.00 0.00 3690.50 453.00 234.12 51.34 141.09 8.07 1.95 1.99 1.61 0.23 94.05
xvdc 1332.50 0.00 3808.00 456.00 236.65 51.31 138.31 8.38 1.96 2.01 1.56 0.22 94.70
xvdd 1308.50 0.00 3704.50 449.50 233.14 50.78 139.98 6.83 1.65 1.69 1.27 0.23 93.95
xvde 1285.50 0.00 3632.50 454.50 229.48 51.53 140.81 7.74 1.89 1.94 1.53 0.23 93.40
xvdf 1281.50 0.00 3524.00 459.50 227.91 51.95 143.88 8.08 2.04 2.06 1.86 0.23 93.25
xvdg 1306.00 0.00 3576.50 453.50 231.10 51.70 143.71 7.58 1.89 1.92 1.64 0.23 93.50
xvdh 1302.00 0.00 3566.50 451.50 231.58 51.53 144.30 6.77 1.67 1.72 1.28 0.23 92.90
xvdi 1279.00 0.00 3627.00 448.00 235.86 51.11 144.22 7.92 1.95 1.97 1.73 0.23 93.45
md0 0.00 0.00 34234.50 3570.50 1860.41 411.33 123.07 0.00 0.00 0.00 0.00 0.00 0.00
Linux & Client side metrics
• iostat: useful to find disk bottlenecks
$ iostat -x -m 1
[...]
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.50 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 1291.00 0.00 3690.50 453.00 234.12 51.34 141.09 8.07 1.95 1.99 1.61 0.23 94.05
xvdc 1332.50 0.00 3808.00 456.00 236.65 51.31 138.31 8.38 1.96 2.01 1.56 0.22 94.70
xvdd 1308.50 0.00 3704.50 449.50 233.14 50.78 139.98 6.83 1.65 1.69 1.27 0.23 93.95
xvde 1285.50 0.00 3632.50 454.50 229.48 51.53 140.81 7.74 1.89 1.94 1.53 0.23 93.40
xvdf 1281.50 0.00 3524.00 459.50 227.91 51.95 143.88 8.08 2.04 2.06 1.86 0.23 93.25
xvdg 1306.00 0.00 3576.50 453.50 231.10 51.70 143.71 7.58 1.89 1.92 1.64 0.23 93.50
xvdh 1302.00 0.00 3566.50 451.50 231.58 51.53 144.30 6.77 1.67 1.72 1.28 0.23 92.90
xvdi 1279.00 0.00 3627.00 448.00 235.86 51.11 144.22 7.92 1.95 1.97 1.73 0.23 93.45
md0 0.00 0.00 34234.50 3570.50 1860.41 411.33 123.07 0.00 0.00 0.00 0.00 0.00 0.00
Linux & Client side metrics
• iostat: useful to find disk bottlenecks
$ iostat -x -m 1
[...]
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 0.00 0.50 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 1291.00 0.00 3690.50 453.00 234.12 51.34 141.09 8.07 1.95 1.99 1.61 0.23 94.05
xvdc 1332.50 0.00 3808.00 456.00 236.65 51.31 138.31 8.38 1.96 2.01 1.56 0.22 94.70
xvdd 1308.50 0.00 3704.50 449.50 233.14 50.78 139.98 6.83 1.65 1.69 1.27 0.23 93.95
xvde 1285.50 0.00 3632.50 454.50 229.48 51.53 140.81 7.74 1.89 1.94 1.53 0.23 93.40
xvdf 1281.50 0.00 3524.00 459.50 227.91 51.95 143.88 8.08 2.04 2.06 1.86 0.23 93.25
xvdg 1306.00 0.00 3576.50 453.50 231.10 51.70 143.71 7.58 1.89 1.92 1.64 0.23 93.50
xvdh 1302.00 0.00 3566.50 451.50 231.58 51.53 144.30 6.77 1.67 1.72 1.28 0.23 92.90
xvdi 1279.00 0.00 3627.00 448.00 235.86 51.11 144.22 7.92 1.95 1.97 1.73 0.23 93.45
md0 0.00 0.00 34234.50 3570.50 1860.41 411.33 123.07 0.00 0.00 0.00 0.00 0.00 0.00
Not all issues are database issues
• Client can introduce latencies as well
▪most notably, cassandra-stress will do.
▪JHiccup - client instrumentation for client-side hiccups.
Our Agenda for today
• Basics of Monitoring Scylla
• Monitoring Infrastructure
• Understanding Scylla metrics
collectd metrics
Prometheus
Scylla / Agent Browserip:9103
Grafana
ip:65534
ip:3000
ip:9103
ip:9103
HTTP
Scylla / Agent
Scylla / Agent
Scylla & Agent
Scylla Monitoring
collectd collectd_exporter
ip:65534
Scylla metrics
scyllatop
Scylla
ip:25826
Scylla + OS metrics
Ip:9103
HTTP
How to use those metrics?
• your own infrastructure
▪Whatever works for collectd, works for Scylla
• scyllatop
• prometheus + grafana
scyllatop
• easy to use, top-like interface.
• very high resolution
• good for ad-hoc probing
▪not very good for cluster-wide view or time progression
List of metrics available
• RESTful API:
$ curl http://scylla-server:10000/collectd | json_reformat
[
…
{
"enable": true,
"id": {
"plugin_instance": "#cpu",
"type_instance": "load",
"type": "gauge",
"plugin": "reactor"
}
},
• scyllatop -l:
▪ includes host metrics
# scylla running with --smp 1
$ scyllatop -l | wc -l
145
prometheus + grafana
•easy cluster-wide view, with pre-configured dashboards
•easy system progression view
•easy metric correlation
•adding composite metrics
•harder to setup,
-but we try to make it easier, docker images, pre-loaded dashboards.
-https://github.com/scylladb/scylla-grafana-monitoring
prometheus + grafana
• prometheus/grafana imgs, pre-loaded with dashboards:
▪https://github.com/scylladb/scylla-grafana-monitoring
Correlating metrics
Our Agenda for today
• Basics of Monitoring Scylla
• Monitoring Infrastructure
• Understanding Scylla metrics
Naming of metrics
Collectd naming scheme:
{host}/{plugin}-{plugin instance}/{type}-{type instance}
• plugin - name of the component
• plugin instance - instance of the component
• type - type of metric’s value
• type instance - name of the metric of given component
Naming of metrics
Collectd naming scheme:
{host}/{plugin}-{plugin instance}/{type}-{type instance}
E.g.:
node1/reactor-0/gauge-load
Naming of metrics
• plugin instances usually correspond to shard numbers.
▪ Example --smp 3:
node1/reactor-0/gauge-load
node1/reactor-1/gauge-load
node1/reactor-2/gauge-load
• GAUGE - value as is
▪ collectd types: gauge, bytes, pending_operations, ...
▪ reactor-*/gauge-load, lsa-*/bytes-total_space, ...
• DERIVE - change over time
▪ collectd types: total_operations, derive, ...
▪ database-*/total_operations-total_reads
Data source types
Naming of metrics
When exported to prometheus:
collectd_{plugin}_{type} { {plugin}={plugin instance},type={type instance},instance={host} }
E.g.:
collectd_reactor_gauge{reactor=”0”,type=”load”,instance=”node1”}
Metric plugins
coordinator replica
transport
(CQL server)
thrift
storage_proxy
database
memtables cachecommitlog
seastar framework
reactor memory io_queue
lsa
smp
compaction_manager
• transport-*/total_operations-requests_served
▪ counts incoming CQL requests
▪ coordinator-side
• database-*/total_operations-total_{reads|writes}
▪ counts incoming replica read/write requests
• both are DERIVE-typed
Throughput metrics
• storage_proxy-*/total_operations-{read|write} timeouts
▪ count number of timeouted read and write requests
▪ coordinator-side
• check coordinator logs
• check replica logs
• check for overload
Error metrics
Best reflected by reactor-*/gauge-load
• percentage of time Scylla was executing tasks
▪ excludes busy polling, execution of on-idle tasks, sleeping
▪ Updated every second and reflects past 5 seconds.
• 100 means the server is CPU-bound
CPU Utilization
Memory utilization metrics
total memory
standard
allocations
(non-LSA)
LSA free
memtables
(dirty)
cache
Memory utilization metrics
total memory
standard
allocations
(non-LSA)
LSA free
memtables
(dirty)
cache
lsa-*/bytes-non_lsa_used_space
memory-*/memory-total_memory
lsa-*/bytes-total_space
memory-*/bytes-dirty cache-*/bytes-total
Memory utilization metrics
• Useful for detecting:
▪cache getting shrunk down due to pressure from std allocations
▪requests blocking
-only 50 % of memory is allowed to be dirty.
-Requests will block if we can’t clean fast enough.
Memory utilization metrics
Cache metrics
• cache-*/total_operations-*:
▪ hits, misses - entries found/not found in cache during read
▪ merges - entries updated during memtable flush
▪ insertions - entries added (on miss, memtable flush)
▪ evictions - entries removed due to memory pressure
▪ removals - entries invalidated (ring ownership change)
• currently entries are per-partition
Cache metrics
I/O Queue metrics
• Scylla uses the I/O Queue to provide fairness among:
▪ commitlog, memtables, query, etc
io_queue-*/derive-{class name} bandwidth (bps)
io_queue-*/delay-{class name} queue latency, not counting disk access
(s)
io_queue-*/queue_length-{class name} # requests waiting
io_queue-*/total_operations-{class name} IOPS
Thank You!
github.com/scylladb/scylla-grafana-monitoring
Tomasz: tgrabiec@scylladb.com / @tgrabiec
Glauber: glauber@scylladb.com / @glcst

More Related Content

What's hot

Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits ScyllaDB
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howAltinity Ltd
 
Monitoring Microservices
Monitoring MicroservicesMonitoring Microservices
Monitoring MicroservicesWeaveworks
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookThe Hive
 
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxGrafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxRomanKhavronenko
 
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEODangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEOAltinity Ltd
 
Spark Performance Tuning .pdf
Spark Performance Tuning .pdfSpark Performance Tuning .pdf
Spark Performance Tuning .pdfAmit Raj
 
Data Source API in Spark
Data Source API in SparkData Source API in Spark
Data Source API in SparkDatabricks
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlJiangjie Qin
 
Getting the Scylla Shard-Aware Drivers Faster
Getting the Scylla Shard-Aware Drivers FasterGetting the Scylla Shard-Aware Drivers Faster
Getting the Scylla Shard-Aware Drivers FasterScyllaDB
 
Hardening Kafka Replication
Hardening Kafka Replication Hardening Kafka Replication
Hardening Kafka Replication confluent
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...Altinity Ltd
 
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Spark Operator—Deploy, Manage and Monitor Spark clusters on KubernetesDatabricks
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIAltinity Ltd
 
Enabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkEnabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkKazuaki Ishizaki
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevAltinity Ltd
 
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...How we got to 1 millisecond latency in 99% under repair, compaction, and flus...
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...ScyllaDB
 
Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Markus Höfer
 
[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouseVianney FOUCAULT
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Databricks
 

What's hot (20)

Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits Latest performance changes by Scylla - Project optimus / Nolimits
Latest performance changes by Scylla - Project optimus / Nolimits
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and how
 
Monitoring Microservices
Monitoring MicroservicesMonitoring Microservices
Monitoring Microservices
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
 
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxGrafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
 
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEODangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
 
Spark Performance Tuning .pdf
Spark Performance Tuning .pdfSpark Performance Tuning .pdf
Spark Performance Tuning .pdf
 
Data Source API in Spark
Data Source API in SparkData Source API in Spark
Data Source API in Spark
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
Getting the Scylla Shard-Aware Drivers Faster
Getting the Scylla Shard-Aware Drivers FasterGetting the Scylla Shard-Aware Drivers Faster
Getting the Scylla Shard-Aware Drivers Faster
 
Hardening Kafka Replication
Hardening Kafka Replication Hardening Kafka Replication
Hardening Kafka Replication
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
 
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
 
Enabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkEnabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache Spark
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
 
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...How we got to 1 millisecond latency in 99% under repair, compaction, and flus...
How we got to 1 millisecond latency in 99% under repair, compaction, and flus...
 
Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016
 
[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 

Similar to Performance Monitoring: Understanding Your Scylla Cluster

Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016Brendan Gregg
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance ToolsBrendan Gregg
 
Performance Analysis: new tools and concepts from the cloud
Performance Analysis: new tools and concepts from the cloudPerformance Analysis: new tools and concepts from the cloud
Performance Analysis: new tools and concepts from the cloudBrendan Gregg
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceBrendan Gregg
 
QCon 2015 Broken Performance Tools
QCon 2015 Broken Performance ToolsQCon 2015 Broken Performance Tools
QCon 2015 Broken Performance ToolsBrendan Gregg
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...DataStax Academy
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsJulien Anguenot
 
Broken Performance Tools
Broken Performance ToolsBroken Performance Tools
Broken Performance ToolsC4Media
 
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak   CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak PROIDEA
 
LISA2010 visualizations
LISA2010 visualizationsLISA2010 visualizations
LISA2010 visualizationsBrendan Gregg
 
Percona Live UK 2014 Part III
Percona Live UK 2014  Part IIIPercona Live UK 2014  Part III
Percona Live UK 2014 Part IIIAlkin Tezuysal
 
Nodejs性能分析优化和分布式设计探讨
Nodejs性能分析优化和分布式设计探讨Nodejs性能分析优化和分布式设计探讨
Nodejs性能分析优化和分布式设计探讨flyinweb
 
Oracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionOracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionTanel Poder
 
In Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneIn Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneEnkitec
 
Analyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE MethodAnalyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE MethodBrendan Gregg
 
hacking-embedded-devices.pptx
hacking-embedded-devices.pptxhacking-embedded-devices.pptx
hacking-embedded-devices.pptxssuserfcf43f
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACKristofferson A
 
Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Brendan Gregg
 

Similar to Performance Monitoring: Understanding Your Scylla Cluster (20)

Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance Tools
 
Performance Analysis: new tools and concepts from the cloud
Performance Analysis: new tools and concepts from the cloudPerformance Analysis: new tools and concepts from the cloud
Performance Analysis: new tools and concepts from the cloud
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems Performance
 
200.1,2-Capacity Planning
200.1,2-Capacity Planning200.1,2-Capacity Planning
200.1,2-Capacity Planning
 
QCon 2015 Broken Performance Tools
QCon 2015 Broken Performance ToolsQCon 2015 Broken Performance Tools
QCon 2015 Broken Performance Tools
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
 
Broken Performance Tools
Broken Performance ToolsBroken Performance Tools
Broken Performance Tools
 
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak   CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
 
LISA2010 visualizations
LISA2010 visualizationsLISA2010 visualizations
LISA2010 visualizations
 
Percona Live UK 2014 Part III
Percona Live UK 2014  Part IIIPercona Live UK 2014  Part III
Percona Live UK 2014 Part III
 
Nodejs性能分析优化和分布式设计探讨
Nodejs性能分析优化和分布式设计探讨Nodejs性能分析优化和分布式设计探讨
Nodejs性能分析优化和分布式设计探讨
 
Oracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionOracle Database In-Memory Option in Action
Oracle Database In-Memory Option in Action
 
In Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneIn Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry Osborne
 
Analyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE MethodAnalyzing OS X Systems Performance with the USE Method
Analyzing OS X Systems Performance with the USE Method
 
hacking-embedded-devices.pptx
hacking-embedded-devices.pptxhacking-embedded-devices.pptx
hacking-embedded-devices.pptx
 
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RACPerformance Scenario: Diagnosing and resolving sudden slow down on two node RAC
Performance Scenario: Diagnosing and resolving sudden slow down on two node RAC
 
Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016
 

More from ScyllaDB

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...ScyllaDB
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...ScyllaDB
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaScyllaDB
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityScyllaDB
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptxScyllaDB
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDBScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationScyllaDB
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsScyllaDB
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesScyllaDB
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsScyllaDB
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101ScyllaDB
 

More from ScyllaDB (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 

Recently uploaded

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 

Recently uploaded (20)

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 

Performance Monitoring: Understanding Your Scylla Cluster

  • 1. Performance Monitoring Understanding your Scylla Cluster Glauber Costa & Tomasz Grabiec
  • 2. Our Agenda for today • Basics of Monitoring Scylla • Monitoring Infrastructure • Understanding Scylla metrics
  • 3. Linux tools • Linux tools are familiar, widely available, no setup needed ▪iostat, top, sar, netstat, etc. •Good for tier-1 analysis and overviews ▪but often don’t tell the whole story, ▪and are limited to a node only.
  • 4. The top example • Scylla uses a polling architecture ▪Scylla running at < 100 % CPU -> definitely underloaded. ▪Scylla running at = 100 % CPU -> impossible to determine. CPU in use CPU idle request poll period
  • 5. The top example • Scylla uses a polling architecture ▪Scylla running at < 100 % CPU -> definitely underloaded. ▪Scylla running at = 100 % CPU -> impossible to determine. CPU in use poll period poll period poll period
  • 6. iostat • iostat: useful to find disk bottlenecks $ iostat -x -m 1 [...] Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util xvda 0.00 0.00 0.00 0.50 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 xvdb 1291.00 0.00 3690.50 453.00 234.12 51.34 141.09 8.07 1.95 1.99 1.61 0.23 94.05 xvdc 1332.50 0.00 3808.00 456.00 236.65 51.31 138.31 8.38 1.96 2.01 1.56 0.22 94.70 xvdd 1308.50 0.00 3704.50 449.50 233.14 50.78 139.98 6.83 1.65 1.69 1.27 0.23 93.95 xvde 1285.50 0.00 3632.50 454.50 229.48 51.53 140.81 7.74 1.89 1.94 1.53 0.23 93.40 xvdf 1281.50 0.00 3524.00 459.50 227.91 51.95 143.88 8.08 2.04 2.06 1.86 0.23 93.25 xvdg 1306.00 0.00 3576.50 453.50 231.10 51.70 143.71 7.58 1.89 1.92 1.64 0.23 93.50 xvdh 1302.00 0.00 3566.50 451.50 231.58 51.53 144.30 6.77 1.67 1.72 1.28 0.23 92.90 xvdi 1279.00 0.00 3627.00 448.00 235.86 51.11 144.22 7.92 1.95 1.97 1.73 0.23 93.45 md0 0.00 0.00 34234.50 3570.50 1860.41 411.33 123.07 0.00 0.00 0.00 0.00 0.00 0.00
  • 7. Linux & Client side metrics • iostat: useful to find disk bottlenecks $ iostat -x -m 1 [...] Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util xvda 0.00 0.00 0.00 0.50 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 xvdb 1291.00 0.00 3690.50 453.00 234.12 51.34 141.09 8.07 1.95 1.99 1.61 0.23 94.05 xvdc 1332.50 0.00 3808.00 456.00 236.65 51.31 138.31 8.38 1.96 2.01 1.56 0.22 94.70 xvdd 1308.50 0.00 3704.50 449.50 233.14 50.78 139.98 6.83 1.65 1.69 1.27 0.23 93.95 xvde 1285.50 0.00 3632.50 454.50 229.48 51.53 140.81 7.74 1.89 1.94 1.53 0.23 93.40 xvdf 1281.50 0.00 3524.00 459.50 227.91 51.95 143.88 8.08 2.04 2.06 1.86 0.23 93.25 xvdg 1306.00 0.00 3576.50 453.50 231.10 51.70 143.71 7.58 1.89 1.92 1.64 0.23 93.50 xvdh 1302.00 0.00 3566.50 451.50 231.58 51.53 144.30 6.77 1.67 1.72 1.28 0.23 92.90 xvdi 1279.00 0.00 3627.00 448.00 235.86 51.11 144.22 7.92 1.95 1.97 1.73 0.23 93.45 md0 0.00 0.00 34234.50 3570.50 1860.41 411.33 123.07 0.00 0.00 0.00 0.00 0.00 0.00
  • 8. Linux & Client side metrics • iostat: useful to find disk bottlenecks $ iostat -x -m 1 [...] Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util xvda 0.00 0.00 0.00 0.50 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 xvdb 1291.00 0.00 3690.50 453.00 234.12 51.34 141.09 8.07 1.95 1.99 1.61 0.23 94.05 xvdc 1332.50 0.00 3808.00 456.00 236.65 51.31 138.31 8.38 1.96 2.01 1.56 0.22 94.70 xvdd 1308.50 0.00 3704.50 449.50 233.14 50.78 139.98 6.83 1.65 1.69 1.27 0.23 93.95 xvde 1285.50 0.00 3632.50 454.50 229.48 51.53 140.81 7.74 1.89 1.94 1.53 0.23 93.40 xvdf 1281.50 0.00 3524.00 459.50 227.91 51.95 143.88 8.08 2.04 2.06 1.86 0.23 93.25 xvdg 1306.00 0.00 3576.50 453.50 231.10 51.70 143.71 7.58 1.89 1.92 1.64 0.23 93.50 xvdh 1302.00 0.00 3566.50 451.50 231.58 51.53 144.30 6.77 1.67 1.72 1.28 0.23 92.90 xvdi 1279.00 0.00 3627.00 448.00 235.86 51.11 144.22 7.92 1.95 1.97 1.73 0.23 93.45 md0 0.00 0.00 34234.50 3570.50 1860.41 411.33 123.07 0.00 0.00 0.00 0.00 0.00 0.00
  • 9. Not all issues are database issues • Client can introduce latencies as well ▪most notably, cassandra-stress will do. ▪JHiccup - client instrumentation for client-side hiccups.
  • 10. Our Agenda for today • Basics of Monitoring Scylla • Monitoring Infrastructure • Understanding Scylla metrics
  • 11. collectd metrics Prometheus Scylla / Agent Browserip:9103 Grafana ip:65534 ip:3000 ip:9103 ip:9103 HTTP Scylla / Agent Scylla / Agent
  • 12. Scylla & Agent Scylla Monitoring collectd collectd_exporter ip:65534 Scylla metrics scyllatop Scylla ip:25826 Scylla + OS metrics Ip:9103 HTTP
  • 13. How to use those metrics? • your own infrastructure ▪Whatever works for collectd, works for Scylla • scyllatop • prometheus + grafana
  • 14. scyllatop • easy to use, top-like interface. • very high resolution • good for ad-hoc probing ▪not very good for cluster-wide view or time progression
  • 15. List of metrics available • RESTful API: $ curl http://scylla-server:10000/collectd | json_reformat [ … { "enable": true, "id": { "plugin_instance": "#cpu", "type_instance": "load", "type": "gauge", "plugin": "reactor" } }, • scyllatop -l: ▪ includes host metrics # scylla running with --smp 1 $ scyllatop -l | wc -l 145
  • 16. prometheus + grafana •easy cluster-wide view, with pre-configured dashboards •easy system progression view •easy metric correlation •adding composite metrics •harder to setup, -but we try to make it easier, docker images, pre-loaded dashboards. -https://github.com/scylladb/scylla-grafana-monitoring
  • 17. prometheus + grafana • prometheus/grafana imgs, pre-loaded with dashboards: ▪https://github.com/scylladb/scylla-grafana-monitoring
  • 19. Our Agenda for today • Basics of Monitoring Scylla • Monitoring Infrastructure • Understanding Scylla metrics
  • 20. Naming of metrics Collectd naming scheme: {host}/{plugin}-{plugin instance}/{type}-{type instance} • plugin - name of the component • plugin instance - instance of the component • type - type of metric’s value • type instance - name of the metric of given component
  • 21. Naming of metrics Collectd naming scheme: {host}/{plugin}-{plugin instance}/{type}-{type instance} E.g.: node1/reactor-0/gauge-load
  • 22. Naming of metrics • plugin instances usually correspond to shard numbers. ▪ Example --smp 3: node1/reactor-0/gauge-load node1/reactor-1/gauge-load node1/reactor-2/gauge-load
  • 23. • GAUGE - value as is ▪ collectd types: gauge, bytes, pending_operations, ... ▪ reactor-*/gauge-load, lsa-*/bytes-total_space, ... • DERIVE - change over time ▪ collectd types: total_operations, derive, ... ▪ database-*/total_operations-total_reads Data source types
  • 24. Naming of metrics When exported to prometheus: collectd_{plugin}_{type} { {plugin}={plugin instance},type={type instance},instance={host} } E.g.: collectd_reactor_gauge{reactor=”0”,type=”load”,instance=”node1”}
  • 25. Metric plugins coordinator replica transport (CQL server) thrift storage_proxy database memtables cachecommitlog seastar framework reactor memory io_queue lsa smp compaction_manager
  • 26. • transport-*/total_operations-requests_served ▪ counts incoming CQL requests ▪ coordinator-side • database-*/total_operations-total_{reads|writes} ▪ counts incoming replica read/write requests • both are DERIVE-typed Throughput metrics
  • 27. • storage_proxy-*/total_operations-{read|write} timeouts ▪ count number of timeouted read and write requests ▪ coordinator-side • check coordinator logs • check replica logs • check for overload Error metrics
  • 28. Best reflected by reactor-*/gauge-load • percentage of time Scylla was executing tasks ▪ excludes busy polling, execution of on-idle tasks, sleeping ▪ Updated every second and reflects past 5 seconds. • 100 means the server is CPU-bound CPU Utilization
  • 29. Memory utilization metrics total memory standard allocations (non-LSA) LSA free memtables (dirty) cache
  • 30. Memory utilization metrics total memory standard allocations (non-LSA) LSA free memtables (dirty) cache lsa-*/bytes-non_lsa_used_space memory-*/memory-total_memory lsa-*/bytes-total_space memory-*/bytes-dirty cache-*/bytes-total
  • 31. Memory utilization metrics • Useful for detecting: ▪cache getting shrunk down due to pressure from std allocations ▪requests blocking -only 50 % of memory is allowed to be dirty. -Requests will block if we can’t clean fast enough.
  • 33. Cache metrics • cache-*/total_operations-*: ▪ hits, misses - entries found/not found in cache during read ▪ merges - entries updated during memtable flush ▪ insertions - entries added (on miss, memtable flush) ▪ evictions - entries removed due to memory pressure ▪ removals - entries invalidated (ring ownership change) • currently entries are per-partition
  • 35. I/O Queue metrics • Scylla uses the I/O Queue to provide fairness among: ▪ commitlog, memtables, query, etc io_queue-*/derive-{class name} bandwidth (bps) io_queue-*/delay-{class name} queue latency, not counting disk access (s) io_queue-*/queue_length-{class name} # requests waiting io_queue-*/total_operations-{class name} IOPS