Metrics Simplified
      Mark Lin
  mlin@admob.com
why?

"If you can not measure it, you can not improve it"
                                    -Lord Kelvin


99.999% ("five nines") = 5.26 minutes of downtime per year
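
The five-nines figure can be checked with a little arithmetic. The sketch below assumes a 365.25-day year; the function name is illustrative and not from the slides.

```python
# Minutes in a year, assuming 365.25 days (an assumption, not from the slides).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent):
    """Return the downtime budget, in minutes per year, for an availability target."""
    return (1 - availability_percent / 100.0) * MINUTES_PER_YEAR


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_minutes_per_year(target):.2f} minutes/year")
```

Each extra nine cuts the yearly downtime budget by a factor of ten.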
previously ...

  Sending/Collecting is complicated.

  Single collection server.

  Tedious to configure new metric collection or creation.

  Calculating metric from file is expensive.
bottlenecks ...

  Poll based collection server

  Not easy (!fun) to configure new metric collection or
  creation.
     =grunt work for ops-engineer

                                           uhhhh....
enabling technology

  Graphite

  RabbitMQ

  Graphite Local Proxy

  RockSteady ( w/ Esper )
path to graph
1min.juicer.output.apple.sc1.jcr1 20 1276822626



echo "1min.juicer.output.apple.sc1.jcr1 20 1276822626" | nc localhost 3400
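
The same "path value timestamp" line can be sent from application code rather than from the shell. This is a minimal sketch, assuming a listener on localhost:3400 as in the command above; the function names are hypothetical and the send call is left commented out so the snippet runs without a listener.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one 'path value timestamp' line, newline-terminated like echo's output."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="localhost", port=3400, timeout=2.0):
    """Open a TCP connection, write the line, and close (what `nc` does above)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


line = format_metric("1min.juicer.output.apple.sc1.jcr1", 20, 1276822626)
print(line, end="")
# send_metric(line)  # uncomment when something is listening on localhost:3400
```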
graph
graph = post-event forensics
Rocksteady, metric as event

1min.juicer.common.version.sc1.jcr1 100 1276822626

INSERT INTO Deploy
SELECT * FROM Metric(name='common.revision')
    MATCH_RECOGNIZE (
        partition by colo, hostname
        measures A.value as revision, A.colo as colo,
                 A.hostname as hostname, A.app as app,
                 A.timestamp as timestamp
        pattern (A)
        define
            A as A.value > prev(A.value))
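
The logic of the query above can be sketched in plain Python: split the metric line into fields, then flag an event whenever the value rises above the previous value seen for the same (colo, hostname). This illustrates the idea and is not Rocksteady's or Esper's code. The field layout `<retention>.<app>.<name>.<colo>.<hostname>` is an assumption inferred from the sample line, and the example filters on `common.version` because that is the name the sample line carries.

```python
from collections import namedtuple

Metric = namedtuple("Metric", "retention app name colo hostname value timestamp")


def parse_metric(line):
    """Split 'path value timestamp' into fields.

    Assumed path layout: <retention>.<app>.<name...>.<colo>.<hostname>
    """
    path, value, timestamp = line.split()
    parts = path.split(".")
    return Metric(parts[0], parts[1], ".".join(parts[2:-2]),
                  parts[-2], parts[-1], float(value), int(timestamp))


def detect_deploys(metrics, name):
    """Return metrics whose value exceeds the previous one for the same colo/host."""
    last_seen = {}  # (colo, hostname) -> previous value
    deploys = []
    for m in metrics:
        if m.name != name:
            continue
        key = (m.colo, m.hostname)
        previous = last_seen.get(key)
        # The first event per partition has no previous value, so it never matches.
        if previous is not None and m.value > previous:
            deploys.append(m)
        last_seen[key] = m.value
    return deploys


lines = [
    "1min.juicer.common.version.sc1.jcr1 100 1276822626",
    "1min.juicer.common.version.sc1.jcr1 100 1276822686",
    "1min.juicer.common.version.sc1.jcr1 101 1276822746",
]
deploys = detect_deploys([parse_metric(l) for l in lines], "common.version")
print(deploys)
```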
auto threshold, prediction
correlation

  Deployment-related problems.

  Capture sets of metrics when important ones cross a
  threshold.

  Determine dependencies, such as CPU to requests per second
  or response time.
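
One simple way to quantify a dependency between two metrics is a Pearson correlation coefficient over aligned samples. The slides do not say which method was used, so treat this as an illustrative sketch; the sample numbers are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Made-up samples: CPU percent and requests per second at the same timestamps.
cpu = [20, 35, 50, 65, 80]
rps = [100, 180, 240, 330, 400]
r = pearson(cpu, rps)
print(f"cpu vs requests/sec: r = {r:.3f}")
```

A value near 1 suggests the two metrics move together; it does not by itself show which one drives the other.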
revelation
beyond simple metric

  Timing info per request.

  Actual time spent in each component in an application.
  Map out dependency, find exact area of problem.
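
Per-request timing can be collected by wrapping each component call in a timer. This is a minimal sketch of the idea; the class and the component names ("db", "render") are hypothetical, and the sleeps stand in for real work.

```python
import time
from contextlib import contextmanager


class RequestTimer:
    """Accumulates wall-clock time spent in each named component of one request."""

    def __init__(self):
        self.timings = {}  # component name -> seconds

    @contextmanager
    def component(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.timings[name] = self.timings.get(name, 0.0) + elapsed

    def slowest(self):
        """Name of the component that took the most time."""
        return max(self.timings, key=self.timings.get)


timer = RequestTimer()
with timer.component("db"):
    time.sleep(0.05)   # stand-in for a database call
with timer.component("render"):
    time.sleep(0.001)  # stand-in for template rendering
print(timer.timings, "slowest:", timer.slowest())
```

The per-component numbers could then be sent as ordinary metrics, one path per component.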
what we learned?

1. Make metric sending simple.
2. A nice UI helps make sense of the data.
3. Real-time processing of metrics rocks.
