Identifying Performance Deviations
in Thread Pools
Mark D. Syer, Bram Adams and Ahmed E. Hassan
(Software Analysis and Intelligence Lab (SAIL) School of Computing, Queen's University, Canada)
1
Outline
● Introduction
● Thread Pools
● Methodology
● Case study
● Results
● Evaluation
● Conclusion
2
Introduction
● Ultra-Large-Scale (ULS) systems need high
concurrency and speed.
● Analysing system performance is difficult
○ Time-consuming
○ Significant manual review of the data and logs
○ Lack of tool support; heavy instrumentation needed
○ Data gathered by hardware sensors is hard to interpret
3
Main contributions
To analyze performance deviations in systems
designed using thread pools:
1. A top-down methodology for identifying and ranking
the most deviating thread behaviour
2. A qualitative and quantitative evaluation of the proposed
methodology on a large-scale industrial ULS system
4
Thread Pools
● Advantages
○ Avoid thread create/destroy overhead
○ System becomes more responsive
● Difficulties
○ Too many threads → resource thrashing →
performance degradation
○ Hard to configure and test
○ Synchronization errors → idle threads/deadlocks
○ Thread leakage
5
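As a minimal sketch of the pattern (not taken from the slides), the following uses Python's standard `concurrent.futures.ThreadPoolExecutor`. The `handle` function, the pool size of 4 and the 20 work items are illustrative assumptions. It shows a fixed set of pooled threads being reused for many short-lived work items instead of one thread being created per item.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def handle(item):
    # Hypothetical work item: return the result plus the name of the
    # pooled thread that processed it.
    return item * item, threading.current_thread().name


# Four pooled threads serve all 20 work items; no thread is created per item.
with ThreadPoolExecutor(max_workers=4, thread_name_prefix="pool") as pool:
    results = list(pool.map(handle, range(20)))

squares = [value for value, _ in results]
workers = {name for _, name in results}
print(squares[:5])   # [0, 1, 4, 9, 16]
print(len(workers))  # at most 4 distinct pooled threads
```

Leaving the `with` block shuts the pool down and waits for queued items, which is one simple guard against the thread leakage listed above.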
Motivational Example
● Analyse the system under a 5x larger workload
● Machine-level macro-threads for each thread
pool
● Identify majority and deviating behaviours
using clusters in a dendrogram
6
Motivational Example (cont.)
7
Methodology
8
Methodology (cont.)
A. Performance Data
a. Resource usage metrics of the pooled resources
e.g. CPU, memory, number of open files
b. Resource metric considerations: accuracy, overhead of
performance monitoring, data redundancy
9
Methodology (cont.)
B. Metric Abstraction
a. Group threads into higher-level (macro) abstractions
by space or time.
e.g. in a cluster of machines, all pooled threads
executing on one node are aggregated into one macro-thread.
b. Identify the deviations at the higher level
c. Repeat the methodology for the deviations at the lower level
10
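The grouping step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the sample tuples, the two metrics, and the choice to sum the readings of all threads on a machine at each timestamp are hypothetical.

```python
from collections import defaultdict

# Hypothetical samples: (machine, thread, timestamp_s, cpu_percent, memory_mb)
samples = [
    ("m1", "t1", 0, 10.0, 100.0),
    ("m1", "t2", 0, 20.0, 150.0),
    ("m2", "t3", 0, 5.0, 80.0),
    ("m1", "t1", 7, 12.0, 110.0),
    ("m2", "t3", 7, 6.0, 90.0),
]


def to_macro_threads(rows):
    """Aggregate per-thread samples into one macro-thread per machine.

    Assumption: readings of all threads on a machine are summed per timestamp.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for machine, _thread, ts, cpu, mem in rows:
        totals[(machine, ts)][0] += cpu
        totals[(machine, ts)][1] += mem
    macro = defaultdict(list)
    for (machine, ts), (cpu, mem) in sorted(totals.items()):
        macro[machine].append((ts, cpu, mem))
    return dict(macro)


macro = to_macro_threads(samples)
print(macro["m1"])  # [(0, 30.0, 250.0), (7, 12.0, 110.0)]
```

Grouping by time instead of space would use the same shape of code with a time window as the key in place of the machine name.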
Methodology (cont.)
C. Distance Calculation Between Covariance
Matrices
Gives the level of dissimilarity, or distance, between two
abstractions
11
Remember your Maths?
Methodology (cont.)
Covariance matrix
● Diagonal entry (i, i): variance of metric i
● Off-diagonal entry (n, 1): covariance between X(n) and X(1),
i.e. how much the variables X(n) and X(1) change together
Ref: Wikipedia (Dec. 2014), Covariance matrix. link: http://en.wikipedia.org/wiki/Covariance_matrix
12
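A small worked example may help here. The sketch below computes the sample covariance matrix (dividing by n - 1) in plain Python. The function name and the readings are hypothetical, and a numerical library would normally do this job.

```python
def covariance_matrix(samples):
    """Sample covariance matrix; each row of `samples` is one set of metric readings."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(row[d] for row in samples) / n for d in range(dims)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in samples) / (n - 1)
            for j in range(dims)
        ]
        for i in range(dims)
    ]


# Hypothetical readings for one thread: (cpu_percent, memory_mb)
readings = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
cov = covariance_matrix(readings)
print(cov)  # [[1.0, 2.0], [2.0, 4.0]]
```

The diagonal holds the variance of each metric and the off-diagonal entries hold the covariances, so the matrix is symmetric. Its size depends only on the number of metrics, not on the number of samples.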
Methodology (cont.)
e.g.
● Metrics XC, XM, XH (CPU, memory, number of open files) for
threads A and B
● A is sampled 100 times, B 1,000 times
● Calculate the distance metric (Förstner & Moonen method):
a one-dimensional distance for each pair of covariance
matrices
● Distance value indicates dissimilarity: the smaller the distance, the more similar the behaviour
13
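The Förstner and Moonen distance between two covariance matrices A and B is commonly written as the square root of the sum of squared logarithms of the generalized eigenvalues of (A, B). The sketch below assumes that definition and uses only the standard library so that it runs anywhere. The helper names and the example matrices are hypothetical, and the Cholesky and Jacobi routines stand in for what a numerical library would provide.

```python
import math


def cholesky(m):
    """Lower-triangular L with L * L^T == m (m symmetric positive definite)."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def solve_lower(low, rhs):
    """Solve low * Y = rhs for the matrix Y by forward substitution."""
    n, cols = len(low), len(rhs[0])
    out = [[0.0] * cols for _ in range(n)]
    for c in range(cols):
        for i in range(n):
            s = sum(low[i][k] * out[k][c] for k in range(i))
            out[i][c] = (rhs[i][c] - s) / low[i][i]
    return out


def symmetric_eigenvalues(m, sweeps=50, tol=1e-24):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(m)
    a = [row[:] for row in m]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q: A <- A * J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q: A <- J^T * A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [a[i][i] for i in range(n)]


def forstner_moonen_distance(cov_a, cov_b):
    """sqrt(sum(ln(lambda_i)^2)) over the generalized eigenvalues of (cov_a, cov_b)."""
    low = cholesky(cov_b)
    half = solve_lower(low, cov_a)                  # L^-1 * A
    transposed = [list(col) for col in zip(*half)]  # A * L^-T, since A is symmetric
    whitened = solve_lower(low, transposed)         # L^-1 * A * L^-T
    return math.sqrt(sum(math.log(v) ** 2 for v in symmetric_eigenvalues(whitened)))


# Hypothetical 2x2 covariance matrices for two threads
cov_a = [[4.0, 1.0], [1.0, 3.0]]
cov_b = [[2.0, 0.5], [0.5, 1.0]]
print(forstner_moonen_distance(cov_a, cov_b))
print(forstner_moonen_distance(cov_a, cov_a))  # approximately 0 for identical matrices
```

Because only the covariance matrices are compared, thread A with 100 samples and thread B with 1,000 samples can be compared directly. Both matrices must be positive definite, which a metric with zero variance would violate.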
Methodology (cont.)
D. Hierarchical Clustering
● Starts with each abstraction in its own cluster and
proceeds to find and merge the closest pair of clusters
● Uses Ward's method of clustering
Ref: Large Scale Gene Expression Data Analysis I. link: http://compbio.uthsc.edu/microarray/lecture1.htm
14
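The merge loop can be illustrated with a from-scratch sketch. It is not the authors' tooling (the speaker notes mention the pvclust R package). It applies Ward's criterion through the Lance-Williams update on squared distances, which assumes the pairwise distances behave like Euclidean distances. The function name and the 4x4 distance matrix are hypothetical.

```python
import math


def ward_linkage(dist):
    """Agglomerative clustering with Ward's criterion (Lance-Williams update).

    dist: symmetric matrix of pairwise distances between abstractions.
    Returns a list of merges (cluster_a, cluster_b, height), with each
    cluster given as a frozenset of item indices.
    """
    clusters = {i: frozenset([i]) for i in range(len(dist))}
    # Squared distances between active clusters, keyed by (smaller_id, larger_id).
    d2 = {(i, j): dist[i][j] ** 2 for i in clusters for j in clusters if i < j}
    merges = []
    next_id = len(dist)
    while len(clusters) > 1:
        (i, j), best = min(d2.items(), key=lambda kv: kv[1])  # closest pair
        ni, nj = len(clusters[i]), len(clusters[j])
        new = {}
        for k in clusters:
            if k in (i, j):
                continue
            nk = len(clusters[k])
            dik = d2[tuple(sorted((i, k)))]
            djk = d2[tuple(sorted((j, k)))]
            new[(k, next_id)] = (
                (ni + nk) * dik + (nj + nk) * djk - nk * best
            ) / (ni + nj + nk)
        merges.append((clusters[i], clusters[j], math.sqrt(best)))
        merged = clusters[i] | clusters[j]
        del clusters[i], clusters[j]
        d2 = {pair: v for pair, v in d2.items() if i not in pair and j not in pair}
        d2.update(new)
        clusters[next_id] = merged
        next_id += 1
    return merges


# Hypothetical distances: items 0-2 behave alike, item 3 deviates.
dist = [
    [0.0, 1.0, 1.0, 10.0],
    [1.0, 0.0, 1.0, 10.0],
    [1.0, 1.0, 0.0, 10.0],
    [10.0, 10.0, 10.0, 0.0],
]
for left, right, height in ward_linkage(dist):
    print(sorted(left), sorted(right), round(height, 3))
```

In this example the three similar items merge at low heights and the deviating item joins last at a much greater height. That late, high merge is the pattern a dendrogram makes visible.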
Methodology (cont.)
E. Cluster Visualization
15
Methodology (cont.)
F. Ranking Clusters
● Recursive
● Top to bottom
16
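The slides do not spell out the ranking algorithm, so the following is only a hypothetical heuristic showing what a recursive, top-to-bottom pass over a dendrogram can look like. It is not the authors' algorithm. The nested-tuple dendrogram format, the machine names and the rule of treating the smaller branch of each split as the deviating one are all assumptions.

```python
def leaves(node):
    """All leaf names under a dendrogram node."""
    if isinstance(node, str):
        return [node]
    _height, left, right = node
    return leaves(left) + leaves(right)


def rank_deviations(node, found=None):
    """Walk the dendrogram from the root down.

    At each split the smaller branch is recorded as a deviation together with
    the split height, then the walk continues into the larger branch.
    Assumption: this is an illustrative heuristic only.
    """
    found = [] if found is None else found
    if isinstance(node, str):
        return found
    height, left, right = node
    minority, majority = sorted((left, right), key=lambda child: len(leaves(child)))
    found.append((height, leaves(minority)))
    return rank_deviations(majority, found)


# Hypothetical dendrogram: (merge_height, left_branch, right_branch)
dendrogram = (12.2, "machine4", (1.0, "machine3", (0.8, "machine1", "machine2")))
for height, members in rank_deviations(dendrogram):
    print(height, members)
```

Merge heights shrink from the root downwards, so the top-to-bottom order already lists the most deviating behaviour first.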
Algorithm
17
Case Study
● Performance data
○ CPU, Virtual Bytes, Private Bytes, Handles,
MicroThreads
● Metric Abstraction
● Hierarchical Clustering and Ranking
18
Results
19
Results (cont.)
20
Quantitative Evaluation
● Validating the ability of the methodology to
identify and rank deviations
● Identified important deviations
● Injecting synthetic deviations into the
performance data
● Verify the methodology (precision/recall)
21
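Precision and recall for this kind of check can be computed as in the sketch below. The thread identifiers are hypothetical, and returning 0.0 when a set is empty is a convention chosen for the example.

```python
def precision_recall(flagged, injected):
    """Precision and recall of flagged deviations against injected ones."""
    flagged, injected = set(flagged), set(injected)
    true_positives = len(flagged & injected)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(injected) if injected else 0.0
    return precision, recall


# Hypothetical run: four deviations injected, three threads flagged.
injected = {"t3", "t7", "t8", "t11"}
flagged = {"t3", "t7", "t9"}
precision, recall = precision_recall(flagged, injected)
print(precision, recall)
```

Here two of the three flagged threads were really injected (precision 2/3) and two of the four injected deviations were found (recall 0.5).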
Quantitative Evaluation (cont.)
22
Conclusion
● A methodology for automatically identifying
deviating behaviour in ULS systems
● Ranks the most deviating thread behaviour
at different abstractions (wave, thread)
● The methodology can also be used for
other applications
23
Thank You!
24

Editor's Notes

  1. 10 sec
  2. 15 sec
  3. 30 sec. ULS: data, connections, hardware; 300-400 items are sold per second, 13 million transactions per second. Needs: near-perfect up-time, hundreds or even thousands of concurrent connections, speed. Performance analysis: determine maximum operating capacity, validate non-functional requirements, uncover bottlenecks; takes hours to days. Tools: JProfiler, j-test. CPU/memory resource metrics are very hard to interpret.
  4. 30 sec. Performance deviations in ULS systems that are designed using the thread pool architectural pattern. The methodology can be applied even with very little understanding of the architecture.
  5. 4 min. Thread pool: a design pattern, a technique for designing scalable multithreaded and distributed systems. Explain the diagram. Useful in server applications where work items are typically short-lived and the number of incoming work items is large. Advantages: enables incoming work items to be assigned and executed quickly. Disadvantages: resource thrashing, where the resources expended to manage the threads increase at the expense of the resources available to process the work items. Thread synchronization issues: deadlock, where all threads in the pool are waiting for a work item on the queue to be processed, but there are no "available" threads to process that work item. Thread leakage: when a thread finishes processing its work item but fails to return to the pool.
  6. 1 min. Test a system with a newly integrated package tracking system. Fast, top-down method: aggregate each machine's thread data into one macro-thread at the machine level.
  7. 1 min. Iterative, top-down methodology. Notice two groups of behaviour: on the threads of machine4, one thread for cluster. Competition for access to the database between machine4 and the other machines is affecting the access time.
  8. 30
  9. 30 sec resource usage metrics of
  10. 1 min. Top-down: apply the method at high-level abstractions.
  11. Sample: http://stattrek.com/matrix-algebra/covariance-matrix.aspx Every covariance matrix is symmetric. Covariance is a measure of how much two random variables change together. If the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds for the smaller values, i.e. the variables tend to show similar behaviour, the covariance is positive. The expected value of a random variable is, intuitively, the long-run average value of repetitions of the experiment it represents. For example, the expected value of a die roll is 3.5.
  12. 1.30
  13. 1min
  14. 1 min. Cluster visualization: once the hierarchical clustering has been calculated, a detailed cluster visualization helps to manually identify any deviating behaviour in a particular abstraction. Clusters of abstractions are visualized using hierarchical cluster dendrograms. These are binary-tree-like diagrams that show each stage of the clustering procedure as nested clusters. The height in the dendrogram is proportional to the distance between clusters: the greater the height, the greater the deviation.
  15. 1 min. Ranking clusters: once the hierarchical cluster dendrograms have been generated, they are analyzed to identify and rank the most deviating behaviour. The authors have formalized an algorithm from their experience analyzing dendrograms. Two characteristics of the algorithm stand out: it is recursive and it works top to bottom.
  16. 2 min
  17. 2 min. Subject system: the case study is based on data from an industrial ultra-large-scale system. The system follows the thread pool architecture. The threads of the thread pool manage their own micro-threads, but these are ignored because no metrics are available for them. Performance data: the data in this case study consists of five resource usage metrics, sampled approximately every 7 seconds. CPU: percentage of CPU time in use. VirtualBytes: amount of virtual address space in use. PrivateBytes: amount of private (non-shared) memory in use. Handles: number of open file handles. MicroThreads: number of allocated micro-threads. Hierarchical clustering: the authors used pvclust, an R package for hierarchical clustering, to cluster abstractions (R is a free software environment for statistical computing and graphics). Ranking clusters: the ranking algorithm is needed.
  18. 0.5 min
  19. 1 min
  20. 1 min. This set of equations is needed to verify at the wave level; the same set of equations can be used to verify at the thread level.
  21. 1 min