Towards Detecting Performance Anti-patterns Using Classification Techniques
 

This is the talk I gave on behalf of my Ph.D. student at the Machine Learning and Information Retrieval (MALIR) for Software Evolution (MALIR-SE) workshop at ASE 2013.


Presentation Transcript

    • Towards Detecting Performance Anti-Patterns Using Classification Techniques
      Manjula Peiris and James H. Hill
      1st International Workshop on Machine Learning & Information Retrieval for Software Evolution
      Nov 11, 2013, Silicon Valley, California, USA.
    • Motivation: Software Performance Anti-Patterns
      • Common design choices that have negative consequences
      • Focus solely on the performance of the system (e.g., throughput, response time)
      • Suggest solutions and refactorings (e.g., One Lane Bridge, Excessive Dynamic Allocations, God Class)
    • One Lane Bridge (Smith et al.)
      Only one or a few processes/threads are allowed to execute concurrently.
      Reasons for the anti-pattern:
      • Lack of concurrency
      • Limited number of resources
      • Not utilizing available resources
      Consequences:
      • Low system throughput
      • High latency
      • High response time
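The bottleneck is easy to reproduce. Below is a minimal sketch (not from the talk) in which every request must pass through a single shared lock, so no matter how many worker threads exist, only one makes progress at a time:

```python
import threading
import time

bridge = threading.Lock()       # the "one lane bridge": a single shared lock
counter_lock = threading.Lock() # protects the concurrency counters below
active = 0
max_concurrent = 0

def handle_request():
    """Simulated request handler that serializes on the bridge lock."""
    global active, max_concurrent
    with bridge:                          # every request queues up here
        with counter_lock:
            active += 1
            max_concurrent = max(max_concurrent, active)
        time.sleep(0.005)                 # simulated work in the critical section
        with counter_lock:
            active -= 1

threads = [threading.Thread(target=handle_request) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(max_concurrent)  # -> 1: throughput is limited by the single lock
```

With 20 threads the observed concurrency never exceeds 1, which is exactly the low-throughput/high-latency signature the slide describes.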
    • Excessive Dynamic Allocations (Smith et al.)
      Reason for the anti-pattern:
      • Objects are created when they are first accessed and then destroyed when no longer needed.
      Consequences:
      • The cost of dynamic allocations: cost = N × (Sc + Sd), where N is the number of calls and Sc, Sd are the costs of an object creation and deletion.
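The slide's cost formula can be written directly as a function; the numbers below are purely illustrative:

```python
def allocation_cost(n_calls, s_create, s_delete):
    """Total cost of excessive dynamic allocation per the slide's formula:
    cost = N * (Sc + Sd), where N is the number of calls and
    Sc, Sd are the per-object creation and deletion costs."""
    return n_calls * (s_create + s_delete)

# e.g., 10 calls with a creation cost of 2 and a deletion cost of 1 per object:
print(allocation_cost(10, 2, 1))  # -> 30
```

Because the cost scales linearly with N, per-request allocation becomes dominant under high request rates, which motivates the memory-pool configuration used in the experiments later.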
    • Why Automatic Detection of Performance Anti-Patterns?
      • It is difficult to manually analyze large amounts of performance data.
      • Make sense of large amounts of performance data rather than just showing it to users.
      • Gives system designers intuition about where refactoring is required.
    • Current Approaches for Anti-Pattern Detection
      Approaches based on software design artifacts:
      1. Annotate the software design
      2. Run simulations and gather performance data
      3. Apply rules
      Approaches based on runtime data:
      • Architecture dependent (e.g., J2EE anti-patterns)
      • Require architecture-specific deployment details
    • Non-intrusive Performance Anti-Pattern Detector (NiPAD)
      • Collect system performance metrics:
        • software execution with a performance anti-pattern (class 0)
        • software execution without the performance anti-pattern (class 1)
      • Normalize the data
      • Train a classifier: Naïve Bayes, Logistic Regression, FLD, SVM (linear), SVM (RBF)
      • Predict for new performance data whose class label is unknown
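The pipeline above can be sketched end to end in a few lines. This is not the authors' code: a toy nearest-centroid classifier stands in for the Naïve Bayes / logistic regression / FLD / SVM models used in the talk, and the metric rows are made up for illustration.

```python
def normalize(rows):
    """Min-max scale each metric column to [0, 1]."""
    cols = list(zip(*rows))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(r, lo, hi)] for r in rows]

def fit_centroids(rows, labels):
    """Compute one mean metric vector per class label."""
    cents = {}
    for lab in set(labels):
        members = [r for r, l in zip(rows, labels) if l == lab]
        cents[lab] = [sum(c) / len(members) for c in zip(*members)]
    return cents

def predict(cents, row):
    """Assign the class whose centroid is nearest (squared Euclidean)."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(cents, key=lambda lab: dist(cents[lab], row))

# Hypothetical rows of [cpu_user, cpu_idle, free_mem] metrics:
# class 0 = anti-pattern present, class 1 = anti-pattern absent.
train = [[90, 5, 100], [85, 10, 120], [30, 60, 500], [25, 70, 480]]
labels = [0, 0, 1, 1]
rows = normalize(train + [[88, 7, 110]])   # last row: unseen observation
model = fit_centroids(rows[:4], labels)
print(predict(model, rows[-1]))  # -> 0 (classified as anti-pattern present)
```

The structure mirrors the slide exactly: labeled runs in, normalization, training, then prediction on data whose class label is unknown.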
    • System-level Metrics
      • CPU Idle Time: the time the CPU is idle, not doing any work
      • CPU User Time: CPU utilization for user applications
      • CPU System Time: CPU utilization for system-level programs
      • Free Memory: total free memory when invoking the application
      • Cached Memory: total cached memory available when invoking the application
      • Total Commits: total number of commits
      Metrics are collected in 1-second epochs.
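On Linux, the CPU metrics in this table can be derived non-intrusively from `/proc/stat`. A sketch, using a made-up sample line rather than a live read (the real field layout is: user, nice, system, idle, iowait, irq, softirq, ...):

```python
# Hypothetical /proc/stat "cpu" line; values are jiffies spent in each state.
sample = "cpu  4705 150 1120 16250 520 22 116 0 0 0"

fields = [int(x) for x in sample.split()[1:]]
user, nice, system, idle = fields[0], fields[1], fields[2], fields[3]
total = sum(fields)

# Express each state as a percentage of all jiffies, as in the metrics table.
cpu_idle_pct = round(100 * idle / total, 1)
cpu_user_pct = round(100 * user / total, 1)
cpu_system_pct = round(100 * system / total, 1)
print(cpu_idle_pct, cpu_user_pct, cpu_system_pct)  # -> 71.0 20.6 4.9
```

Sampling this once per second and differencing consecutive readings gives the per-epoch values the classifier consumes.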
    • CPU Times with One Lane Bridge (chart)
    • CPU Times without One Lane Bridge (chart)
    • Experiments with Apache Web Server
      • Emulating the One Lane Bridge anti-pattern
      • Use Apache Benchmark to generate a load
      • Server configurations:
        • One Lane Bridge: 300 concurrent clients sending 1 million requests; server has 150 threads
        • Without One Lane Bridge: 300 concurrent clients sending 1 million requests; server has 300 threads
      • 200 records for training, 400 records for testing
    • Classification Results for One Lane Bridge (bar chart: accuracy per classifier — Naïve Bayes, Logistic Regression, FLD, SVM (Linear), SVM (RBF))
    • Classification Results for One Lane Bridge with Noise (bar chart: accuracy per classifier — Naïve Bayes, Logistic Regression, FLD, SVM (Linear), SVM (RBF))
    • Experiments with Apache Web Server
      • Emulating the Excessive Dynamic Allocations anti-pattern
      • Server configurations:
        • Excessive Dynamic Allocations: 300 concurrent clients sending 1 million requests; server has 300 threads; memory pool size of 1 KB
        • Without Excessive Dynamic Allocations: 300 concurrent clients sending 1 million requests; server has 300 threads; memory pool size of 1 MB
      • 200 records for training, 400 records for testing
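The "memory pool" configuration contrasts per-request allocation with object reuse. A toy free-list pool (an illustrative sketch, not the Apache implementation) shows the idea: when the pool is large enough, requests stop triggering fresh allocations entirely.

```python
class Pool:
    """Toy free-list object pool: pre-allocates buffers and reuses them."""

    def __init__(self, size, buf_bytes=64):
        self.free = [bytearray(buf_bytes) for _ in range(size)]
        self.buf_bytes = buf_bytes
        self.allocations = 0  # counts pool misses (fresh allocations)

    def acquire(self):
        if self.free:
            return self.free.pop()        # reuse a pooled buffer
        self.allocations += 1             # pool exhausted: allocate fresh
        return bytearray(self.buf_bytes)

    def release(self, obj):
        self.free.append(obj)             # return the buffer for reuse

pool = Pool(size=4)
for _ in range(1000):   # 1000 "requests", each acquires and releases a buffer
    buf = pool.acquire()
    pool.release(buf)

print(pool.allocations)  # -> 0: every request reused a pooled buffer
```

Shrinking the pool below the peak number of concurrently held buffers forces fresh allocations per request, which is the behavior the 1 KB configuration emulates.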
    • Classification Results for Excessive Dynamic Allocations (bar chart: accuracy per classifier — Naïve Bayes, Logistic Regression, FLD, SVM (Linear), SVM (RBF))
      Reason for poor classification performance:
      • Emery et al. show that custom memory allocation techniques do not have much advantage.
    • Cost Analysis for One Lane Bridge
      Classifier      Sensitivity  Specificity  Precision  Accuracy
      Logistic        0.95         0.62         0.53       0.76
      FLD             0.95         0.66         0.56       0.7
      Naïve Bayes     0.92         0.28         0.38       0.5
      SVM (Linear)    0.98         0.92         0.84       0.94
      SVM (RBF)       0.96         0.7          0.61       0.75
      • The positive class is the one that does not have the anti-pattern
      • These situations are predicted more accurately
      • The cost of misclassification depends on the nature of the software and the software development cost
      • This technique will eliminate unnecessary software testing
      • Not good for real-time software systems
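The table's four columns all derive from a 2×2 confusion matrix. A quick reference implementation (the counts in the example are hypothetical, not taken from the experiments):

```python
def metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true-positive rate (recall)
        "specificity": tn / (tn + fp),          # true-negative rate
        "precision":   tp / (tp + fp),
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts: 90 true positives, 10 false positives,
# 80 true negatives, 20 false negatives.
print(metrics(tp=90, fp=10, tn=80, fn=20))
```

High sensitivity with low specificity, as in the Naïve Bayes row, means the classifier rarely misses anti-pattern-free runs but frequently mislabels runs that do contain the anti-pattern.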
    • Concluding Remarks
      Limitations:
      • System-level performance metrics may not show enough variation (e.g., Excessive Dynamic Allocations)
      • Bad performance may have other causes (e.g., configuration errors, bad user inputs)
      Future work:
      • Currently including the behavior of the software application in the analysis
      • Applying this technique to other software applications
    • Questions