Ase2013

Class Level Fault Prediction using Software Clustering

for
IEEE ASE 2013

by
Giuseppe Scanniello (1) Carmine Gravino (2) Andrian Marcus (3) Tim Menzies (4)

from
1 University of Basilicata, Italy
2 University of Salerno, Italy
3 Wayne State University, USA
4 West Virginia University, USA



  1. Class Level Fault Prediction using Software Clustering
     Giuseppe Scanniello (1), Carmine Gravino (2), Andrian Marcus (3), Tim Menzies (4)
     giuseppe.scanniello@unibas.it, gravino@unisa.it, amarcus@wayne.edu, tim@menzies.us
     1 University of Basilicata, Italy
     2 University of Salerno, Italy
     3 Wayne State University, USA
     4 West Virginia University, USA
  2. This talk = BorderFlow clustering for defect prediction
     •  Defect prediction
        –  Sort modules by odds of having defects
        –  Used to prioritize subsequent work
     •  BorderFlow
        –  Finds code clusters with
           •  High cohesion
           •  Low coupling
     •  Produces better defect predictors.
  3. Q: Why? A: Too much blah, blah
     •  Software is being written by
        –  More people
        –  For more tasks
        –  Using changing tools
        –  On ever changing platforms
     •  Any claim that X is always THE prime determiner of defects, efforts, livelocks, etc, etc, etc is …
        –  Trite simplifications of a more complex issue
  4. Local lessons > Trite global claims
     •  Cluster data
     •  Learn 1 model per cluster
        –  Context-specific solutions
        –  Better predictions,
        –  lower variance,
        –  lower false alarm,
        –  faster runtimes
        –  better explanation
        –  etc.
     •  As recommended by any number of papers
        –  Turhan:ESEj’09;
        –  Menzies:ASE’11,TSE’13;
        –  Bettenburg:MSR’12;
        –  Yang:IST’13
        –  etc, etc, etc.
  5. Related Work
     •  How to cluster?
        –  By intra-module features?
           •  Menzies:ASE’11,TSE’13; Bettenburg:MSR’12
        –  By performance deltas of models learned from different stratifications?
           •  Yang:IST’13; He:ASEj’12; He:ESEM’13
     •  And what features to use?
        –  Just static code measures:
           •  Menzies:ASE’11,TSE’13; Bettenburg:MSR’12
        –  Using design + code artifacts?
           •  Schroter:ESEM’06
        –  Using software process + product measures?
           •  Kamei:ICSM’10
        –  Using synthesized attributes using PCA or LSI?
           •  Nagappan:ICSE’06, Tan:WCRE’11
     •  Need principles for reducing option space. Hence, this paper.
     •  Premise of this paper:
        –  These How and What are related
  6. New idea
     •  Software has natural clusters
        –  Regions of high cohesion and low coupling
     •  What if we exploited that naturally occurring structure?
     •  So cluster not by intra-module features;
        –  e.g. as done by Menzies:ASE’11,TSE’13; Bettenburg:MSR’12; etc
        –  But by inter-module features
     •  Use a clusterer that understands cohesion and coupling
        –  This talk: BorderFlow clustering for defect prediction
  7. Target domain
     •  Defect prediction from static code features
     •  Easy to use:
        –  scalable feature extractors + logs of defects found
     •  Widely used:
        –  Prioritize inspections: find 20% of code with 80% of errors
           •  Ostrand:ISSTA’04; Nagappan:ICSE’06; Menzies:ASEj’10; Tosun:IAAI’10; etc, etc
     •  Useful to use:
        –  Compared to (some) samples of industrial practices…
           •  Finds more defects: Menzies:TSE’07
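The "20% of code with 80% of errors" use case on the slide above can be made concrete with a small sketch. It assumes, purely for illustration, that "20% of code" means 20% of the modules (the slide does not say whether code is counted by module or by size); the function name, the tuple layout, and the sample data are all hypothetical.

```python
def defects_in_top_fraction(modules, fraction=0.2):
    """Share of all known defects found in the highest-ranked modules.

    modules: list of (name, predicted_score, actual_defects) tuples.
    fraction: portion of the modules that will be inspected (assumed
    here to be counted per module, not per line of code).
    """
    # Sort modules by predicted odds of being defective, highest first.
    ranked = sorted(modules, key=lambda m: m[1], reverse=True)
    cutoff = max(1, round(len(ranked) * fraction))
    total = sum(m[2] for m in ranked)
    if total == 0:
        return 0.0
    found = sum(m[2] for m in ranked[:cutoff])
    return found / total


# Hypothetical data: five modules with predicted scores and defect counts.
modules = [
    ("A", 0.9, 8),
    ("B", 0.1, 0),
    ("C", 0.4, 1),
    ("D", 0.2, 0),
    ("E", 0.3, 1),
]
share = defects_in_top_fraction(modules)
```

With these made-up numbers, inspecting only the top-ranked module covers most of the recorded defects, which is the effect a good predictor is meant to deliver.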
  8. BorderFlow (Ngomo:CICLing’09)
     •  Graph G = (V, E); V are classes
     •  e(ci, cj) ∈ E if cj references ci
        –  In class instantiation, method invocation, or field access
        –  JRIPPLES: http://jripples.sourceforge.net/
     •  A cluster X is a subset of V that maximizes:
        –  F(X) = Ω(b(X), X) / Ω(b(X), n(X))
        –  b(X) = border nodes inside X
        –  n(X) = direct neighbors of b(X), outside of X
        –  Ω = number of the edges between subsets of V;
           Ω(X, Y) = Σ e(ci, cj) | ci ∈ X and cj ∈ Y
     •  Iteratively inserts nodes in n(X) till F(X) is maximized.
        –  1) Candidates: find C(X) = nodes not in X where F(X + C(X)) > F(X)
        –  2) Prune: subset Y in C(X) that maximizes Ω(Y, n(X))
        –  3) Test: if F(X + Y) >= F(X), then X = X + Y
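The definitions on the slide above can be sketched in a few lines of Python. This is one simplified reading of the three steps, not the authors' implementation: it assumes an unweighted graph stored as a symmetric set of (ci, cj) pairs, takes as candidates the neighbours whose single addition gives the highest F, and treats F as infinite when no edge leaves the cluster. All function names are hypothetical.

```python
def undirected(pairs):
    """Store each reference in both directions (an assumption of this sketch)."""
    return {(a, b) for a, b in pairs} | {(b, a) for a, b in pairs}


def omega(edges, xs, ys):
    """Ω(X, Y): number of edges e(ci, cj) with ci in X and cj in Y."""
    return sum(1 for ci, cj in edges if ci in xs and cj in ys)


def border(edges, cluster):
    """b(X): nodes inside X that have at least one edge leaving X."""
    return {ci for ci, cj in edges if ci in cluster and cj not in cluster}


def neighbours(edges, cluster):
    """n(X): nodes outside X that are directly linked to b(X)."""
    return {cj for ci, cj in edges if ci in cluster and cj not in cluster}


def border_flow(edges, cluster):
    """F(X) = Ω(b(X), X) / Ω(b(X), n(X)); infinite if nothing leaves X."""
    b = border(edges, cluster)
    outward = omega(edges, b, neighbours(edges, cluster))
    inward = omega(edges, b, cluster)
    return float("inf") if outward == 0 else inward / outward


def grow_cluster(edges, seed):
    """Grow a cluster from one seed node until F(X) stops improving."""
    cluster = {seed}
    while True:
        frontier = neighbours(edges, cluster)
        if not frontier:
            return cluster
        # 1) Candidates: neighbours whose addition yields the highest F.
        scores = {u: border_flow(edges, cluster | {u}) for u in frontier}
        best = max(scores.values())
        candidates = {u for u, s in scores.items() if s == best}
        # 2) Prune: keep candidates with the most edges into n(X).
        links = {u: omega(edges, {u}, frontier) for u in candidates}
        most = max(links.values())
        chosen = {u for u in candidates if links[u] == most}
        # 3) Test: accept the addition only if F does not decrease.
        if border_flow(edges, cluster | chosen) >= border_flow(edges, cluster):
            cluster |= chosen
        else:
            return cluster


# Hypothetical class graph: two triangles joined by the single edge c-d.
edges = undirected([
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("c", "d"),
    ("d", "e"), ("e", "f"), ("d", "f"),
])
cluster = grow_cluster(edges, "a")
```

On this toy graph, growth from `a` stops at the first triangle, because adding `d` would expose two outward edges for only one inward edge and so lower F.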
  9. Experiment: leave-one-out, Java classes, learn from clusters vs learn from all
     •  Dependent variable
        –  ClassFault
     •  Independent variables
        –  WMC: Weighted Methods per Class
        –  DIT: Depth of Inheritance Tree
        –  NOC: Number Of Children
        –  CBO: Coupling Between Object classes
        –  RFC: Response For Class
        –  LCOM: Lack of Cohesion in Methods
        –  NPM: Number of Public Methods
        –  LOC: Lines Of Code
     •  Hypothesis test: Mann-Whitney (5%)
     •  Java systems from promisedata.googlecode.com

        X = one of Ant, JEdit, Lucene, POI, Synapse, Velocity, Xalan, Xerces
        Version = one version of X
        Clusters = BorderFlow(Version)

        For Cluster in Clusters
            For Class in Version
                Test   = Class
                a      = Class.Faults
                Train  = Version – Test
                Model0 = SwLSR(Train)            # baseline: global model
                p0     = Model0(Test)
                Model1 = SwLSR(Cluster – Class)  # local model
                p1     = Model1(Class)
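The leave-one-out comparison on the slide above can be sketched as follows. This is a minimal illustration under stated assumptions, not the experiment itself: ordinary least squares on a single feature (LOC) stands in for SwLSR, the inner loop visits only the classes of the current cluster, clusters with fewer than two remaining classes are skipped, and the data are invented. All names are hypothetical.

```python
def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept.

    Stand-in for SwLSR (an assumption of this sketch).
    points: list of (loc, faults) pairs.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        # All x values identical: fall back to predicting the mean.
        return lambda x: mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def leave_one_out(version, clusters):
    """Return absolute errors of the global and the local model.

    version: {class_name: (loc, faults)}
    clusters: list of sets of class names
    """
    global_errors, local_errors = [], []
    for cluster in clusters:
        for name in cluster:
            loc, actual = version[name]
            train_all = [version[c] for c in version if c != name]
            train_local = [version[c] for c in cluster if c != name]
            if len(train_local) < 2:
                continue  # too little local data to fit a line
            p0 = fit_line(train_all)(loc)    # baseline: global model
            p1 = fit_line(train_local)(loc)  # local model
            global_errors.append(abs(p0 - actual))
            local_errors.append(abs(p1 - actual))
    return global_errors, local_errors


# Invented data: faults grow with LOC in one cluster and stay at 0 in the other.
version = {
    "A1": (100, 1), "A2": (200, 2), "A3": (300, 3), "A4": (400, 4),
    "B1": (100, 0), "B2": (200, 0), "B3": (300, 0), "B4": (400, 0),
}
clusters = [{"A1", "A2", "A3", "A4"}, {"B1", "B2", "B3", "B4"}]
global_errors, local_errors = leave_one_out(version, clusters)
```

The two invented clusters follow different LOC-to-fault relationships, so a single global line fits neither well while each local line fits its own cluster closely. That is the situation in which per-cluster models are expected to help.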
  10. Results: Error = mean absolute residuals = p - a (more = worse)
      •  Usually, local is best
         –  Has less error
         –  Sometimes, much better
      •  Exceptions:
         –  See Velocity, JEdit (but only some versions)
      [Figure: global, local, and delta error per system version; the surviving labels are JEdit 3.2.1, 4.0, 4.1, 4.2, 4.3 and Velocity 1.4, 1.5, 1.6.1, on an axis running from -1 to 1.5.]
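The error measure on the slide above can be written out directly. In this sketch the delta is taken as global error minus local error, so a positive value favours the local model; the slide does not state the sign convention, so that choice is an assumption, as are the function name and the sample predictions.

```python
def mean_abs_residual(predicted, actual):
    """Mean of |p - a| over all classes (more = worse)."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Invented fault counts and predictions for four classes.
actual = [0, 1, 3, 0]
global_pred = [1, 1, 1, 1]
local_pred = [0, 1, 2, 0]

global_error = mean_abs_residual(global_pred, actual)
local_error = mean_abs_residual(local_pred, actual)
delta = global_error - local_error  # assumed sign: positive means local is better
```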
  11. Summary
      •  Too many options
         –  Need principles to design data miners for software engineering
      •  We applied a core SE principle
         –  Coupling and cohesion
      •  Used it to select both
         –  A data miner: BorderFlow
         –  And the attributes it explores
            •  Inter-module features
      •  Obtained better results

      Future Work
      •  Repeat on more data sets
      •  Compare with other local learners
      •  Test if inter-module always best
