Defect, defect, defect: PROMISE 2012 Keynote

Keynote

   Defect, Defect, Defect

       Sung Kim, Associate Prof.
The Hong Kong University of Science and Technology
Program Analysis and Mining
      (PAM) Group
The First Bug
   September 9, 1947
More Bugs
Finding Bugs
       Verification

         Testing

        Prediction
Defect Prediction
[Figure: Program -> Tool -> Future defects; numbers shown: 42, 24, 14]
Why Prediction?
Defect Prediction Model

             D = 4.86 + 0.018 L

   F. Akiyama, “An Example of Software System Debugging,” Information Processing, vol. 71, 1971
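
A minimal sketch of how such a size-based model is applied, assuming the usual reading of Akiyama's equation in which D is the estimated number of defects and L is the number of lines of code. The function name and the 1,000-line input are illustrative only.

```python
def akiyama_defects(lines_of_code):
    """Estimate defects from size with Akiyama's linear model D = 4.86 + 0.018 L."""
    return 4.86 + 0.018 * lines_of_code


# Hypothetical module of 1,000 lines: 4.86 + 18 = 22.86 estimated defects.
estimate = akiyama_defects(1000)
print(f"{estimate:.2f}")
```

The model uses size as its only metric, which is the starting point the later slides move away from.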
Defect Prediction

Identifying New Metrics

Developing New Algorithms

Various Granularities
Complex Files




                                                          a simple file

                a complex file
Ostrand and Weyuker, Basili et al., TSE 1996, Ohlsson and Alberg, TSE 1996, Menzies et al., TSE 2007
Changes




Bell et al. PROMISE 2011, Moser et al., ICSE 2008, Nagappan et al., ICSE 2006, Hassan et al., ICSM 2005
View/Edit Patterns




                     Lee et al., FSE2011
Slide by Mik Kersten. “Mylyn – The task-focused interface” (December 2007, http://live.eclipse.org)
With Mylyn
                               Tasks are integrated
                  See only what you are working on

Slide by Mik Kersten. “Mylyn – The task-focused interface” (December 2007, http://live.eclipse.org)
* Eclipse plug-in storing and recovering task contexts
<InteractionEvent … Kind=" " … StartDate=" " EndDate=" "
        … StructureHandle=" " … Interest=" " … >
Burst Edits/Views




                    Lee et al., FSE2011
Change Entropy

[Figure: the number of changes in a period (e.g., a week) per file]
  Low Entropy:  one file with 11 changes, four files with 1 change each
  High Entropy: five files with 3 changes each

              Hassan, “Predicting Faults Using the Complexity of Code Changes,” ICSE 2009
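
The entropy of a change period can be sketched as Shannon entropy over the share of changes each file receives. The counts below (11, 1, 1, 1, 1 versus 3, 3, 3, 3, 3) are an assumed reading of the chart, and the function name is illustrative.

```python
import math


def change_entropy(changes_per_file):
    """Shannon entropy (in bits) of how changes are spread across files."""
    total = sum(changes_per_file)
    shares = [count / total for count in changes_per_file if count > 0]
    return -sum(p * math.log2(p) for p in shares)


# Assumed counts: changes concentrated in one file vs. spread evenly.
low = change_entropy([11, 1, 1, 1, 1])
high = change_entropy([3, 3, 3, 3, 3])
print(low < high)
```

An even spread over five files reaches the maximum for five files, log2(5) bits. Concentrating changes in one file lowers the value.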
Previous Fixes




        Hassan et al., ICSM 2005, Kim et al., ICSE 2007
Network




Zimmermann and Nagappan, “Predicting Defects using Network Analysis on Dependency Graphs,” ICSE 2008
More Metrics
[Bar chart: # of publications (last 7 years) per metric category, axis 0 to 25]
  Complexity (Size)
  CK
  McCabe
  OO
  Process metrics
  Halstead
  Developer Count metrics
  Change metrics
  Entropy of changes (Change Complexity)
  Churn (source code metrics)
  # of changes to the file
  Previous defects
  Network measures
  Calling structure attributes
  Entropy (source code metrics)
Defect Prediction

Identifying New Metrics

Developing New Algorithms

Various Granularities
Classification
  training instances (metrics + labels): complexity metrics, historical metrics, ...
  training instances -> Learner
  new instance (?) -> Learner -> Prediction (classification)
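
The flow of training instances, learner, and prediction can be sketched with a deliberately simple learner. This sketch assumes a 1-nearest-neighbour rule and two made-up metrics (lines of code and number of past changes). Neither the learner nor the data comes from the talk.

```python
import math


def train(instances):
    """Training for 1-nearest-neighbour only stores the labelled instances."""
    return list(instances)


def classify(model, metrics):
    """Return the label of the stored instance closest to the new instance."""
    _, label = min(model, key=lambda inst: math.dist(inst[0], metrics))
    return label


# Hypothetical training instances: (lines of code, past changes) -> label.
training = [
    ((120, 2), "clean"),
    ((150, 1), "clean"),
    ((900, 14), "buggy"),
    ((1100, 20), "buggy"),
]
model = train(training)
prediction = classify(model, (950, 15))
print(prediction)
```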
Regression
  training instances (metrics + values): complexity metrics, historical metrics, ...
  training instances -> Learner
  new instance (?) -> Learner -> Prediction (values)
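
For the regression setting, a minimal sketch under the assumption of a single metric and an ordinary least-squares line. The data points are invented so that the fit is exact.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical training instances: lines of code -> number of defects.
loc = [100, 200, 300, 400]
defects = [2, 4, 6, 8]
intercept, slope = fit_line(loc, defects)

# Predict a defect count (a value, not a label) for a new instance.
predicted = intercept + slope * 500
print(round(predicted, 2))
```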
Active Learning
  Anomaly Detection System (1) -> Sorted Bug Reports (2) -> First Few Bug Reports (3)
    -> User Feedback (4) -> Refinement Engine (5) -> Sorted Bug Reports
  <<Refinement Loop>>
  Figure 4. Active Refinement Process

     Lo et al., “Active Refinement of Clone Anomaly Reports,” ICSE 2012, Lu et al., PROMISE 2012
Bug Cache
  all files -> cache: 10% files, most bug-prone
  Load on miss
  pre-fetch (Nearby: co-changes)
  replacement

                        Kim et al., “Predicting Faults from Cached History,” ICSE 2007
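
The three cache operations named on the slide (load on miss, pre-fetch of co-changed files, replacement) can be sketched as below. This is a simplified illustration, not the published algorithm: the class name is hypothetical and a least-recently-used replacement policy is assumed.

```python
from collections import OrderedDict


class BugCacheSketch:
    """Fixed-size cache of files considered most bug-prone."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # keys are file paths, oldest first
        self.hits = 0
        self.misses = 0

    def _load(self, path):
        if path in self.cache:
            self.cache.move_to_end(path)  # refresh recency
            return
        if len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)  # replacement: evict least recently used
        self.cache[path] = None

    def on_bug_fix(self, path, co_changed=()):
        """Record a fixed bug in `path`; a hit means the cache predicted it."""
        if path in self.cache:
            self.hits += 1
        else:
            self.misses += 1
        self._load(path)  # load on miss
        for other in co_changed:
            self._load(other)  # pre-fetch nearby (co-changed) files


cache = BugCacheSketch(capacity=2)
cache.on_bug_fix("A.java", co_changed=["B.java"])
cache.on_bug_fix("B.java")
cache.on_bug_fix("C.java")
print(cache.hits, cache.misses, list(cache.cache))
```

The hit rate over the project history then indicates how well the cached files anticipate where bugs are fixed.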
Algorithms
[Bar chart: # of publications (recent 7 years)]
  Classification   21
  Regression       18
  Both              4
  Etc.              4
Defect Prediction

Identifying New Metrics

Developing New Algorithms

Various Granularities
Module/Binary/Package
        Level
File Level
Method Level

                                        void foo () {
                                              ...

                                                  }




Hata et al., “Bug Prediction Based on Fine-Grained Module Histories,” ICSE 2012
Change Level
            Development history of a file
Rev 1            Rev 2                      Rev 3                        Rev 4
...              ...                        ...                          ...
...     change   ...            change      ...             change       ...
...              ...                        ...                          ...
...              ...                        ...                          ...

                                                             Did I just
                                                            introduce
                                                              a bug?



                    Kim et al., "Classifying Software Changes: Clean or Buggy?" TSE 2009
More Granularities
[Bar chart: # of publications (recent 7 years)]
  Project/Release/SubSystem    3
  Component/Module             8
  Package                      3
  File                        19
  Class                        8
  Function/Method              2
  Change/Hunk level            1
Defect Prediction Summary

 Identifying New Metrics

 Developing New Algorithms

 Various Granularities
Performance
[Figure: results grouped by system: Apache, ArgoUML, Eclipse, Embedded system, Healthcare system, Microsoft, Mozilla]

Hall et al., "A Systematic Review of Fault Prediction Performance in Software Engineering," TSE 2011 (Figure 2)
Performance
[Figure 6. The granularity of the results: Class, File, Module, Binary/plug-in*]
  *For example plug-ins, binaries

           Hall et al., "A Systematic Review of Fault Prediction Performance in Software Engineering," TSE 2011 (Figure 6)
Defect prediction
 totally works!
Done? Why are we not using it?
Detailed To Fix List
VS Buggy Modules
This is what developers want!
Defect Prediction 2.0

Finer Granularity

Noise Handling

New Customers
FindBugs




           http://findbugs.sourceforge.net/
Performance of Bug Detection Tools
[Bar chart: Precision (%) of tools' priority 1 warnings for FindBugs, jLint, and PMD, axis 0 to 20]

                                   Kim and Ernst, “Which Warnings Should I Fix First?” FSE 2007
RQ1: How Many False Negatives
* Defects missed, partially, or fully captured
* Warnings from a tool should also correctly explain in
  detail why a flagged line may be faulty
* How many one-line defects are captured and explained
  reasonably well (so called, “strictly captured”)?

                        Very high miss rates!

               Thung et al., “To What Extent Could We Detect Field Defects?” ASE 2012
Line Level Defect Prediction



           We have seen this
           bug in revision 100
Bug Fix Memories
  Bug fix changes in revision 1 .. n-1
    -> Extract patterns in bug fix change history -> Memory
  Code to examine
    -> Search for patterns in Memory

                                Kim et al., “Memories of Bug Fixes,” FSE 2006
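
A minimal sketch of the two steps on the slide, extracting patterns from past bug fixes into a memory and then searching new code against it. Whitespace-only normalization and exact matching are assumptions made for brevity, and all names and sample lines are hypothetical.

```python
import re


def normalize(line):
    """Collapse whitespace so formatting differences do not hide a match."""
    return re.sub(r"\s+", " ", line.strip())


def build_memory(bug_fix_changes):
    """Map each normalized buggy line to the line that replaced it."""
    return {normalize(before): normalize(after) for before, after in bug_fix_changes}


def search(memory, code_lines):
    """Return (line number, known fix) for lines that match a past bug."""
    return [
        (number, memory[normalize(line)])
        for number, line in enumerate(code_lines, start=1)
        if normalize(line) in memory
    ]


# Hypothetical bug fix from revisions 1 .. n-1: (buggy line, fixed line).
memory = build_memory([("if (x = null)", "if (x == null)")])
matches = search(memory, ["int y = 0;", "  if (x  =  null)"])
print(matches)
```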
Fix Wizard
public void setColspan(int colspan) throws WrongValueException {
  if (colspan <= 0) throw new WrongValueException(...);
  if (_colspan != colspan) {
    _colspan = colspan;
    final Execution exec = Executions.getCurrent();
    if (exec != null && exec.isExplorer()) invalidate();
    smartUpdate("colspan", Integer.toString(_colspan)); ...

public void setRowspan(int rowspan) throws WrongValueException {
  if (rowspan <= 0) throw new WrongValueException(...);
  if (_rowspan != rowspan) {
    _rowspan = rowspan;
    final Execution exec = Executions.getCurrent();
    if (exec != null && exec.isExplorer()) invalidate();
    smartUpdate("rowspan", Integer.toString(_rowspan)); ...

      Figure 1: Bug Fixes at v5088-v5089 in ZK
      Nguyen et al., “Recurring Bug Fixes in Object-Oriented Programs,” ICSE 2010
Fix Wizard
[Figure 2: Graph-based Object Usages for Figure 1]
  Panels: Usage in method colSpan, Usage in method rowSpan, Usage in changed code
  Node labels: IF, WrongValueException.<init>, Executions.getCurrent,
               Execution.isExplorer, Auxheader.invalidate, Auxheader.smartUpdate

                                Nguyen et al., “Recurring Bug Fixes in Object-Oriented Programs,” ICSE 2010
Word Level Defect Prediction




                    Fix suggestion
                          ...
Defect Prediction 2.0

Finer Granularity

Noise Handling

New Customers
Source Repository                                                       Bug Database

  Source Repository: all commits C, containing bug fixes Cf, containing linked fixes Cfl
  Bug Database: all bugs B, containing fixed bugs Bf, containing linked fixed bugs Bfl
  Cfl and Bfl: linked via log messages
  Bug fixes in Cf and fixed bugs in Bf that are related, but not linked: Noise!

                                             Bird et al., “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE 2009
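
Linking via log messages can be sketched as a search for bug identifiers in commit messages. The regular expression, commit identifiers, and the second bug number are assumptions for illustration. Real link miners use richer heuristics.

```python
import re

# Assumed message convention: the word "bug" followed by a numeric id.
BUG_ID = re.compile(r"\bbug\s*#?(\d+)", re.IGNORECASE)


def link_fixes(commits, fixed_bugs):
    """Return (linked fixes Cfl, linked fixed bugs Bfl) found via log messages."""
    linked_commits, linked_bugs = set(), set()
    for commit_id, message in commits.items():
        for bug_id in map(int, BUG_ID.findall(message)):
            if bug_id in fixed_bugs:
                linked_commits.add(commit_id)
                linked_bugs.add(bug_id)
    return linked_commits, linked_bugs


# Hypothetical repository and bug database.
commits = {
    "c1": "fix for bug 28434",
    "c2": "refactor parser",
    "c3": "fixed null pointer crash",  # a fix with no bug id in its message
}
fixed_bugs = {28434, 30001}

cfl, bfl = link_fixes(commits, fixed_bugs)
unlinked_bugs = fixed_bugs - bfl
print(cfl, bfl, unlinked_bugs)
```

Commit c3 is a fix whose message names no bug, so the fixed bug it may belong to stays unlinked. This is the kind of noise the slide points at.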
How resistant is a defect prediction model to noise?
[Figure (c): Buggy F-measure (0 to 1) against training set false negative (FN) & false positive (FP) rate (0 to 0.6), for SWT, Debug, Columba, Eclipse, and Scarab; annotation: 20%]
                                                       Kim et al., “Dealing with Noise in Defect Prediction,” ICSE 2011
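
A sketch of how training-set noise at a given FN and FP rate could be simulated, assuming FN noise means a buggy instance labelled clean and FP noise means a clean instance labelled buggy. The function name and seed handling are illustrative.

```python
import random


def inject_noise(labels, fn_rate, fp_rate, seed=0):
    """Flip labels: buggy -> clean at fn_rate, clean -> buggy at fp_rate."""
    rng = random.Random(seed)
    noisy = []
    for is_buggy in labels:
        if is_buggy and rng.random() < fn_rate:
            noisy.append(False)  # false negative: bug hidden from the learner
        elif not is_buggy and rng.random() < fp_rate:
            noisy.append(True)  # false positive: clean change marked buggy
        else:
            noisy.append(is_buggy)
    return noisy


# Hypothetical training labels (True = buggy).
labels = [True, False, True, False, False]
unchanged = inject_noise(labels, fn_rate=0.0, fp_rate=0.0)
all_flipped = inject_noise(labels, fn_rate=1.0, fp_rate=1.0)
print(unchanged == labels, all_flipped)
```

Training on labels corrupted at increasing rates, while testing on clean labels, gives one point per rate on a curve like the one in the figure.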
Closest List Noise Identification
[Figure 9. The pseudo-code of the CLNI algorithm, ending in "return Aj"]

                  Kim et al., “Dealing with Noise in Defect Prediction,” ICSE 2011
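
Since the pseudo-code itself did not survive, the following is only a sketch in the spirit of a closest-list noise detector: an instance is flagged when most of its nearest neighbours carry the opposite label, and the pass repeats until the flagged set stops changing. The parameter names, thresholds, and stopping rule are assumptions.

```python
import math


def clni_sketch(instances, labels, n_closest=3, delta=0.6, max_iter=10):
    """Return indices of instances whose closest neighbours mostly disagree."""
    noise = set()
    for _ in range(max_iter):
        new_noise = set()
        for i, x in enumerate(instances):
            # Rank the other instances by distance, skipping those flagged last round.
            others = [j for j in range(len(instances)) if j != i and j not in noise]
            others.sort(key=lambda j: math.dist(x, instances[j]))
            closest = others[:n_closest]
            if not closest:
                continue
            differing = sum(labels[j] != labels[i] for j in closest)
            if differing / len(closest) >= delta:
                new_noise.add(i)
        if new_noise == noise:  # the flagged set is stable
            break
        noise = new_noise
    return noise


# Hypothetical metric vectors: two clusters, with one suspicious label at index 3.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
point_labels = [0, 0, 0, 1, 1, 1, 1, 1]
flagged = clni_sketch(points, point_labels)
print(flagged)
```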
Noise detection performance

          Precision   Recall   F-measure
Debug     0.681       0.871    0.764
SWT       0.624       0.830    0.712

                                    (noise level = 20%)
             Kim et al., “Dealing with Noise in Defect Prediction,” ICSE 2011
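
The F-measure column follows from the other two as the harmonic mean of precision and recall. A short check with the reported values:

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


debug_f = round(f_measure(0.681, 0.871), 3)
swt_f = round(f_measure(0.624, 0.830), 3)
print(debug_f, swt_f)  # 0.764 0.712
```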
Bug prediction using cleaned data
[Chart: SWT F-measure (0 to 100) against noise level (0%, 15%, 30%, 45%), series: Noisy, Cleaned]
  Annotation: 76% F-measure with 45% noise
ReLink
  Source code repository + Bug database -> Traditional heuristics (link miner) -> Links
  Links -> Features
  Unknown links + Features -> Recovering links using features -> Links
  Both sets of links -> Combine -> Links

             Wu et al., “ReLink: Recovering Links between Bugs and Changes,” FSE 2011
ReLink Performance
[Bar chart: F-measure (0 to 100) of Traditional vs. ReLink for the projects ZXing, OpenIntents, and Apache]
Wu et al., “ReLink: Recovering Links between Bugs and Changes,” FSE 2011
Label Historical Changes
Change message: “fix for bug 28434”
Rev 101 (with BUG) -> fixed -> Rev 102 (no BUG)
[Diagram: development history of a file, Rev 1 ... Rev 100 -> change -> Rev 101 -> change -> Rev 102]
Fischer et al., “Populating a Release History Database from Version Control and Bug Tracking Systems,” ICSM 2003
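The labeling above depends on spotting bug identifiers in change messages. A minimal sketch of such a heuristic follows. The regular expression, the function name `label_revisions`, and the in-memory revision log are assumptions made for illustration and are not taken from Fischer et al.

```python
import re

# Assumed pattern: "fix", "fixes" or "fixed", later followed by "bug" and an ID.
BUG_FIX_PATTERN = re.compile(r"\bfix(?:e[sd])?\b.*?\bbug\s*#?(\d+)", re.IGNORECASE)

def label_revisions(log):
    """Find bug-fixing revisions and the revisions they repair.

    log: list of (revision_number, change_message) pairs.
    Returns (fixes, buggy): fixes maps a fixing revision to the bug ID in its
    message; buggy holds the revision just before each fix, which is treated
    as still containing the bug.
    """
    fixes = {}
    for rev, message in log:
        match = BUG_FIX_PATTERN.search(message)
        if match:
            fixes[rev] = int(match.group(1))
    buggy = {rev - 1 for rev in fixes}
    return fixes, buggy

# Hypothetical log; only the last message comes from the slide.
log = [
    (100, "refactor text handling"),
    (101, "add tab support"),
    (102, "fix for bug 28434"),
]
fixes, buggy = label_revisions(log)
print(fixes)  # {102: 28434}
print(buggy)  # {101}
```

A heuristic like this misses fixes whose messages never mention a bug ID, which is the gap the ReLink slides address.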
Atomic Change
Change message: “fix for bug 28434”
Rev 101 (with BUG): setText(“t”) -> fixed -> Rev 102 (no BUG): insertTab()
Fischer et al., “Populating a Release History Database from Version Control and Bug Tracking Systems,” ICSM 2003
Composite Change
[Figure 5: JFreeChart revision 1083, a side-by-side diff with four hunks]
hunk 1, public TimeSeriesDataItem addOrUpdate(RegularTimePeriod period, double value), lines 677-678: left: return this.addOrUpdate(period, new Double(value)); right: return addOrUpdate(period, new Double(value));
hunk 2, public TimeSeries createCopy(RegularTimePeriod start, RegularTimePeriod end), lines 944-946: left: if (endIndex < 0) { right: if ((endIndex < 0) || (endIndex < startIndex)) {
hunk 3, public boolean equals(Object object), lines 973-976: the check if (!ObjectUtilities.equal(getDomainDescription(), s.getDomainDescription())) { return false; } is wrapped differently on the left and on the right
hunk 4, lines 978-981: the same re-wrapping for if (!ObjectUtilities.equal(getRangeDescription(), s.getRangeDescription())) { return false; }
Tao et al., “How Do Software Engineers Understand Code Changes?” FSE 2012
Defect Prediction 2.0

Finer Granularity

Noise Handling

New Customers
Warning Developers

       “Safe” Files
(Predicted as not buggy)



      “Risky” Files
  (Predicted as buggy)
Change Classification
Rev 1             Rev 2                     Rev 3                     Rev 4
...               ...                       ...                       ...
...      change   ...           change      ...           change      ...
...               ...                       ...                       ...




                   Kim et al., "Classifying Software Changes: Clean or Buggy?" TSE 2009
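To make the idea of classifying a change as clean or buggy concrete, here is a toy sketch that trains a small naive Bayes model on words describing past changes. The class name, the training data, and the choice of learner and features are all assumptions made for illustration; this is not the feature set or learner of Kim et al.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

class ChangeClassifier:
    """Toy multinomial naive Bayes over bag-of-words features of a change."""

    def fit(self, changes, labels):
        self.word_counts = {"buggy": Counter(), "clean": Counter()}
        self.class_counts = Counter(labels)
        for text, label in zip(changes, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts["buggy"]) | set(self.word_counts["clean"])
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            n_words = sum(counts.values())
            score = math.log(self.class_counts[label] / total)
            for word in tokenize(text):
                # Laplace smoothing so unseen words do not zero out a class.
                score += math.log((counts[word] + 1) / (n_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)

# Hypothetical history of labeled changes (labels would come from bug-fix links).
history = [
    ("add null check in parser", "clean"),
    ("rewrite cache eviction loop", "buggy"),
    ("update javadoc comments", "clean"),
    ("rename local variable", "clean"),
    ("rewrite parser loop with new cache", "buggy"),
    ("update comments and rename field", "clean"),
]
model = ChangeClassifier().fit([c for c, _ in history], [l for _, l in history])
print(model.predict("rewrite eviction loop"))
print(model.predict("update javadoc"))
```

The point of working at the change level is that the prediction can be made as soon as a change is committed.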
Change Classification
  Rev 1             Rev 2            Rev 3            Rev 4
  ...               ...              ...              ...
  ...      change   ...     change   ...     change   ...
  ...               ...              ...              ...


“Safe” Files
  Rev 1             Rev 2            Rev 3            Rev 4
  ...               ...              ...              ...
  ...      change   ...     change   ...     change   ...
  ...               ...              ...              ...


“Risky” Files
Defect prediction based Change Classification
[Bar chart: F-measure (0 to 0.80) of CC vs. Cached CC for the projects Debug UI, JDT, JEdit, PDE, POI, and Team UI]
Warning Developers

    “Safe” Location
(Predicted as not buggy)



    “Risky” Location
  (Predicted as buggy)
Test-case Selection
Executing test cases
[Chart: APFD (0 to 1.00) across releases R1.0 to R1.5 for Baseline, History1, and History2]
Runeson and Ljung, “Improving Regression Testing Transparency and Efficiency with History-Based Prioritization,” ICST 2011
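APFD (Average Percentage of Faults Detected) rewards test orders that reveal faults early. For n tests and m faults, with TF_i the position of the first test that reveals fault i, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n). The sketch below computes it for two orders of a hypothetical suite; the test names and fault data are invented for illustration.

```python
def apfd(test_order, faults_detected):
    """Average Percentage of Faults Detected for one prioritized test order.

    test_order: list of test names in execution order.
    faults_detected: dict mapping test name -> set of fault IDs it reveals.
    Assumes at least one fault exists and every fault is revealed by some
    test in test_order.
    """
    n = len(test_order)
    all_faults = set().union(*faults_detected.values())
    m = len(all_faults)
    first_position = {}
    for position, test in enumerate(test_order, start=1):
        for fault in faults_detected.get(test, set()):
            # Keep only the earliest position at which each fault is revealed.
            first_position.setdefault(fault, position)
    return 1 - sum(first_position[f] for f in all_faults) / (n * m) + 1 / (2 * n)

# Hypothetical suite: four tests, three faults.
faults_detected = {
    "t1": set(),
    "t2": {"f1"},
    "t3": {"f1", "f2"},
    "t4": {"f3"},
}
baseline = apfd(["t1", "t2", "t3", "t4"], faults_detected)
prioritized = apfd(["t3", "t4", "t2", "t1"], faults_detected)
print(f"baseline APFD    = {baseline:.3f}")
print(f"prioritized APFD = {prioritized:.3f}")
```

Running the fault-revealing tests first raises APFD, which is what a history-based order aims for.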
Warning Prioritization
[Chart: precision (%) (0 to 18) vs. warning instances by priority (0 to 100), History vs. Tool]
Kim and Ernst, “Which Warnings Should I Fix First?” FSE 2007
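A curve like the one in the chart can be read as cumulative precision: after inspecting the first k warnings in priority order, what percentage were worth fixing? The sketch below computes that curve. It assumes each warning has already been marked as a true positive or not (how that is decided is outside this sketch), and the two sample rankings are invented for illustration.

```python
def precision_curve(ranked_warnings):
    """Cumulative precision (%) after each warning in priority order.

    ranked_warnings: list of booleans, highest priority first; True means the
    warning is counted as a true positive.
    """
    curve = []
    hits = 0
    for seen, is_true in enumerate(ranked_warnings, start=1):
        hits += is_true
        curve.append(100.0 * hits / seen)
    return curve

# Hypothetical rankings of the same six warnings under two priority schemes.
by_tool = [False, True, False, False, True, False]
by_history = [True, True, False, False, False, False]
print([round(p, 1) for p in precision_curve(by_tool)])
print([round(p, 1) for p in precision_curve(by_history)])
```

Both curves end at the same value because they rank the same warnings; a better prioritization only moves the true positives toward the front.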
Other Topics

• Explanation
  - Why has it been predicted as defect-prone?
• Cross-project prediction
• Cost effectiveness measures
• Active Learning/Refinement
Defect Prediction 2.0

1.0: New metrics, Algorithms, Coarse granularity

2.0: Finer granularity, Noise Handling, New customers
2013
MSR 2013: Back to roots
Tom Zimmermann, General chair
Alberto Bacchelli, Mining Challenge Chair
Massimiliano Di Penta and Sung Kim, Program co-chairs
February 15
Some slides/data are borrowed
           with thanks from

•   Tom Zimmermann, Chris Bird
•   Andreas Zeller
•   Ahmed Hassan
•   David Lo
•   Jaechang Nam, Yida Tao
•   Tien Nguyen
•   Steve Counsell, David Bowes, Tracy Hall and David Gray
•   Wen Zhang
Defect, defect, defect: PROMISE 2012 Keynote

  • 1. Keynote Defect, Defect, Defect Sung Kim The Hong Kong University of Science and Technology
  • 6. Program Analysis and Mining (PAM) Group
  • 8. The First Bug September 9, 1947
  • 10. Finding Bugs Verification Testing Prediction
  • 11. Defect Prediction 42 24 14 Program Tool Future defects
  • 13. Defect Prediction Model D = 4.86 + 0.018L F. Akiyama, “An Example of Software System Debugging,” Information Processing, vol. 71, 1971
  • 14. Defect Prediction Identifying New Metrics Developing New Algorithms Various Granularities
  • 16. Complex Files a simple file a complex file Ostrand and Weyuker, Basili et al., TSE 1996, Ohlsson and Alberg, TSE 1996, Menzies et al., TSE 2007
  • 18. Changes Bell et al. PROMISE 2011, Moser et al., ICSE 2008, Nagappan et al., ICSE 2006, Hassan et al., ICSM 2005
  • 20. View/Edit Patterns Lee et al., FSE2011
  • 21. Slide by Mik Kersten. “Mylyn – The task-focused interface” (December 2007, http://live.eclipse.org)
  • 22. With Mylyn Tasks are integrated See only what you are working on Slide by Mik Kersten. “Mylyn – The task-focused interface” (December 2007, http://live.eclipse.org)
  • 23. * Eclipse plug-in storing and recovering task contexts
  • 25. <InteractionEvent … Kind=“ ” … StartDate=“ ” EndDate=“ ” … StructureHandle=“ ” … Interest=“ ” … > * Eclipse plug-in storing and recovering task contexts
  • 26. Burst Edits/Views Lee et al., FSE2011
  • 28. Change Entropy [Two bar charts of the number of changes in a period (e.g., a week) per file, over files F1 to F5. Low entropy: bar labels 11, 1, 1, 1, 1. High entropy: bar labels 3, 3, 3, 3, 3] Hassan, “Predicting Faults Using the Complexity of Code Changes,” ICSE 2009
  • 30. Previous Fixes Hassan et al., ICSM 2005, Kim et al., ICSE 2007
  • 33. Network Zimmermann and Nagappan, “Predicting Defects using Network Analysis on Dependency Graphs,” ICSE 2008
  • 35. More Metrics [Bar chart: # of publications (last 7 years, axis 0 to 25) per metric: Complexity (Size), CK, McCabe, OO, Process metrics, Halstead, Developer Count metrics, Change metrics, Entropy of changes (Change Complexity), Churn (source code metrics), # of changes to the file, Previous defects, Network measures, Calling structure attributes, Entropy (source code metrics)]
  • 36. Defect Prediction Identifying New Metrics Developing New Algorithms Various Granularities
  • 37. Classification [Diagram: complexity metrics, historical metrics, ... -> training instances (metrics + labels) -> learner; new instance (?) -> learner -> prediction (classification)]
  • 38. Regression [Diagram: complexity metrics, historical metrics, ... -> training instances (metrics + values) -> learner; new instance (?) -> learner -> prediction (values)]
  • 39. Active Learning [Figure 4. Active Refinement Process: (1) anomaly detection system -> (2) sorted bug reports -> (3) first few bug reports -> (4) user feedback -> (5) refinement engine, forming the refinement loop] Lo et al., “Active Refinement of Clone Anomaly Reports,” ICSE 2012; Lu et al., PROMISE 2012
  • 40. Bug Cache. [Diagram: a cache holding 10% of all files, the most bug-prone ones. Labels: load on miss, pre-fetch, replacement. Nearby: co-changes.] Kim et al., “Predicting Faults from Cached History,” ICSE 2007
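A simplified sketch of the cache behavior the slide labels suggest: load a file on a miss, bring in co-changed ("nearby") files, and replace old entries when the cache is full. The class name `BugCache`, the least-recently-used replacement policy, and the file names are assumptions for illustration, not the exact design from the cited paper.

```python
from collections import OrderedDict

class BugCache:
    """Fixed-size set of files currently predicted as bug-prone."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()   # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def _load(self, name):
        if name in self.files:
            self.files.move_to_end(name)         # refresh recency
        else:
            self.files[name] = True
            while len(self.files) > self.capacity:
                self.files.popitem(last=False)   # evict least recently used

    def record_fault(self, name, co_changed=()):
        """Observe a fault in `name`; a hit means the cache predicted it."""
        if name in self.files:
            self.hits += 1
        else:
            self.misses += 1
        for nearby in co_changed:                # assumed pre-fetch of co-changes
            self._load(nearby)
        self._load(name)

cache = BugCache(capacity=2)
cache.record_fault("A.java", co_changed=["B.java"])  # miss: loads B, then A
cache.record_fault("B.java")                          # hit
cache.record_fault("C.java")                          # miss: evicts A
print(cache.hits, cache.misses, list(cache.files))
```

The hit rate over a project's fault history is then a measure of how well the cached files anticipate where the next faults occur.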
  • 41. Algorithms. [Bar chart, # of publications (recent 7 years): Classification 21, Regression 18, Both 4, Etc. 4.]
  • 42. Defect Prediction Identifying New Metrics Developing New Algorithms Various Granularities
  • 47. Method Level: void foo () { ... } Hata et al., “Bug Prediction Based on Fine-Grained Module Histories,” ICSE 2012
  • 49. Change Level. [Diagram: development history of a file, Rev 1, Rev 2, Rev 3, Rev 4, with a change between consecutive revisions.] Did I just introduce a bug? Kim et al., “Classifying Software Changes: Clean or Buggy?” TSE 2009
  • 51. More Granularities. [Bar chart, # of publications (recent 7 years): Project/Release/SubSystem 3, Component/Module 8, Package 3, File 19, Class 8, Function/Method 2, Change/Hunk level 1.]
  • 52. Defect Prediction Summary Identifying New Metrics Developing New Algorithms Various Granularities
  • 53. Performance. [Figure: performance by data source; labels include Apache, ArgoUML, Eclipse, Embedded system, Healthcare system, Microsoft, Mozilla.] Hall et al., “A Systematic Review of Fault Prediction Performance in Software Engineering,” TSE 2011 (Figure 2)
  • 54. Performance. [Figure 6. The granularity of the results: Class, File, Module, Binary/plug-in. *For example plug-ins, binaries.] Hall et al., “A Systematic Review of Fault Prediction Performance in Software Engineering,” TSE 2011 (Figure 6)
  • 58. Done? Why is it not being used?
  • 59. Detailed To-Fix List vs. Buggy Modules
  • 61. This is what developers want!
  • 62. Defect Prediction 2.0 Finer Granularity Noise Handling New Customers
  • 64. FindBugs http://findbugs.sourceforge.net/
  • 65. Performance of Bug Detection Tools. [Bar chart: precision (%), axis 0 to 20, of the tools' priority 1 warnings for FindBugs, jLint, and PMD.] Kim and Ernst, “Which Warnings Should I Fix First?” FSE 2007
  • 66. RQ1: How Many False Negatives? Defects are missed, partially captured, or fully captured. Warnings from a tool should also correctly explain in detail why a flagged line may be faulty. How many one-line defects are captured and explained reasonably well (so-called “strictly captured”)? Very high miss rates! Thung et al., “To What Extent Could We Detect Field Defects?” ASE 2012
  • 68. Line Level Defect Prediction
  • 69. Line Level Defect Prediction We have seen this bug in revision 100
  • 70. Bug Fix Memories. Bug fix changes in revisions 1 .. n-1: extract patterns in the bug fix change history into Memory. Kim et al., “Memories of Bug Fixes,” FSE 2006
  • 71. Bug Fix Memories. Bug fix changes in revisions 1 .. n-1: extract patterns in the bug fix change history into Memory. Code to examine: search for patterns in Memory. Kim et al., “Memories of Bug Fixes,” FSE 2006
  • 72. Fix Wizard. [Figure 1: Bug Fixes at v5088-v5089 in ZK. Near-identical code in setColspan(int colspan) and setRowspan(int rowspan): each throws WrongValueException for a non-positive argument, obtains Executions.getCurrent(), calls invalidate() when exec != null && exec.isExplorer(), then calls smartUpdate(...).] Nguyen et al., “Recurring Bug Fixes in Object-Oriented Programs,” ICSE 2010
  • 73. Fix Wizard. [Figure 1: Bug Fixes at v5088-v5089 in ZK (setColspan and setRowspan). Figure 2: Graph-based Object Usages for Figure 1: usage in method colSpan, usage in method rowSpan, and usage in changed code, with nodes IF, WrongValueException.<init>, Executions.getCurrent, Execution.isExplorer, Auxheader.invalidate, Auxheader.smartUpdate.] Nguyen et al., “Recurring Bug Fixes in Object-Oriented Programs,” ICSE 2010
  • 74. Word Level Defect Prediction
  • 75. Word Level Defect Prediction Fix suggestion ...
  • 76. Defect Prediction 2.0 Finer Granularity Noise Handling New Customers
  • 77. Source Repository: all commits C. Bug Database: all bugs B, fixed bugs Bf. Bird et al., “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE 2009
  • 78. Source Repository: all commits C. Bug Database: all bugs B, fixed bugs Bf. Commits and bugs are linked via log messages. Bird et al., “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE 2009
  • 79. Source Repository: all commits C. Bug Database: all bugs B, fixed bugs Bf, linked fixed bugs Bfl. Linked via log messages. Bird et al., “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE 2009
  • 80. Source Repository: all commits C, linked fixes Cfl. Bug Database: all bugs B, fixed bugs Bf, linked fixed bugs Bfl. Linked via log messages. Bird et al., “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE 2009
  • 81. Source Repository: all commits C, linked fixes Cfl. Bug Database: all bugs B, fixed bugs Bf, linked fixed bugs Bfl. Linked via log messages; some commits are related, but not linked. Bird et al., “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE 2009
  • 82. Source Repository: all commits C, bug fixes Cf, linked fixes Cfl. Bug Database: all bugs B, fixed bugs Bf, linked fixed bugs Bfl. Linked via log messages; some commits are related, but not linked. Bird et al., “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE 2009
  • 83. Source Repository: all commits C, bug fixes Cf, linked fixes Cfl. Bug Database: all bugs B, fixed bugs Bf, linked fixed bugs Bfl. Linked via log messages; some commits are related, but not linked. Noise! Bird et al., “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE 2009
  • 84. How resistant is a defect prediction model to noise? [Plot: buggy F-measure (0 to 1) versus training set false negative (FN) & false positive (FP) rate (0 to 0.6) for SWT, Debug, Columba, Eclipse, and Scarab.] Kim et al., “Dealing with Noise in Defect Prediction,” ICSE 2011
  • 86. How resistant is a defect prediction model to noise? [Plot: buggy F-measure (0 to 1) versus training set false negative (FN) & false positive (FP) rate (0 to 0.6) for SWT, Debug, Columba, Eclipse, and Scarab, with 20% marked.] Kim et al., “Dealing with Noise in Defect Prediction,” ICSE 2011
  • 87. Closest List Noise Identification (CLNI). [Figure 9. The pseudo-code of the CLNI algorithm.] Kim et al., “Dealing with Noise in Defect Prediction,” ICSE 2011
  • 88. Noise detection performance (noise level = 20%). Debug: precision 0.681, recall 0.871, F-measure 0.764. SWT: precision 0.624, recall 0.830, F-measure 0.712. Kim et al., “Dealing with Noise in Defect Prediction,” ICSE 2011
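As a quick check on the table above, the F-measure values are consistent with the usual harmonic mean of precision and recall. The sketch below assumes that definition (the balanced F1 score); the function name `f_measure` is hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported on the slide for noise level = 20%.
debug_f = f_measure(0.681, 0.871)
swt_f = f_measure(0.624, 0.830)
print(round(debug_f, 3), round(swt_f, 3))   # 0.764 0.712
```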
  • 89. Bug prediction using cleaned data. [Plot: SWT F-measure (0 to 100) versus noise level (0%, 15%, 30%, 45%); series: Noisy.]
  • 90. Bug prediction using cleaned data. [Plot: SWT F-measure (0 to 100) versus noise level (0%, 15%, 30%, 45%); series: Noisy, Cleaned.]
  • 91. Bug prediction using cleaned data. [Plot: SWT F-measure (0 to 100) versus noise level (0%, 15%, 30%, 45%); series: Noisy, Cleaned. Annotation: 76% F-measure with 45% noise.]
  • 92. ReLink. [Diagram labels: Source code repository; Bug database; Traditional heuristics (link miner); Links; Unknown links; Features; Recovering links using feature; Combine; Links.] Wu et al., “ReLink: Recovering Links between Bugs and Changes,” FSE 2011
  • 94. ReLink Performance. [Bar chart: F-measure (0 to 100) of Traditional versus ReLink for the projects ZXing, OpenIntents, and Apache.] Wu et al., “ReLink: Recovering Links between Bugs and Changes,” FSE 2011
  • 95. Label Historical Changes. [Diagram: development history of a file, Rev 1 ... Rev 100, Rev 101, Rev 102, with changes between revisions. Change message: “fix for bug 28434”. Rev 101 (with BUG), Rev 102 (no BUG, fixed).] Fischer et al., “Populating a Release History Database from Version Control and Bug Tracking Systems,” ICSM 2003
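A rough sketch of the labeling step on this slide: scan change messages for bug IDs, treat a matching revision as a fix, and mark the revision just before it as carrying the bug. The regular expression, the function names, and the messages for Rev 100 and Rev 101 are assumptions for illustration; only the message “fix for bug 28434” comes from the slide, and real link miners use richer heuristics.

```python
import re

# Assumed pattern: the word "bug", an optional "#", then a numeric ID.
BUG_ID = re.compile(r"\bbug\s*#?\s*(\d+)", re.IGNORECASE)

def linked_bug_ids(message):
    """Return the bug IDs mentioned in a change message."""
    return [int(bug_id) for bug_id in BUG_ID.findall(message)]

def find_fix_revisions(history):
    """Map each fixing revision to the bug IDs named in its message."""
    fixes = {}
    for revision, message in history:
        ids = linked_bug_ids(message)
        if ids:
            fixes[revision] = ids
    return fixes

def revisions_before_fixes(history):
    """Label as buggy the revision immediately preceding each fix."""
    revisions = [revision for revision, _ in history]
    fixes = find_fix_revisions(history)
    return {revisions[i - 1] for i, rev in enumerate(revisions) if rev in fixes and i > 0}

# History ordered oldest first; the first two messages are hypothetical.
history = [
    (100, "add tab handling"),
    (101, "update text field"),
    (102, "fix for bug 28434"),
]
fixes = find_fix_revisions(history)
buggy = revisions_before_fixes(history)
print(fixes, buggy)
```

Fixes whose messages never mention a bug ID are silently missed by this kind of matching, which is the source of the noise discussed on the earlier slides.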
  • 96. Atomic Change. Change message: “fix for bug 28434”. Rev 101 (with BUG): setText(“t”). Rev 102 (no BUG, fixed): insertTab(). Fischer et al., “Populating a Release History Database from Version Control and Bug Tracking Systems,” ICSM 2003
  • 97. Composite Change. [Figure 5. JFreeChart revision 1083, a change made of four hunks: hunk 1 in addOrUpdate(RegularTimePeriod period, double value), lines 677-678; hunk 2 in createCopy(RegularTimePeriod start, RegularTimePeriod end), lines 944-946, where if (endIndex < 0) becomes if ((endIndex < 0) || (endIndex < startIndex)); hunks 3 and 4 in equals(Object object), lines 973-976 and 978-981.] Tao et al., “How Do Software Engineers Understand Code Changes?” FSE 2012
  • 98. Defect Prediction 2.0 Finer Granularity Noise Handling New Customers
  • 99. Warning Developers “Safe” Files (Predicted as not buggy) “Risky” Files (Predicted as buggy)
  • 100. Change Classification. [Diagram: Rev 1, Rev 2, Rev 3, Rev 4, with a change between consecutive revisions.] Kim et al., “Classifying Software Changes: Clean or Buggy?” TSE 2009
  • 101. Change Classification. [Diagram: the same revision history (Rev 1 to Rev 4, with a change between consecutive revisions) shown once for “Safe” Files and once for “Risky” Files.]
  • 103. Defect-prediction-based Change Classification. [Bar chart: F-measure (0 to 0.80) of CC versus Cached CC for the projects Debug UI, JDT, JEdit, PDE, POI, and Team UI.]
  • 104. Warning Developers “Safe” Location (Predicted as not buggy) “Risky” Location (Predicted as buggy)
  • 106. Test-case Selection Executing test cases
  • 107. Test-case Selection. [Plot: APFD (0 to 1.00) across releases R1.0 to R1.5 for Baseline, History1, and History2.] Runeson and Ljung, “Improving Regression Testing Transparency and Efficiency with History-Based Prioritization,” ICST 2011
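For readers unfamiliar with the y-axis of this plot, APFD (Average Percentage of Faults Detected) is commonly defined as 1 - (TF1 + ... + TFm) / (n * m) + 1 / (2n), where n is the number of tests, m the number of faults, and TFi the position of the first test that reveals fault i. The sketch below assumes that definition; the test names and the fault-to-test mapping are made up, and every fault is assumed to be revealed by at least one test in the order.

```python
def apfd(test_order, tests_revealing_fault):
    """Average Percentage of Faults Detected for one test ordering."""
    n = len(test_order)
    position = {test: index + 1 for index, test in enumerate(test_order)}
    # Position of the first test in the ordering that reveals each fault.
    first_positions = [
        min(position[test] for test in tests)
        for tests in tests_revealing_fault.values()
    ]
    m = len(first_positions)
    return 1 - sum(first_positions) / (n * m) + 1 / (2 * n)

# Hypothetical suite: both faults are only revealed by late tests.
tests_revealing_fault = {"f1": ["t3"], "f2": ["t3", "t4"]}
baseline = apfd(["t1", "t2", "t3", "t4"], tests_revealing_fault)
prioritized = apfd(["t3", "t4", "t1", "t2"], tests_revealing_fault)
print(baseline, prioritized)
```

Running the fault-revealing tests earlier raises APFD, which is the effect a history-based ordering aims for.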
  • 110. Warning Prioritization. [Plot: precision (%), axis 0 to 18, versus warning instances by priority (0 to 100) for History and Tool.] Kim and Ernst, “Which Warnings Should I Fix First?” FSE 2007
  • 111. Other Topics • Explanation: why has it been predicted as defect-prone? • Cross-project prediction • Cost-effectiveness measures • Active Learning/Refinement
  • 112. Defect Prediction 2.0. 1.0: New metrics, Algorithms, Coarse granularity.
  • 113. Defect Prediction 2.0. 1.0: New metrics, Algorithms, Coarse granularity. 2.0: Finer granularity, Noise Handling, New customers.
  • 115. 2013
  • 116. MSR 2013: Back to roots. Tom Zimmermann, General chair. Alberto Bacchelli, Mining Challenge Chair. Massimiliano Di Penta and Sung Kim, Program co-chairs.
  • 117. MSR 2013: Back to roots. Tom Zimmermann, General chair. Alberto Bacchelli, Mining Challenge Chair. Massimiliano Di Penta and Sung Kim, Program co-chairs. February 15.
  • 118. Some slides/data are borrowed with thanks from • Tom Zimmermann, Chris Bird • Andreas Zeller • Ahmed Hassan • David Lo • Jaechang Nam, Yida Tao • Tien Neguan • Steve Counsell, David Bowes, Tracy Hall and David Gray • Wen Zhang