Nothing else Matters: what Predictive Model should I use? Massimiliano Di Penta, University of Sannio, Italy ...
License Dependency Issues • Two GPLv2 source packages (lvm2, pilot-link) were using the library readline (GPLv3+) ...
In summary • Different characteristics of a software system can induce defects • Some can be used to build predictors, some...
So... we know how to correlate various kinds of symptoms to fault-proneness... That’s great!
Incompatible licensing! Propagate clone changes! Poor ...
That’s too much! • We could build models that warn the developer against anything • It would be better to • Avoi...
False Alarm: Clones • Common wisdom suggests that code cloning could be harmful • Recent (and past) studies suggested cl...
Promise 2011: Keynote 2 - "Nothing else Matters: What Predictive Model should I use?"

Promise 2011:
Keynote 2 - "Nothing else Matters: What Predictive Model should I use?"
Massimiliano Di Penta


Promise 2011: Keynote 2 - "Nothing else Matters: What Predictive Model should I use?"

  1. Nothing else Matters: what Predictive Model should I use? Massimiliano Di Penta, University of Sannio, Italy. dipenta@unisannio.it, http://www.rcost.unisannio.it/mdipenta
  2. University of... what? FAQ when people meet me for the first time at a conference
  9. About me
      • Not really a wizard of predictor models
      • Software evolution
      • Mining software repositories
      • Experimental software engineering
      • Search-based software engineering
  10. Interests
  11. Interests: Design and experiment material
      [Table: assignment of Conallen/UML and Claros/WfMS to Groups 1-4 across Lab 1 and Lab 2.]
      • Subjects received:
        • Short description of the application
        • Diagrams
        • Source code
  12. Interests, continued: Example of CS Pair
      [Figure: side-by-side listings of CrNoOutgoingTransitions.java and CrNoIncomingTransitions.java (ver. 1.1, package org.argouml.uml.cognitive.critics), with marked CS regions 1 to 4 inside their predicate2(Object dm, Designer dsgr) methods.]
  13. Interests, continued: Evolution of vulnerability density
      • Samba - Overall
        • Splint vulnerabilities tend to have a lower density (thorough analysis)
        • Initially, a high number of vulnerabilities detected by RATS
          • Pre-release, then vulnerabilities removed by security patches
        • No trend detected (ADF test)
      • Squid - Buffer Overflows
        • Buffer overflows introduced at release 2.3 STABLE3
        • Then removed in the subsequent releases 2.4STABLE7 and 2.5STABLE7 with proper security patches
          • As documented in the system history
  14. Interests, continued: Recall the content of a licensing…
      [Figure: header comment of a Mozilla C++ source file (MPL 1.1/GPL 2.0/LGPL 2.1 license block), annotated with: license (MPL+GPL+LGPL), contributor, copyright statement, copyright year. D. M. German and M. Di Penta]
  15. Interests, continued: RQ3 - CSBF Graph (excerpt)
      [Figure legend: Blue/cyan: FreeBSD; Red/orange: OpenBSD; Yellow: common]
  16. Interests, continued: Association rules vs. Granger
      [Figure: changes to files A-E occurring in snapshots S1-S9.]
      • Association rules: A→C, B→D, D→E
      • Granger causality test: A→{B,D}, C→{D,E}
  17. Outline
      • Many models ...
      • Providing the right suggestions to developers
      • Approaching causation
      • Bias in datasets
      • Model usability
  18. Some popular prediction models
      • Bug prediction models suggest artifacts that will likely exhibit faults
      • Change impact models suggest artifacts likely to be impacted by changes to other artifacts
  19. A few examples...
      • Code metrics (e.g., CK suite) [Basili et al., 1996; Gyimothy et al., 2005]
      • Process metrics [Moser et al., 2009; Hassan, 2009]
      • Bug caching/previous defects [Ostrand et al., 2005; Kim et al., 2007]
      • Bug-introducing changes [Kim et al., 2008]
      • Recent survey and comparison:
        • Marco D’Ambros, Michele Lanza, and Romain Robbes: Evaluating defect prediction approaches: a benchmark and an extensive comparison. Empir. Software Eng., 2011 (available online)
  20. The good news
      • Most of these models perform very well
      • They have been evaluated on industrial as well as open source data sets
      • They capture different facets of software complexity
        • which is likely to be a symptom (and cause?) of fault-proneness
  21. Is that true?
      • Indeed, there have been substantial research advances in this field
      • However, industry seldom uses predictive models
        • Or uses very simple ones...
        • Of course there are exceptions...
  22. Open problems and barriers to adoption of bug prediction models
      • ESEC/FSE 2011 Project Working Group
        • http://pwg.sed.hu
      • We surveyed conference participants
      • Awarded best working group
      • Thanks to the exceptional team:
        • Emitzá Guzmán Ortega, Amir Molzam Sharifloo, Dávid Tengeri, Melinda Tóth, Zuoning Yin, and Marco D’Ambros (group leader)
  23. Let’s start by looking at the kinds of problems we face ...
  24. Nothing else Matters
      • Defects are certainly introduced when the code is very complex, but...
      • ...there are many other characteristics of the software we should be aware of
        • Design, lexicon, legal issues, when changes are performed ...
        • They can also relate to bugs
  25. Increasing the level of abstraction
      • Often we look at the quality of code
      • Let’s try to observe the design instead
      • Antipatterns encode poor design choices
        • Just as design patterns encode (possibly) good design choices
      • Various catalogues exist; the one by Brown (40 antipatterns) is very popular
  26. Examples of antipatterns
      • LazyClass: a class that does too little
      • MessageChain: a functionality that requires a long chain of method calls between classes
      • Blob: a large class centralizing behavior
  27. Antipatterns and fault/change-proneness
      • Like metric models, but at a higher level of abstraction
      • Empirical study carried out on several releases of four systems:
        • ArgoUML, Eclipse, Mylyn, and Rhino
      Foutse Khomh, Massimiliano Di Penta, Yann-Gaël Guéhéneuc, and Giuliano Antoniol: An Exploratory Study of the Impact of Antipatterns on Class Change- and Fault-Proneness. Empirical Software Engineering, 2011 (available online)
  28. Method
      • H0: proportion of faulty antipattern classes = proportion of faulty non-antipattern classes
        • Fisher’s exact test and Odds Ratio (OR): $OR = \frac{p/(1-p)}{q/(1-q)}$
      • Logistic regression model to study the significant effect of each kind of antipattern: $\pi(X_1, X_2, \ldots, X_n) = \frac{e^{C_0 + C_1 \cdot X_1 + \cdots + C_n \cdot X_n}}{1 + e^{C_0 + C_1 \cdot X_1 + \cdots + C_n \cdot X_n}}$
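The statistics named on the Method slide can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the tooling used in the study. The function names (`odds_ratio`, `fisher_exact_two_sided`, `fault_probability`) and the example counts are hypothetical. The 2x2 table is assumed to hold antipattern / non-antipattern classes in rows and faulty / not faulty classes in columns.

```python
from math import comb, exp


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    Rows: antipattern / non-antipattern classes; columns: faulty / not faulty.
    With p = a/(a+b) and q = c/(c+d), [p/(1-p)] / [q/(1-q)] = (a*d)/(b*c).
    """
    if b * c == 0:
        return float("inf")  # degenerate table: odds ratio unbounded or undefined
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the same 2x2 table."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x faulty classes in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at most as likely as the observed one.
    p_value = sum(prob(x) for x in range(lo, hi + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


def fault_probability(coefficients, xs):
    """pi(X1..Xn) = e^z / (1 + e^z), with z = C0 + C1*X1 + ... + Cn*Xn."""
    z = coefficients[0] + sum(c * x for c, x in zip(coefficients[1:], xs))
    return exp(z) / (1.0 + exp(z))


# Made-up counts, not taken from the study: 3 faulty and 1 clean antipattern
# classes versus 1 faulty and 3 clean non-antipattern classes.
print(odds_ratio(3, 1, 1, 3))
print(fisher_exact_two_sided(3, 1, 1, 3))
# Made-up coefficients C0 = 0, C1 = 1 evaluated at X1 = 0.
print(fault_probability([0.0, 1.0], [0.0]))
```

An odds ratio above 1 means that the odds of being faulty are higher for antipattern classes than for the others. The Fisher p-value indicates whether the difference in proportions is large enough to reject H0. The logistic function only evaluates an already fitted model; estimating the coefficients is outside this sketch.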
  29. Antipatterns and Fault-Proneness
      [Figure: four panels (ArgoUML, Eclipse, Mylyn, Rhino), each plotting the odds ratio (y-axis) against releases (x-axis).]
  30. Fault-Proneness: What Antipatterns?
      [Figure: for ArgoUML, Eclipse, Mylyn, and Rhino, the percentage of releases (0% to 100%) where each antipattern significantly correlates with fault-proneness. Antipatterns shown: AntiSingleton, Blob, CDSBP, ComplexClass, LargeClass, LazyClass, LongMethod, LPL, MessageChain, RPB.]
  31. Code Lexicon
      • Various recent studies have investigated the relationship between code lexicon and quality attributes
        • Maintainability, fault-proneness [Takang et al., 1996; Lawrie et al., 2006, 2007]
      • “Conceptual” CK metrics and their use to predict fault-proneness
        • Conceptual Cohesion [Marcus et al., 2005, 2008]
        • Conceptual Coupling [Poshyvanyk and Marcus, 2006]
        • Predictive models [Ujhazi et al., 2010]
        • Conceptual metrics capture different components of fault-proneness than structural metrics
  32. 32. Developers take care of renaming • Renaming: Example • add meaning: type → authtype (T); resource → visitedResource (E) • remove meaning: copyJAR → copy (T); fTypeBinding → fBinding (E) • same meaning: committed → commited (T); methodsBuffer → methodsBuffered (E) • gen/spec: scanCurrentPosition → scanCurrentLine (E); thrownExceptionSize → thrownExceptionLength (E) • opposite meaning: findNextLevelChildrenByElementName → findNextLevelParentByElementName (E); hasClosingBracket → hasOpeningBracket (E) • unrelated meaning: createContents → createControl (E); getClusterReceiver → getChannelReceiver (T) • Laleh Mousavi Eshkevari, Venera Arnaoudova, Massimiliano Di Penta, Rocco Oliveto, Yann-Gaël Guéhéneuc, Giuliano Antoniol: An exploratory study of identifier renamings. MSR 2011: 33-42
  33. 33. Licensing can be faulty too! • In 2004, MySQL AB changed the license of its client libraries from LGPL v2.1 to GPL v2 to prevent industrial companies from using the libraries within proprietary products • Unintended consequences: • PHP systems were no longer able to connect to MySQL • PHP license is incompatible with the GPL v2 • MySQL addressed this problem by adding the MySQL FOSS License Exception to the GPL v2 • Changing the license of a FOSS system might have unintended or undesirable consequences for its legitimate users
  34. 34. Wrong license changes • Mozilla: NPL → NPL v1.1-style + GPL v2 + LGPL v2.1 (DUAL, 2914) • Mozilla: NPL → Dual MPL GPL-style + MPL (DUAL, 1274) • Mozilla: Dual MPL GPL-style + MPL → NPL (BUG, 1194) • Mozilla changed its license from the NPL (commercial) to a combination of multiple open source licenses (MPL + GPL) • At some point someone changed some files back to NPL (bug #98089) • Massimiliano Di Penta, Daniel M. Germán, Yann-Gaël Guéhéneuc, Giuliano Antoniol: An exploratory study of the evolution of software licensing. ICSE (1) 2010: 145-154
  35. 35. Licensing Inconsistencies in RPM Packages • Different kinds of problems: 1. declared license inconsistent with respect to the source code 2. dependencies create license incompatibility • [Diagram: a binary package (Requires: Lib1) built from a source package (License: GPLv2) whose source files carry different licenses (Source 1: GPLv2, Source 2: LGPL, Source 3: BSD, Source 4: GPLv3); the required library Lib 1 is licensed GPLv3]
  38. 38. License Dependency Issues • Two GPLv2 source packages (lvm2, pilot-link) were using the library readline (GPLv3+) • License evolution problem • PHP was dynamically linking readline, a violation of the GPLv3+ • Problem was created by a build script • PHP either uses readline (GPLv3+) or libedit (BSD3) depending on what it finds
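A dependency check of this kind could be sketched as follows. This is an illustration only, not the tool used in the study: the `INCOMPATIBLE` table is a hypothetical, deliberately tiny stand-in that encodes just the two cases named on the slide, the function name is made up, and a real check would need a complete, legally reviewed compatibility matrix.

```python
# Hypothetical, deliberately incomplete table of (package license,
# dependency license) pairs treated as incompatible. It only encodes
# the cases mentioned on the slide; it is not legal advice.
INCOMPATIBLE = {
    ("GPLv2", "GPLv3+"),
    ("PHP License", "GPLv3+"),
}


def dependency_conflicts(packages):
    """Return (package, license, dependency, dependency license) tuples
    for every declared dependency whose license clashes with the package's.

    `packages` maps a package name to (declared license, [required packages]).
    """
    conflicts = []
    for name, (license_, requires) in packages.items():
        for dep in requires:
            dep_license = packages[dep][0]
            if (license_, dep_license) in INCOMPATIBLE:
                conflicts.append((name, license_, dep, dep_license))
    return conflicts


# Package names and licenses taken from the slide; the dependency lists
# are simplified for the example.
packages = {
    "readline": ("GPLv3+", []),
    "libedit": ("BSD3", []),
    "lvm2": ("GPLv2", ["readline"]),
    "pilot-link": ("GPLv2", ["readline"]),
    "php": ("PHP License", ["libedit"]),
}

conflicts = dependency_conflicts(packages)
for name, license_, dep, dep_license in conflicts:
    print(f"{name} ({license_}) requires {dep} ({dep_license})")
```

In this sketch PHP is clean because it is declared against libedit; replacing that entry with readline, which is what the build script did when it found readline installed, makes the check report PHP as well.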
  39. 39. In summary • Different characteristics of a software system can induce defects • Some can be used to build predictors, some are good just to raise warnings • Many studies showed that these models capture different dimensions of fault-proneness
  40. 40. so... we know how to correlate various kinds of symptoms to fault-proneness... That’s great!
  41. 41. [Speech bubbles with warnings aimed at a developer] • Incompatible licensing! • Propagate clone changes! • Poor lexicon! • Poor design! • You’ve just changed a pointer ref.! • Code is getting too complex! • You’re touching too many files!
  42. 42. That’s too much! • We could build models that warn the developer against anything • It would be better to • Avoid information overload [Murphy, 2007] • Avoid false alarms based on common wisdom • Provide hints at the right time, in the right context • Also, we should add qualitative justification to our models • To at least justify the cause-effect relation
  43. 43. False Alarm: Clones • Common wisdom suggests that code cloning could be harmful • Recent (and past) studies suggested clones are not necessarily harmful [Kapser and Godfrey, 2008; Krinke, 2007; Koschke and Göde, 2011] • Koschke and Göde reported that only 15% of clones undergo unintended inconsistent changes • Developers use cloning as a development practice
