Size Doesn’t Matter? On the Value of Software Size Features for Effort Estimation


Ekrem Kocaguneli, Tim Menzies : WVU,USA
Jairus Hihn : JPL, USA
Byeong Ho Kang : UTAS, Aus

PROMISE'12, Lund Sweden


  1. Size Doesn’t Matter? On the Value of Software Size Features for Effort Estimation
     Ekrem Kocaguneli, Tim Menzies: WVU, USA
     Jairus Hihn: JPL, USA
     Byeong Ho Kang: UTAS, Aus
  2. Sound bites (Sept 2012, PROMISE’12)
     Size matters! But lack of size features can be tolerated
     • caveat: need to first prune irrelevancies
  3. Role of Size Features in SEE
     Size features are at the heart of some of the most widely used software effort estimation (SEE) methods:
     • COCOMO is based on lines of code (LOC)
     • Function points (FP) are based on logical transactions
     • Various others exist, such as number of requirements, number of modules, number of web pages and so on…
  4. Role of Size Features in SEE (cntd.)
     Size features have their advantages and disadvantages:
     • LOC counting can be automated and is good a posteriori, but LOC is difficult to estimate early on
     • FP provides a size metric based on early design information; hence it is more accurate a priori
     • FP counting cannot be automated and is subjective… even though training reduces the variation in estimates
  5. Objections to Size Features
     Although particular size features may have their advantages in certain scenarios, there is strong opposition…
     • “Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.” (Bill Gates)
     • “This (referring to LOC) is a very costly measuring unit because it encourages the writing of insipid code, but today I am less interested in how foolish a unit it is from even a pure business point of view.” (E. W. Dijkstra)
     So we question: under what conditions are size features actually a “must”, and can we compensate for their absence?
  6. So let’s check…
     If we throw away size attributes, what happens?
  7. If we remove “size”, what happens?
     Compare standard successful methods run on reduced and full data sets, using 7 error measures and 13 data sets…
     • Full data sets include size features
     • Reduced data sets lack size features
     Methods: CART, 1NN
     Error measures: MAR, MMRE, MdMRE, Pred(25), MMER, MBRE, MIBRE
     Datasets: Cocomo81, Cocomo81o, Cocomo81e, Cocomo81s, Nasa93, Nasa93c1, Nasa93c2, Nasa93c5, Sdr, Desharnais, DesharnaisL1, DesharnaisL2, DesharnaisL3
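
The slide only names the seven error measures. As a rough sketch, they can be computed from paired actual and predicted efforts using the definitions commonly given in the effort estimation literature (MRE divides the absolute residual by the actual value, MER by the predicted value, BRE by the smaller of the two, IBRE by the larger). The function name `error_measures` is hypothetical, and the exact variants used in the talk are an assumption.

```python
from statistics import mean, median


def error_measures(actual, predicted):
    """Seven common SEE error measures for paired actual/predicted efforts.

    Assumes all efforts are strictly positive (definitions as usually
    stated in the literature, not taken from the slides).
    """
    pairs = list(zip(actual, predicted))
    abs_res = [abs(a - p) for a, p in pairs]                  # absolute residuals
    mre = [r / a for r, (a, _) in zip(abs_res, pairs)]        # relative to actual
    mer = [r / p for r, (_, p) in zip(abs_res, pairs)]        # relative to estimate
    bre = [r / min(a, p) for r, (a, p) in zip(abs_res, pairs)]
    ibre = [r / max(a, p) for r, (a, p) in zip(abs_res, pairs)]
    return {
        "MAR": mean(abs_res),
        "MMRE": mean(mre),
        "MdMRE": median(mre),
        # Percentage of estimates whose MRE is at most 0.25 (higher is better)
        "Pred(25)": 100.0 * sum(m <= 0.25 for m in mre) / len(mre),
        "MMER": mean(mer),
        "MBRE": mean(bre),
        "MIBRE": mean(ibre),
    }


if __name__ == "__main__":
    for name, value in error_measures([100, 200], [80, 250]).items():
        print(name, round(value, 3))
```

Note that Pred(25) is the only measure of the seven where larger values are better, which matters when counting wins and losses across measures.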
  8. Evaluation (cntd.)
     Methods: pop1NN, CART, 1NN
     Error measures: MAR, MMRE, MdMRE, Pred(25), MMER, MBRE, MIBRE
     Datasets: Cocomo81, Cocomo81o, Cocomo81e, Cocomo81s, Nasa93, Nasa93c1, Nasa93c2, Nasa93c5, Sdr, Desharnais, DesharnaisL1, DesharnaisL2, DesharnaisL3
     • Compare pop1NN against CART & 1NN
     • On multiple data sets collected via COCOMO, COCOMOII and FP
     • Using 7 error measures
     • Mann-Whitney, 95%
     • Why CART? Dejaeger et al. TSE 2012
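
The slide mentions a Mann-Whitney test at 95% but does not spell out how a "loss" is scored. The sketch below shows one plausible reading, under stated assumptions: the two-sided test uses the normal approximation without tie correction (rough for small samples), and method A "loses" to method B when the two error samples differ significantly and A's median error is higher. The names `mann_whitney_differs` and `loses` are hypothetical.

```python
import math
from statistics import median


def mann_whitney_differs(xs, ys, z_crit=1.96):
    """Two-sided Mann-Whitney U test, normal approximation, no tie correction.

    Returns True when the samples differ at roughly the 95% level.
    """
    combined = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values and give each the average 1-based rank
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(xs), len(ys)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return abs(u_x - mu) / sigma > z_crit


def loses(errors_a, errors_b):
    """Assumed loss rule: A is significantly different from B and has higher median error."""
    return mann_whitney_differs(errors_a, errors_b) and median(errors_a) > median(errors_b)


if __name__ == "__main__":
    low = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    high = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
    print(loses(high, low), loses(low, high))
```

For a measure where higher is better, such as Pred(25), the median comparison in `loses` would need to be reversed.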
  9. Results (full data has “size”, reduced has not)
     CART on reduced data set vs. CART on full data set
     • Last column shows total loss count of CART run on reduced data set (i.e. no size features)
     • In 7 of 13 tests, taking out size makes CART perform worse
  10. Results (full data has “size”, reduced has not)
     Total loss counts of CART and 1NN run on reduced data vs. their variants run on full data… (copied from last slide)
     • Standard methods are better off with the size attributes of the data sets…
     • I.e. they cannot compensate well for the lack of size attributes
  11. New idea
     If we prune data irrelevancies, can we survive losing size attributes?
  12. Instance selection
     • Chang (1974)
       – Most of the instances are uninformative.
       – Reduced data sets of size 514, 150, 66 to 34, 14, 6 prototypes.
     • Li et al. (2009)
       – genetic algorithm for instance selection
     • Turhan et al. (2009)
       – instance selection as a filter for cross-company defect data
       – See also Kocaguneli et al. 2011
     • Kocaguneli et al. (2011) variance-based selection:
       – Dendrogram of clusters: prune sub-trees with large variances
     • Keung et al.’s (2011) Analogy-X
       – instance selection method for analogous entry
     • New idea, pop1NN: a very simple instance selector
  13. pop1NN: the urchin shape
     • We propose that a “popularity”-based method can compensate for the lack of size features
     • The “popularity” of an instance is the number of times it is the nearest neighbor of other instances
     • A sea urchin is a good picture of SEE data: popular central instances that are the closest neighbors of scattered instances…
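
The popularity definition above translates almost directly into code. The following is a minimal sketch, assuming numeric feature vectors and Euclidean distance (the slide does not name a distance measure); the function name `popularity` is hypothetical.

```python
import math


def popularity(rows):
    """For each row, count how many other rows have it as their nearest neighbor."""
    counts = [0] * len(rows)
    for i, row in enumerate(rows):
        others = [j for j in range(len(rows)) if j != i]
        # Euclidean distance is an assumption; ties go to the lowest index
        nearest = min(others, key=lambda j: math.dist(row, rows[j]))
        counts[nearest] += 1
    return counts


if __name__ == "__main__":
    rows = [(0, 0), (1, 0), (-2, 0), (0, 3), (5, 4)]
    print(popularity(rows))  # → [3, 1, 0, 1, 0]
```

In this toy data the central point `(0, 0)` is the nearest neighbor of three scattered points, which is the "urchin" pattern the slide describes.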
  14. Formally, this is rNN
     • rNN = Reverse Nearest Neighbor
       – E.g. how many residential areas would find a new store as their nearest choice?
       – E.g. to predict the popularity of a new cell phone plan, determine how many profiles have the plan as their best match, against the existing plans in the market.
     • Can be computed efficiently (rNN chaining)
       – see Lopez-Sastre et al., Fast Reciprocal Nearest Neighbors Clustering, Signal Processing, 2012, Vol. 92, pages 270-275
  15. So let’s check…
     If we throw away (1) size attributes and (2) irrelevant rows, then what happens?
  16. Details: pop1NN (cntd.)
     pop1NN is a 6-step procedure…
       1. Calculate distances between every pair of training instances
       2. Convert the distances of Step 1 into an ordering of neighbors
       3. Mark closest neighbors and calculate popularity
       4. Order training instances in decreasing popularity
       5. Decide which instances to select
          • Experiments with nearest neighbor on a hold-out set
       6. Return estimates for the test instances
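
The six steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: distance is Euclidean, Steps 1 to 3 are collapsed into a direct nearest-neighbor search, and Step 5 uses an assumed rule (keep the top-k most popular instances, with k chosen to minimize the mean absolute residual of 1NN on the hold-out set). The names `nearest` and `pop1nn` are hypothetical.

```python
import math


def nearest(query, rows, candidates):
    """Index (drawn from candidates) of the row closest to query."""
    return min(candidates, key=lambda j: math.dist(query, rows[j]))


def pop1nn(train_x, train_y, hold_x, hold_y, test_x):
    n = len(train_x)
    # Steps 1-3: find each training row's nearest neighbor and credit it
    pop = [0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        pop[nearest(train_x[i], train_x, others)] += 1
    # Step 4: order training instances by decreasing popularity (stable sort)
    ranked = sorted(range(n), key=lambda i: -pop[i])

    # Step 5 (assumed rule): pick the k whose top-k instances give the
    # lowest mean absolute residual for 1NN on the hold-out set
    def holdout_error(k):
        kept = ranked[:k]
        residuals = [abs(y - train_y[nearest(x, train_x, kept)])
                     for x, y in zip(hold_x, hold_y)]
        return sum(residuals) / len(residuals)

    best_k = min(range(1, n + 1), key=holdout_error)
    kept = ranked[:best_k]
    # Step 6: 1NN estimates for the test instances, using only the kept rows
    return [train_y[nearest(x, train_x, kept)] for x in test_x]


if __name__ == "__main__":
    train_x = [(0, 0), (1, 0), (-2, 0), (0, 3), (5, 4)]
    train_y = [10, 12, 9, 11, 50]
    print(pop1nn(train_x, train_y, [(0.2, 0.1)], [10], [(0.5, 0)]))
```

The dependent variable of the training rows is not used when computing popularity; it only enters through the hold-out experiments and the final estimates.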
  17. Results (reduced data)
     Loss values of pop1NN (on reduced data) vs. CART and 1NN (on full data)
     • pop1NN loses 2 out of 13 data sets against 1NN
     • pop1NN loses 4 out of 13 data sets against 1NN
  18. Discussion
  19. Conclusions
     • Successful methods (1NN & CART) cannot compensate for the lack of size attributes very well
       – Lack of size features decreases their performance in the majority of the data sets
     • When 1NN is augmented with a popularity-based pre-processor to come up with pop1NN
       – Lack of size features can be tolerated in most of the data sets
       – Caveat: need to first prune irrelevancies
     • Size features are essential for standard learners
       – Practitioners with enough resources to correctly collect size features should do so
       – Lacking such resources, pop1NN-like methods can compensate for the missing size features
  20. Future Work
     • pop1NN as a feature selector?
       – Lipowezky (1998): feature and case selection are similar tasks; both remove cells in the hypercube of all instances times all features.
       – So it should be possible to convert a case selection mechanism into a feature selector:
         • Transpose data
         • Nearby columns are correlated
         • Keep columns that are near no other
     • Active learning:
       – pop1NN does not use dependent variable information.
       – It can identify the popular instances of a data set and guide expert reflection on collecting dependent variable information
  21. Questions? Comments?