The Effects of Duration-based Moving Windows with Estimation by Analogy
Sousuke Amasaki* Chris Lokan†
* Okayama Prefectural University
† UNSW Canberra
Mensura 2015 in Kraków, Poland

In Mensura 2012, we focused on Moving Windows (MW) for Effort Estimation with Estimation by Analogy (EbA)
[Diagram: a timeline running from past to future. Project data for training feeds the EbA effort estimation model, which estimates a target project. When a new target arrives, the window retains projects up to the window size and drops off old projects, which may be useless.]
Conclusion in Mensura 2012 paper
•  MW could improve accuracy with EbA
•  Weaker effects with EbA than Linear Regression

Window policies matter
•  MW was examined with Linear Regression (LR) and two policies [IST2014]*
    •  Fixed-size: retain N projects in a window
    •  Fixed-duration: retain projects within N months
•  Results show the difference in accuracy improvement

* C. Lokan, E. Mendes. Investigating the use of duration-based moving windows to improve software effort prediction: A replicated study, Information and Software Technology 56(9), pp. 1063–1075, 2014.
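
The two window policies, together with the growing portfolio they are compared against, can be sketched as follows. This is a minimal illustration, not the authors' implementation. The `Project` record, the rule that a project counts as "past" once it has finished before the target starts, and the use of whole calendar months measured from the finish date are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Project:
    """Hypothetical project record used only for this sketch."""
    name: str
    start: date
    finish: date
    effort: float


def growing_portfolio(history, target):
    # Assumption: a project is usable once it finished before the target started.
    return [p for p in history if p.finish <= target.start]


def fixed_size_window(history, target, n):
    # Fixed-size policy: retain the N most recently finished past projects.
    if n < 1:
        raise ValueError("window size must be at least 1")
    past = sorted(growing_portfolio(history, target), key=lambda p: p.finish)
    return past[-n:]


def months_between(earlier, later):
    # Whole calendar months between two dates (days are ignored).
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def fixed_duration_window(history, target, n_months):
    # Fixed-duration policy: retain past projects finished within N months.
    return [
        p for p in growing_portfolio(history, target)
        if months_between(p.finish, target.start) <= n_months
    ]
```

With a fixed-size window the number of training projects is constant while the time span varies; with a fixed-duration window the time span is constant while the number of projects varies.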

Today's talk is about Duration-based Moving Windows
[Diagram: a timeline running from past to future, contrasting the following options.]
•  Window policy: Fixed-size (Mensura 2012); Fixed-duration
•  Estimation method: EbA with pre-selected features (Mensura 2012); EbA with on-time feature selection (for reality)

Research Questions
RQ1. Reconfirmation of Mensura 2012 results
Is there a difference in the accuracy of estimates between EbA with pre-selected features and EbA with on-time feature selection, using fixed-size windows?
RQ2. Evaluation of Fixed-Duration Windows
Is there a difference in the accuracy of estimates with and without MW, with the revised EbA and fixed-duration windows?
RQ3. Comparison between window policies
How do these results compare with results based on fixed-size windows?

The revised EbA
Mensura 2012
•  Select features on the basis of the whole dataset
    •  Wrapper approach
•  Use simple mean for estimation
Note: Unrealistic to use future projects
This study
•  Select features for every new target project
    •  Lasso for reducing computation costs
•  Use inverse rank weighted mean (IRWM) for estimation
Note: Contributes to estimation accuracy
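
The difference between the simple mean and the inverse rank weighted mean can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: it assumes numeric, already-normalized features, Euclidean distance, and the common IRWM definition in which the closest of k analogues gets weight k, the next k-1, and so on down to 1. Feature selection (wrapper or Lasso) is omitted.

```python
import math


def euclidean(a, b):
    # Distance between two feature vectors of equal length.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_efforts(training, target_features, k):
    """Efforts of the k closest analogues, closest first.

    `training` is a list of (features, effort) pairs; the names are
    hypothetical and used only for this sketch.
    """
    ranked = sorted(training, key=lambda fe: euclidean(fe[0], target_features))
    return [effort for _, effort in ranked[:k]]


def simple_mean(efforts):
    # Every analogue contributes equally.
    return sum(efforts) / len(efforts)


def inverse_rank_weighted_mean(efforts):
    # Assumed IRWM: weights k, k-1, ..., 1 for analogues ranked closest first.
    k = len(efforts)
    weights = range(k, 0, -1)
    return sum(w * e for w, e in zip(weights, efforts)) / sum(weights)
```

For example, with three analogues whose efforts are 100, 200 and 400 (closest first), the simple mean is 700/3 while IRWM gives (3·100 + 2·200 + 1·400)/6 = 1100/6, pulling the estimate toward the closest analogue.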

Dataset
Properties
•  High quality: rated A or B by ISBSG
•  Size measured with IFPUG 4.0 or later
•  Known actual effort
•  Not web projects
•  228 projects
Candidate predictors
•  Unadjusted FP
•  Language types
•  Development types
•  Platform types
•  Domain sector types
Same as Mensura 2012

Experiments
•  Mensura 2012 EbA vs. the revised EbA (for RQ1)
•  Growing Portfolio (use all past projects) vs. Moving Windows (for RQ2, RQ3)
Performance trend analysis: preference and statistical significance
Comparisons between:
    •  From 12 to 84 months (fixed-duration)
    •  From 20 to 120 projects (fixed-size)
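
The comparison between the growing portfolio and a moving window can be sketched as a chronological evaluation loop. This is a simplified illustration, not the procedure used in the study: the estimator is a placeholder (mean effort of the training projects), the window is fixed-size, and the percentage difference is assumed to be taken relative to the growing portfolio's mean error. Statistical significance testing is left out.

```python
from statistics import mean


def absolute_error(actual, estimate):
    return abs(actual - estimate)


def magnitude_of_relative_error(actual, estimate):
    # MRE: absolute error relative to the actual effort.
    return abs(actual - estimate) / actual


def chronological_errors(efforts, window_size=None):
    """Absolute errors when each project is estimated from earlier ones only.

    `efforts` lists actual efforts in chronological order. With
    `window_size=None` all past projects are used (growing portfolio);
    otherwise only the most recent `window_size` projects are used.
    """
    errors = []
    for i in range(1, len(efforts)):
        past = efforts[:i]
        training = past if window_size is None else past[-window_size:]
        estimate = mean(training)  # placeholder for a real estimator such as EbA
        errors.append(absolute_error(efforts[i], estimate))
    return errors


def percent_difference_in_mean(window_errors, portfolio_errors):
    # Negative values favor the moving window, positive values the portfolio.
    portfolio_mean = mean(portfolio_errors)
    return 100.0 * (mean(window_errors) - portfolio_mean) / portfolio_mean
```

Repeating this for every window size in the ranges above and plotting the percentage difference against window size gives a performance trend of the kind analyzed in the results.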

Results: fixed-size windows with the revised EbA
[Fig. 1: Results with Fixed-size Window, modified EbA with k = 5. Panels: (a) Differences in mean MAE, (b) Differences in mean MRE. x-axis: Window Size (number of projects), 20 to 120. y-axis: Differences in mean AE (%) and Differences in mean MRE (%).]
Excerpt from the paper shown with the figure: "Figure 1 and Table 2 revealed characteristics of moving windows compared to the growing portfolio: With windows of up to 60 projects, MAE showed no significant preference for any approach. The line starts below zero and quickly goes above zero (favoring the growing portfolio), but the difference was not significant as shown [...]"
Number of neighbors: k = 5
•  GP (growing portfolio) was advantageous at smaller window sizes, but not significantly
•  MW became significantly advantageous at medium window sizes

Results: comparisons between the old and the revised EbA
[Fig. 1 (shown again): Results with Fixed-size Window, modified EbA with k = 5. Panels: (a) Differences in mean MAE, (b) Differences in mean MRE. x-axis: Window Size (number of projects).]
Number of neighbors: k = 5
•  Trends were the same, but effective sizes and ranges were different
•  The best k moved from k=2 (Mensura 2012) to k=5
•  The improvement by MW was clearer in statistical significance

Results: fixed-duration windows with the revised EbA
[Fig. 2: Results with Fixed-duration Windows, EbA with k = 5. Panels: (a) Differences in mean MAE, (b) Differences in mean MRE. x-axis: Window Size (calendar months), 20 to 80. y-axis: Differences in mean AE (%) and Differences in mean MRE (%).]
Excerpt from the paper shown with the figure: "[...] growing portfolio are larger with EbA than with LR, and the range of durations for which windows are advantageous is narrower with EbA than with LR. The difference in advantageous window sizes and their number between EbA and LR were reported in [4]. These observations were common between this study and [4]."
Number of neighbors: k = 5
•  GP was advantageous at smaller window sizes, but not significantly
•  MW became significantly advantageous at medium window sizes
•  Fewer window sizes were significant than with fixed-size windows

Results: comparison to the past study [IST2014]
[Fig. 2 (shown again): Results with Fixed-duration Windows, EbA with k = 5. Panels: (a) Differences in mean MAE, (b) Differences in mean MRE. x-axis: Window Size (calendar months).]
Number of neighbors: k = 5
•  The overall trend was the same in the two studies
•  Fixed-size windows were more effective than fixed-duration windows
•  The effective window size became larger and its range narrower

Answers to RQs
RQ1. Reconfirmation of Mensura 2012 results
The change in estimation method made a difference, improving the accuracy of estimates.
RQ2. Evaluation of Fixed-Duration Windows
Fixed-duration windows can make a difference and are effective in improving estimation accuracy.
RQ3. Comparison between window policies
Both the fixed-size and fixed-duration window policies can lead to significantly better estimation accuracy, but fixed-size windows made a clearer difference.

Practical implications
1. Moving Windows is effective
This and past studies showed its effectiveness with the major effort estimation methods, LR and EbA.
2. Fixed-size policy looks better for estimation
This and past studies showed a clearer difference when using fixed-size windows. Practitioners may want to rethink how they choose reference projects.
3. Effective window sizes might be different even among practitioners
EbA resembles practitioners' thinking. The fact that different EbA options resulted in different window ranges partly explains the differences among practitioners.

Threats to Validity
Dataset
•  The results were based only on the ISBSG dataset
•  It is difficult to generalize the results
EbA
•  Limited to specific options
    •  More accurate or more realistic settings were not explored

Conclusion
•  Fixed-duration windows work with EbA
    •  Under a more realistic situation
•  The results brought some practical implications
    •  e.g., the fixed-size policy is more suitable
Future Work
•  Exploration of EbA options
•  Additional experiments on other datasets

We welcome questions!
Contact info:
Sousuke Amasaki: amasaki@cse.oka-pu.ac.jp
Chris Lokan: c.lokan@adfa.edu.au
