Benchmarking Software Estimation
Methods
Adam Boyd, Sameer Huque, Kristine Pachuta, Travis Pincoski, John Trumble
July 11,...
SEVERAL EXPERIMENTS WERE RUN WITH THE GMD
DATA
• Different methods were attempted to achieve a higher R2 for the
predictiv...
OTHER OPTIONS: DEVELOP PRODUCT-ORIENTED COST
ELEMENT STRUCTURE
• Develop product-oriented cost element
structure (CES) tha...
Presentation prepared by colleagues from the McDonough School of Business at Georgetown University (Washington, D.C., USA) as the result of two months of joint research with the GMD architecture team (Lima, Peru), which I led.

Published in: Technology, Business
  • Thank you so much for the opportunity to work with your team over the last few months. We are excited to be here in Lima and share the results of our work.
  • Today we will briefly review the project scope and deliverables that we set a couple of months ago. Then we will present our findings across our best practices research and the interviews we conducted. Next we will dive into the data, providing recommendations for improving your data set and the results of our comparisons to a third party data set. Finally, we want to conclude by offering several suggestions outside of the data modeling for you to consider.
  • As we agreed in May, the scope of this project was to: compile a list of software estimation methods, best practices and benchmarks used at competing firms in the US; analyze GMD data and summarize findings; analyze third-party data; and present data and suggestions to GMD project sponsors.
  • We went about this in the following way. First, we researched industry best practices; we will discuss these in a moment. Then we interviewed subject matter experts across companies of multiple sizes and in academia; we will summarize those results today. Next, we analyzed both your data set and a third-party data set in an effort to generate recommendations and an equation for you to use moving forward. Finally, we looked beyond the data, taking what we learned in the best practices research and interviews to suggest additional process improvements that promote efficiency.
  • We wanted to provide you with a roadmap to improve productivity and estimation consistency. To do this, we feel you need to take the following steps. First, continually improve the consistency, reliability and productivity of your developers. Next, support consistent and reliable project planning. Then, support timely and effective project estimates, and include bug fixes, re-work and documentation in these estimates. Finally, build and deploy in-house estimating and productivity planning capabilities. Throughout our presentation today, we will provide you with data and analysis to help you move along this roadmap.
  • We reviewed journal articles, white papers, news and blogs to determine several key themes in current development best practices research. First, the best measurements are both quantitative and qualitative: data should be refined and measured; data analysis should be coupled with soft skills analysis; the best developers may not make the best leaders; and measures are best used in large distributed projects. Next, some data is easy to game. For example, counting lines of code is easy to trick: a developer can write one line or three and accomplish the same task. To get around this, make sure that the metrics are not de-motivating. They should not be perceived as punitive and should be tracked at a peer, not manager, level. Finally, five key characteristics should be tracked. First, the data should have the breadth to handle multiple technologies. Then it should have the depth to look at all levels of the application. It should be explicit enough to test against industry standards, actionable enough to show where improvements can be made, and automated so as not to detract from the workload.
  • One of the best recommendations that surfaced from our best practices research was the need to align the goals of your corporation with both your strategic goals as a firm and your technical goals. On your website, we noted that quality and efficiency, as well as customer satisfaction, were a few of your corporate goals. These can all be easily aligned with several of the themes in our best practices research. Your corporate goal of quality can be translated into a GMD-specific goal of product quality and an individual developer goal of software quality. The corporate goal of efficiency means on-time delivery for GMD and therefore for the developers on the project. Customer satisfaction remains an overarching goal at all levels. Translating and instilling these goals in your employees at all levels will help the entire organization.
  • To further understand industry best practices being used in the field, we talked to representatives of organizations of all sizes and industry functions, from consulting firms and small government contractors to huge multinational corporations.
  • One recurring theme in the interviews we conducted was that it is difficult to quantify the performance of individual developers. Why? Experts gave many reasons: 1. De-motivation. 2. Non-quantifiable skills are better predictors of productivity and quality: “identify the developers …”, “peer code review is key”. 3. Measurement doesn’t allow for the possibility that great developers influence the solution: “a great developer …”, “don’t penalize a developer for taking the time to exceed expectations”.
  • So let’s briefly take a closer look at the feedback we collected from industry experts. Co-Director, Georgetown + theme: challenges to lines of code analysis. Chief Technologist, SWComplete + theme: soft skills.
  • Former Head, AOL Digital Content + theme: teams and code. CIO, Philips IT + theme: teams and soft skills. Director, Thoughtworks Consulting + theme: multiple developer sources.
  • Common theme: apart from IBM most don’t try to track performance of individual developers for varying reasons: + focus on teams + focus on balance of hard and soft skills + focus on testing
  • Travis has gone through our approach to the dataset provided, the third-party dataset we acquired, and the details of the model based on a few small changes. In doing so, he has already touched on each of these recommendations, but I’m going to take a step back and discuss each of them in a little more detail, with some examples, to emphasize precisely what we believe needs to happen going forward to increase the accuracy and reliability of any predictive model. Please note that these recommendations apply only to the dataset provided and are not necessarily overall recommendations for GMD in using this practice. After I cover each of these data recommendations, Sameer will provide top-level recommendations for GMD going forward. OK, now back to the data set. The first is to re-baseline the regression model to the “average” GMD programmer across the different languages. The second is to re-define the variable used for the different programming languages. The third is to expand upon the current complexity variable used in the dataset.
  • The current model, prior to any of our adjustments, is built to account for all of the programmers at GMD, or at least everyone included in the dataset provided. A major factor driving a low coefficient (R-squared) is the wide variation in the talent levels of the programmers on staff at GMD. While the initial model does include variables for time on the job and experience with the programming language, there is no variable for the inherent talent level of each programmer. Our approach was to re-baseline the model to the programmers of “average” talent level, provided by Luis, and then apply this model, which has a higher demonstrated coefficient (R-squared measure), to the entire workforce to check for errors. This model will help illustrate who the high-performing programmers are (those who repeatedly beat the predicted times, as measured by their errors against the predictions) and may point out which developers need more seasoning or training to achieve what is considered average performance. For example, if Travis consistently completed work in less time than the model predicted, it would lead you to believe that he was one of your higher performers. Management can compare these data points with what they already know and believe about the talent level of individual programmers, and the “average” programmer in the model can be adjusted accordingly.
  • The current model appears to number the programming language variable in the order the languages were created: older languages have low identifying numbers, while more recent languages such as SQL and Crystal have high identifying numbers. This ordering has no predictive relationship to the construction time estimates being modeled. Said another way, there is no direct relationship between “time to construct” and the values applied to each programming language on the current scale. We recommend simplifying this variable either by grouping languages by type (such as imperative, declarative, functional or object-oriented) and numbering them accordingly, or by ordering the languages in terms of “difficulty” so that the scale actually pertains to the predicted time to construct. If neither approach is used moving forward, then this variable should be treated as categorical and coded as a set of binary variables; that is to say, the programming language is either Java or it is not, it is either Visual Basic or it is not. This way the predicted time to construct is not artificially inflated or deflated by a numbering system that has no predictive meaning. Just to be clear, without the understanding needed to reorder the programming languages or to provide an accurate grouping, we treated the variable as binary in the adjusted model that we have provided and that Travis discussed earlier.
  • The current model, prior to any of our adjustments, uses a complexity scale of baja, media, or alta. While it may be easier and more efficient to apply this simple approach to the complexity of each job, Travis touched on the residual errors that grew with the complexity rating of the job. Baja had a low root mean squared error of 1.14, but the alta projects had an error nearly four times as high, at 4.29. There are a few different ways you could approach adjusting this variable. The first would be to assign difficulty ratings that further stratify the complexity of these projects. For example, baja would be a 1, media would be defined as 3 times as difficult (so 3), and alta would be 3 times as difficult as media (so 9). This difference would help show the significant variance in difficulty and may lead to a more accurate model. A second approach, building on the first, would be to add further difficulty definitions between the three already in use: something between baja and media and something else between media and alta. It wouldn’t necessarily be an even split between the projects either; for example, one alta project might be twice as difficult as another and would need to be defined as such. For example, in Kristine’s experience at Bloomberg, different projects are delineated with 1, 2, 3, 5, 8 and 12 as difficulty measurements. This allows a more accurate representation of the spread of difficulty across projects and prevents projects from being lumped into a category that is too broad for accurate statistical analysis. Unfortunately, we were not able to provide this added level of detail in our model because of our lack of knowledge of the different projects, so it is our hope that the re-baselined model, with an R-squared of 0.65, can be improved upon with these complexity adjustments. Additionally, the error measurements on the media and alta projects can be reduced and brought into line with the rest of the jobs.
  • To summarize, the three recommendations we have come up with for improving the reliability and accuracy of the model are to baseline the model to the average programmer, re-define the programming language variable, and expand the complexity variable. With these changes, we are confident that GMD will have a much more useful model for predicting construction time for the various software engineers. At this point, we’d like to re-emphasize some of our findings from our research and from evaluating known industry standards. The data will help, but it will reach maximum impact when combined with other performance measurements, such as soft skills, and when it is in alignment with overall corporate goals. From here, I’ll hand it over to Sameer, who will recap our recommendations to this point and provide some insight into where to go from here. Thank you.
  • There are many commercially available tools that can help in tracking developer productivity. According to a University of California survey conducted in 2009, in most work environments developers spend an average of 11 minutes writing code at one time before they are distracted by an unrelated task. They remain on the new task for 25 minutes before returning to the previous task. This is one of the many reasons why tracking software is important: it pulls in data on all tasks performed, not just the primary coding work. There are open source options, options that link to related project management software, and options that run on tools from firms such as Google. We have noted the best three options here.
  • Benchmarking Software Estimation Methods
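The first two data recommendations in the notes above (baseline to the "average" programmers, and code each language as its own 0/1 variable) can be illustrated with a minimal Python sketch. It makes several assumptions: the job records and programmers #51 and #52 are hypothetical, hours are an invented unit, and a simple per-language mean of the baseline programmers' times stands in for the regression model described in the presentation. Only the baseline IDs for Java (#4, #9) and .NET (#19) come from slide 22.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical job records: (programmer_id, language, actual_hours).
JOBS = [
    (4, "Java", 10.0),
    (9, "Java", 12.0),
    (19, ".NET", 8.0),
    (51, "Java", 8.0),    # hypothetical non-baseline programmer
    (52, ".NET", 11.0),   # hypothetical non-baseline programmer
]

# Programmers designated as "average" (Java and .NET IDs from slide 22).
BASELINE_IDS = {4, 9, 19}

LANGUAGES = ["Java", ".NET"]


def one_hot(language):
    """Encode a language as 0/1 indicators, one per known language."""
    return [1 if language == known else 0 for known in LANGUAGES]


def baseline_hours(jobs, baseline_ids):
    """Mean actual hours per language, using baseline programmers only."""
    by_language = defaultdict(list)
    for programmer, language, hours in jobs:
        if programmer in baseline_ids:
            by_language[language].append(hours)
    return {language: mean(values) for language, values in by_language.items()}


def mean_residuals(jobs, baseline):
    """Average (actual - baseline) per programmer; negative means faster."""
    by_programmer = defaultdict(list)
    for programmer, language, hours in jobs:
        by_programmer[programmer].append(hours - baseline[language])
    return {p: mean(r) for p, r in by_programmer.items()}


baseline = baseline_hours(JOBS, BASELINE_IDS)
residuals = mean_residuals(JOBS, baseline)

for programmer, residual in sorted(residuals.items(), key=lambda item: item[1]):
    label = "faster than average" if residual < 0 else "at or slower than average"
    print(f"programmer #{programmer}: {residual:+.1f} h ({label})")
```

In this sketch a negative mean residual marks a programmer who repeatedly beats the baseline time, which is the comparison the notes describe; in practice the baseline would come from the fitted model rather than a plain mean, and quality would need to be judged separately.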

    1. 1. Benchmarking Software Estimation Methods Adam Boyd, Sameer Huque, Kristine Pachuta, Travis Pincoski, John Trumble July 11, 2013 1
    2. 2. • Product Director at Bloomberg LP • Former IT Sourcing Director at Bloomberg and Pricing Analyst at General Dynamics OUR TEAM • Sr. Education Associate at the American Chemical Society Kristine Adam • Mechanical Engineer at US Bureau of Engraving & Printing • Sr. Consultant at Deloitte Consulting LLP Travis • Senior Financial Planning Analyst at BAE Systems Sameer John 2
    3. 3. AGENDA 1 Review of Project Scope 2 Best Practices Research 3 Interview Results 4 Data Analysis and Benchmarking 5 Data Recommendations 6 Go-Forward Plan 3
    4. 4. OUR PROJECT SCOPE COVERED BEST PRACTICE RESEARCH AND DATA BENCHMARKING PROJECT SCOPE • Compile a list of software estimation methods, best practices and benchmarks used at competing firms in US • Analyze GMD data and summarize findings • Analyze third party data • Present data and suggestions to GMD project sponsors 4
    5. 5. OUR RECOMMENDATIONS ARE BASED ON THE RESULTS OF OUR RESEARCH AND ANALYSIS 1 Researched Industry Best Practices • Searched Industry Journals • Comprehensive News Searches • Established Summaries 2 Interviewed Subject Matter Experts • Interviewed experts across the industry • Used standardized questions • Established Summaries 3 Analyzed Available Data • Analyzed GMD provided data • Analyzed third party data • Established Equation and Recommendations 4 Formulated Recommendations • Compiled themes across all research areas • Formulated the data model • Delivered multiple options for recommendations
    6. 6. A ROADMAP OF INCREMENTAL IMPROVEMENTS WILL HELP YOU MEET YOUR GOALS Productivity Support timely and effective estimates, bug fixes and re-work Continually improve the consistency, reliability and productivity of your developers and the data and resources to track and estimate Build and deploy in-house estimating and productivity planning capability Support consistent and reliable project planning Estimation Consistency
    7. 7. AGENDA 1 Review of Project Scope 2 Best Practices Research 3 Interview Results 4 Data Analysis and Benchmarking 5 Data Recommendations 6 Go-Forward Plan 7
    8. 8. SEVERAL THEMES EMERGED IN OUR RESEARCH OF BEST PRACTICES Theme: The best measurements are quantitative and qualitative • Data should be refined and measured • Data analysis should be coupled with soft skills analysis • The best developers may not make the best leaders • Measures are best used in large distributed projects Theme: Some data is easy to game • Metrics like lines of code are easy to trick • Mentoring or leading projects may result in low metrics • Refactoring or documenting are often not accounted for Theme: Metrics can be de-motivating • Metrics should not be used punitively • Comparing across projects is difficult and possibly inaccurate • Project managers, not management, should control
    9. 9. DATA MODELS SHOULD HAVE FIVE KEY CHARACTERISTICS • Breadth: to handle multiple technologies • Depth: to look at all levels of the application • Explicit: enough to test against industry standards • Actionable: to show where improvements can be made • Automated: to not detract from the workload
    10. 10. TECHNICAL GOALS SHOULD ALIGN WITH YOUR CORPORATE AND STRATEGIC GOALS Technical Goals Software Quality Developer Efficiency Customer Satisfaction Strategic Goals Product Quality On Time Delivery Customer Satisfaction Corporate Goals Quality Efficiency Customer Satisfaction 10
    11. 11. AGENDA 1 Review of Project Scope 2 Best Practices Research 3 Interview Results 4 Data Analysis and Benchmarking 5 Data Recommendations 6 Go-Forward Plan 11
    12. 12. WE INTERVIEWED MANY INDUSTRY EXPERTS • Chief Information Officer, Philips IT • Head of Digital Content Development, AOL • Director, Thoughtworks Consulting • Chief Technologist, SWComplete • Co-Director, Security and Software Engineering Research Center, Georgetown University [Chart axes: Size; Business Sector (Consulting, Academia, Government, Corporate)]
    13. 13. THE EXPERTS HAVE STRONG OPINIONS ABOUT THE NECESSITY OF MEASURING DEVELOPER OUTPUT • “most formal measurement systems may de-motivate developers” • “identify the developer that other developers go to for advice” • “we don't base pay or increase on development metrics at all” • “a great developer will have vocal ideas about changing things AND make it happen” • “peer code review is key”
    14. 14. MOST EXPERTS STRESSED THE DIFFICULTY OF MEASUREMENT (1) Interviewee: Co-Director, Security and Software Engineering Research Center, Georgetown University. Key Takeaways: • lines of code per day delivered (debugged) best metric but backward looking • function point analysis allows interoperability but no consensus on translation • evaluations must account for code complexity Interviewee: Chief Technologist, SWComplete. Key Takeaways: • soft skills (communication, attention to detail) equally important to quality • peer evaluations critical component of developer evaluation • important not to penalize developers for taking time to exceed expectations
    15. 15. MOST EXPERTS STRESSED THE DIFFICULTY OF MEASUREMENT (2) Interviewee: Former Head, AOL Digital Content Development. Key Takeaways: • difficult to disaggregate contributions of individual team members • leadership important but hard to quantify • how to quantify extensibility of code? Interviewee: CIO, Philips IT. Key Takeaways: • look at teams not individuals • track what and how: both equally important Interviewee: Director, Thoughtworks Consulting. Key Takeaways: • poor developers will always be exposed regardless of measurement • estimations should be done with input from multiple developers
    16. 16. ONLY ONE OF THE COMPANIES WE INTERVIEWED MEASURES INDIVIDUAL DEVELOPER PRODUCTIVITY Measures Productivity Do Not Measure Productivity 16
    17. 17. AGENDA 1 Review of Project Scope 2 Best Practices Research 3 Interview Results 4 Data Analysis and Benchmarking 5 Data Recommendations 6 Go-Forward Plan 17
    18. 18. WE CONDUCTED AN ANALYSIS OF THE GMD PROVIDED DATA SET Approach 1 •Comparison of GMD data to data from ISBSG (International Software Benchmarking Standards Group) •Research to see what other metrics may be tracked Approach 2 •Analysis of GMD regression model and raw data Outcome Combined best practices found in research with experimental work done with GMD data to determine best recommendation for productivity measurements moving forward •Re-define existing variables •Experiment with new approaches 18
    19. 19. WE ALSO CONDUCTED AN ANALYSIS OF A THIRD PARTY DATASET GOAL: Benchmark GMD data to an independent data set •Sample data set obtained from ISBSG •ISBSG tracked roughly thirty different metrics in estimation •They tracked many similar variables •Task type (creation/modification) •Language and standards used •They also tracked a few items that may be helpful to GMD in the future •Job Size •Quantity of defects 19
    20. 20. SEVERAL EXPERIMENTS WERE RUN WITH THE GMD DATA TO DETERMINE THE CRITICAL PATH GOAL: Develop a predictive model with a high R-squared • Re-defined variables as categorical • Tested Java programming jobs only • Major improvement when baselining to average programmer • Eliminated jobs in which programmers had <6 months experience • Assigned different weighting to job difficulty • Grouped programming languages by type
    21. 21. SUGGESTION: BASELINING TO THE AVERAGE SLOWER THAN AVERAGE FASTER THAN AVERAGE Defining the average allows for better productivity evaluation - Better performers will be faster than the average time assuming acceptable quality 21
    22. 22. THIS RESULTED IN THE NEW MODEL • Seven programmers used to define the average • Java: #4 & #9 • .NET: #19 • LN4: #27 • COBOL: #40 • RPG: #45 • JavaScript: #79 • Additional programmers added to GMD list to create a broad set of data • Programming languages categorized as 0 or 1 and each language given its own variable 22
    23. 23. THE NEW MODEL
        INPUT VARIABLES                   COEFFICIENT   P-VALUE
        Constant term                          -4.349     0.000
        Complexity                              1.492     0.000
        RPG                                     3.177     0.000
        COBOL                                   1.007     0.287
        LN4                                     2.476     0.000
        Java                                    2.867     0.000
        .NET                                    0.490     0.674
        JavaScript                              2.860     0.001
        SQL                                     0.000       N/A
        Development?                           -0.123     0.896
        Creation?                               0.607     0.007
        Developing Experience (Months)         -0.004     0.917
        Knowledge of the Business               0.043     0.806
        Programming Language Domain            -0.068     0.046
        Technology/Framework Domain             0.029     0.018
        Knowledge of Tool (Software?)           0.083     0.011
        Residual df: 264 • Multiple R-squared: 0.65 • Std. Dev. estimate: 0.89 • Residual SS: 208.50
    24. 24. RMS ERRORS INCREASE WITH COMPLEXITY 24
    25. 25. COBOL HAS THE HIGHEST RMS ERROR, LN4 AND JAVA HAVE THE LOWEST 25
    26. 26. AGENDA 1 Review of Project Scope 2 Best Practices Research 3 Interview Results 4 Data Analysis and Benchmarking 5 Data Recommendations 6 Go-Forward Plan 26
    27. 27. THREE KEY DATA RECOMMENDATIONS 1 Baseline the model to the average programmer to establish relative performance 2 Re-define programming language variable to establish proper context 3 Expand upon the current complexity variable to reduce error measurements
    28. 28. 1ST RECOMMENDATION: BASELINE THE MODEL TO THE AVERAGE • Account for inherent talent • Distinguish great performance from poor performance • Adjust as necessary to maintain the baseline average
    29. 29. 2ND RECOMMENDATION: RE-DEFINE PROGRAMMING LANGUAGE VARIABLE • Current scale for programming language has no predictive relationship • Group by language type or on a difficulty scale • If neither is used, treat as binary
    30. 30. 3RD RECOMMENDATION: EXPAND THE COMPLEXITY VARIABLE • Current error measurement grows with project difficulty • Change variable to represent more variation in difficulty • Add additional break points as needed for accuracy
    31. 31. MOVING BEYOND THE DATA CREATES A HIGH PERFORMANCE CULTURE Aligns Data Soft Skills High Performance Culture 31
    32. 32. AGENDA 1 Review of Project Scope 2 Best Practices Research 3 Interview Results 4 Data Analysis and Benchmarking 5 Data Recommendations 6 Go-Forward Plan 32
    33. 33. OUTLINE OF HOW TO ACHIEVE THESE GOALS Dedicated Estimation Program Office (EPO) Maintenance of existing and development of new resource estimating models and tools by the EPO Next Steps Productivity • Standard productbased CES tailored for different offices in the Department Resource estimating models and tools • Reusable resource catalogs Continual collection of actuals to assist with refinement of estimating models and tools Standard cost estimating quality assessment Estimation Consistency Adopt commercially available tools Consider Agile
    34. 34. OTHER OPTIONS: STRATEGIC APPLICATION OF AGILE Business Issue Deployment Strategy Project Approach Organizational Culture Determine Business Issue Examine Organizational Culture Assess Deployment Strategy Tailor Project Approach Determine if Agile is a good fit for the needs of the business Examine the beliefs and values articulated by members of the organization Assess the organization to ensure business issue is addressed and organization is minimally disrupted Tailor relevant pieces of the methodology to confirm business issue is fully supported by project team
    35. 35. OTHER OPTIONS: EFFECTIVE ESTIMATION CAPABILITY In addition to organizational commitment, executive sponsor support, and adequate resources, the following is a partial list of critical success factors in building a sustainable resource analysis capability: Success Factor Description Dedicated Program, Team, or Support Expert stewards of the processs and coaches who maintain momnetum and quality across the organization. Product-oriented Cost Element Structure A well understood CES that communicates to the organization what the investment will include. An itemized “invoice” for the investment. Resource Catalogs Predefined reusable increments of scope that have been previously socialized and endorsed/approved. Enables quick, consistent, and defensible definition of a cost/resource estimate. Resource Estimating Models and Tools Standard models and tools which support and enforce the estimating process. Software Sizing Methodology Software costs are a function of the volume of software to be developed. Measuring the volume is critical for all software resource estimation. Quality Assurance / Validation of Investment Proposal QA checklist help ensure thoroughness. Assessments provide maturity measure of process and program capability.
    36. 36. OTHER OPTIONS: DEVELOP RESOURCE CATALOGS Resource Catalogs include reusable size increments • A resource catalog is a small portion of scope for an IT project • A resource catalog will include reusable size increments that will reflect cost, effort, staffing, schedule, labor rates, labor types, and risk • Size increments include both non-recurring and recurring costs, and are defined logically or in “T-shirt sizes” • Resource catalogs enable: • Quick, consistent, repeatable, and defensible development of lifecycle estimates • Tracking and trend analysis of size increments, enabling better understanding of efficiency gains • Structure for continual estimation process improvement
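As a rough illustration of the resource catalog idea on this slide, the following is a minimal Python sketch, not the method used in this study. The catalog name, the T-shirt size labels, and every number in it are hypothetical placeholders chosen only to show how reusable size increments with non-recurring and recurring effort could roll up into a lifecycle estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizeIncrement:
    """One reusable size increment. All values below are hypothetical."""
    label: str                        # "T-shirt size" such as S, M, or L
    nonrecurring_hours: float         # one-time development effort
    recurring_hours_per_year: float   # ongoing support effort per year
    labor_rate: float                 # cost per labor hour

    def lifecycle_cost(self, years: int) -> float:
        """Non-recurring plus recurring effort over `years`, priced at the labor rate."""
        hours = self.nonrecurring_hours + self.recurring_hours_per_year * years
        return hours * self.labor_rate


# Hypothetical catalog for one small portion of IT project scope.
REPORT_CATALOG = {
    "S": SizeIncrement("S", 100, 10, 100.0),
    "M": SizeIncrement("M", 300, 30, 100.0),
    "L": SizeIncrement("L", 800, 80, 100.0),
}


def estimate(scope, years):
    """Sum lifecycle costs for a list of (catalog, size_label) pairs."""
    return sum(catalog[size].lifecycle_cost(years) for catalog, size in scope)


if __name__ == "__main__":
    scope = [(REPORT_CATALOG, "S"), (REPORT_CATALOG, "M")]
    print(estimate(scope, years=3))
```

Because every estimate is assembled from the same predefined increments, two analysts sizing the same scope should arrive at the same number, which is the consistency and defensibility benefit the slide describes.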
    37. 37. OTHER OPTIONS: OVERVIEW OF 3RD PARTY SOFTWARE OPTIONS Options for the best software tools to track productivity • According to a 2009 University of California study, developers spend an average of 11 minutes on a task before they are distracted by a separate task • It takes them 25 minutes to get back to the original task • Mylyn: open source option • Tasktop and Tasktop Pro: integrate with IBM Rational • Cubeon: runs on Google Code
    38. 38. TAKING THESE STEPS WILL RESULT IN SOUND MEASUREMENTS AND ESTIMATIONS 1 Collect more data • Start collecting project actuals • Provide standard data collection templates to enable data analysis • Manage projects against resource estimates 2 Create a resource library • Quick, consistent, repeatable, and defensible development of resource estimates • Tracking and trend analysis of size increments • Structure for continual estimation process improvement 3 Consider alternatives • Third-party software • Cost-oriented product structure • Adoption of Agile development methods
    39. 39. Thank you!
    40. 40. Appendix
    41. 41. WE REVIEWED RESEARCH FROM MULTIPLE SOURCES Sample Works Reviewed: • Measuring Developers: Aligning Perspectives and Other Best Practices, by Medha Umarji and Forrest Shull, published in IEEE Software in 2009 • The Futility of Developer Productivity Metrics, by Neil McAllister, November 17, 2011, http://www.infoworld.com/d/application-development/the-futility-developerproductivity-metrics-179244 • Five Requirements for Measuring Application Quality, by Jitendra Subramanyam, director of research, CAST Inc., June 17, 2011, http://www.networkworld.com/news/tech/2011/061611-application-quality.html • Establishing a Measurement Program, whitepaper by the Construx staff
    42. 42. OTHER OPTIONS: STRATEGIC APPLICATION OF AGILE Business Issue, Organizational Culture, Deployment Strategy, Project Approach • Determine Business Issue: Determine if Agile is a good fit for the needs of the business • Examine Organizational Culture: Examine the beliefs and values articulated by members of the organization • Assess Deployment Strategy: Assess the organization to ensure the business issue is addressed and the organization is minimally disrupted • Tailor Project Approach: Tailor relevant pieces of the methodology to confirm the business issue is fully supported by the project team • Project charter • List of key dependencies • Stakeholder interviews • Stakeholder survey & questionnaire results • Deployment strategy plan • Project processes • Project team charter • Project team values Is there high market uncertainty or need for customer involvement? Is the environment rapidly changing? What are the values of the organization? Is the organizational culture conducive to Agile adoption? Who are the key stakeholders? What strategy fits the business need? What strategy is supportable by the organization? What is required to deliver the project? What is the commitment of team members to each other?
    43. 43. SEVERAL EXPERIMENTS WERE RUN WITH THE GMD DATA • Different methods were attempted to achieve a higher R² for the predictive model • Re-defined variables as categorical • Assigned different weighting to job difficulty • Tested Java programming jobs only • Grouped programming languages by type • Eliminated jobs in which programmers had <6 months experience • High variability • Also experimented with data mining approaches other than multiple linear regression • Regression trees • Good results (lower RMS error), but not precise enough for GMD • Principal Components Analysis • Major improvement when baselining to the average programmer
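For readers unfamiliar with the two fit measures named on this slide, the sketch below shows the standard textbook definitions of R² and RMS error in plain Python. The slides do not say how these values were computed for the GMD data, so this is an assumption about the usual formulas, and the sample numbers are made up for illustration only.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def rms_error(actual, predicted):
    """Root-mean-square error between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Made-up effort values (e.g. hours per job), not GMD data.
    actual = [2.0, 4.0, 6.0, 8.0]
    predicted = [3.0, 3.0, 7.0, 7.0]
    print(r_squared(actual, predicted), rms_error(actual, predicted))
```

A higher R² means the model explains more of the variation in the actuals, while a lower RMS error means predictions are closer to the actuals in the original units, which is why the two are reported side by side when comparing models such as linear regression and regression trees.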
    44. 44. OTHER OPTIONS: DEVELOP PRODUCT-ORIENTED COST ELEMENT STRUCTURE • Develop product-oriented cost element structure (CES) that will facilitate decision making • Utilize resource catalogs to populate individual cost elements • Standardize as many of the cost elements as possible to facilitate consistent cost estimation while allowing easy integration of project-specific custom cost elements • Facilitate repeatable and consistent cost reporting with CES standardization • Facilitate consistent and reliable reporting of cost information
