Are Estimates Based on Historical Data or Subject Matter Experts Better?


In this Trusted Advisor report, we seek to answer the question of whether estimates based on historical data or estimates from subject matter experts are better - and why.

Published in: Technology

Which are better: estimates based on historical data or estimates from subject matter experts?
©2014 David Consulting Group, February 2014

Scope of this Report

Whether estimates based on historical data are better than estimates from subject matter experts (SMEs) is a difficult question. We suggest that the question is a false dichotomy, because SMEs are themselves repositories of historical data (their memory, as good or bad as that might be, and assuming that they are ever informed about actuals). The question is really whether a tool-based estimate is better than an SME-based estimate.

Types of Estimates

It is easy to decide that there are only two categories of estimates: expert estimates (sometimes known as expertise-driven estimates) and model-based estimates. A more complete taxonomy was suggested by Boehm, Abts and Chulani in 2000. Their survey of estimation techniques developed a six-category framework:

  • Model-Based: SEER, SLIM (see Note 1), COCOMO
  • Learning-Oriented: Neural, Case-based (Analogy); models learn with more data
  • Dynamics-Based: Abdel-Hamid
  • Regression-Based: OLS, Robust
  • Composite: COCOMO II
  • Expertise-Based: Delphi, WBS

Note 1: SEER is a registered trademark of Galorath, Inc. SLIM is a registered trademark of Quantitative Software Management, Inc.

Only the expertise-based methods are not built on some form of model or mathematical network generated from collections of historical data. Boehm, Abts and Chulani suggested that expertise-based techniques are useful in the absence of quantified, empirical data. That is more a statement about experimental rigor than a claim that SME-based methods have no experiential data at their disposal.

History / Collaboration Comparison

We can view popular estimation techniques through two separate perspectives: data and algorithms. Figure 1, the Data / Collaboration Continuum, compares both perspectives. Looking at experiential data versus historical data, and at algorithmic versus collaborative techniques, we see that many of the experiential techniques use collaboration to combat perceived weaknesses. For example, one perceived problem with expert estimates is memory. Estimates based on memory are subject to the cognitive bias of the estimator, so involving others provides a balance that helps cancel the potentially negative impact of that bias. The "Scotty from Star Trek" syndrome is one such bias that creeps into expert estimates: the estimator estimates high knowing the figure will be corrected anyway, and the ultimate result may be a sensible figure.

Figure 1: Data / Collaboration Continuum

Each category of estimates described in the Boehm paper seeks to compensate for the perceived shortcomings of expert-based estimates and of other model-based techniques. Therefore, we would hypothesize that models built on historical data are superior to expert-based estimates.

History-Based Estimates v. Expert Estimates: The Results

Jorgensen defined expert estimation as estimation where “the estimation work is conducted by a person recognized as an expert on the task, and that a significant part of the estimation process is based on a non-explicit and non-recoverable reasoning process, i.e., ‘intuition’.” Expert estimation is by far the dominant form of estimation. Jorgensen noted that no study he found indicated that manual or automated estimation models were in use the majority of the time.

Of note, Jorgensen reviewed 15 peer-reviewed academic studies comparing expert estimates with estimates based on formal estimation models. The results were at best split: five indicated that expert estimation provided superior results, five showed that formal estimation models provided better results, and five showed no difference.
While it might be possible to poke holes in any of the studies, what is important is that there is no clear pattern in the evidence supporting the conclusion that formal estimation models outperform expert estimation. If the performance of expert estimation versus formal estimation models is a dead heat, then two further questions need to be asked:

  1. What are the attributes of an expert in the estimation context?
  2. Are calibrated estimation tools more or less effective than non-calibrated models?

[Figure 1 labels: Historical Data, Experiential Data, Algorithm Based, Collaborative; techniques shown: Tools, Analogy, Delphi, WBS, Planning Poker]

An expert is defined as a person who has a comprehensive and authoritative knowledge of or skill in a particular area. An expert estimator has to have knowledge in three separate macro areas:

  1. Technical: The technical architecture of the environment that the project is being done within will affect both the productivity rates and the tasks needed to deliver a project. For example, all things being equal, delivering the same functionality in Ruby on the Web versus in macro assembler on the mainframe will generally require less effort and less calendar time.
  2. Business: Understanding the business environment allows the estimator to interpret the requirements and, where the estimator lacks specific knowledge needed for the estimate, to identify who has it.
  3. Estimation: The process of estimation is not the same as making up a number. The person developing an estimate has to understand the process and have the skills required to generate an estimate. Those skills include the ability to collect and interpret information on the scope of the project, the capabilities of the organization and how the work is going to be done (different methods will have different effort profiles).

While expertise in each of these three areas is needed for effective expert estimates, it may not be sufficient. Dr. Ricardo Valerdi, in an interview with Tom Cagley (see SPAM Cast #84 in Sources below), stated that his research indicated that estimator optimism was a significant source of estimation error. A standard estimation process and historical data are a mechanism to correct for this type of error. Expertise causes a belief bias that generates optimism; the use of outside models and historical data is one mechanism to combat this form of cognitive bias.
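As a minimal sketch of how historical data might be used to correct for estimator optimism, the Python below scales a new expert estimate by the median ratio of actual to estimated effort on past projects. The function names, the choice of the median, and the sample figures are all assumptions made for illustration; none of them come from the report or from Valerdi's research.

```python
from statistics import median


def optimism_factor(history):
    """Return the median actual/estimate ratio over past projects.

    `history` is a list of (estimated_effort, actual_effort) pairs.
    A factor above 1.0 suggests the estimator tends to be optimistic.
    The median is an assumed choice, used here to limit the pull of outliers.
    """
    if not history:
        raise ValueError("at least one completed project is required")
    ratios = []
    for estimated, actual in history:
        if estimated <= 0:
            raise ValueError("estimated effort must be positive")
        ratios.append(actual / estimated)
    return median(ratios)


def adjusted_estimate(expert_estimate, history):
    """Scale a new expert estimate by the estimator's historical factor."""
    return expert_estimate * optimism_factor(history)


# Hypothetical (estimated_hours, actual_hours) pairs, invented for illustration.
past_projects = [(100, 130), (200, 240), (80, 120), (400, 440), (150, 195)]

if __name__ == "__main__":
    factor = optimism_factor(past_projects)
    print(f"historical optimism factor: {factor:.2f}")
    print(f"adjusted estimate for a 300-hour guess: "
          f"{adjusted_estimate(300, past_projects):.0f} hours")
```

A correction like this only works if estimators are actually told the actuals for their past projects, which is the same precondition the report attaches to treating SME memory as historical data.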
Assuming an organization leverages expert estimates, it will need to find expertise in all three categories. Finding a single individual with all three is difficult, which is one of the reasons why many organizations use estimation teams to develop estimates. Estimation teams make sense where significant numbers of important estimates are performed. Where estimation teams don’t make sense, or where all of the expertise can’t be made available, historical project performance data and tools can be used to augment or replace much of the human expertise needed for an expert estimate.

Calibrated Models Compared to Un-calibrated Models

As noted earlier, the academic jury is out on whether estimates based on historical models are better than expert estimates. However, if finding or empaneling estimators with the proper level of expertise is difficult, then organizations will naturally turn to tools with historical data in scenarios where estimates are an important component of the project life cycle. Tools and historical data alone are not sufficient to deliver an effective estimate; calibration is needed to generate a good tool-based estimate. The calibration process mines project information to establish relationships between the project complexity and behavioral parameters and quantitative performance.

Logically, if the studies suggest that expert estimates and model-based estimates built on historical data provide equivalent results, and that calibrated model-based estimates are better than un-calibrated, out-of-the-box estimates, then it behooves organizations that use model-based estimates to collect the data needed to calibrate the tool. Interestingly, in DCG's experience over the years, the collection of the complexity and behavioral data required to build or calibrate statistical models tends to be the first place measurement programs economize. Failing to collect complexity and behavioral data significantly reduces the value of project quantitative performance data to estimation.

Conclusion

Historical data is used in both model-based and expert estimates. Estimating without memory of the past is not possible. The bigger issue is whether models derived from historical data are clearly superior to expert estimates. If you are trying to remove the need for expert estimators, the answer is unfortunately no. However, expert estimates require a level of expertise that is sometimes not readily available, which then requires leveraging tools that can be used to validate estimates. If estimates are important and the required level of expertise is not available, then the choice is far starker: estimates generated from models leveraging historical data in calibrated tools are the only logical choice.

Sources

  1. Boehm, B., Abts, C., and Chulani, S., (2000), “Software Development Cost Estimation Approaches – A Survey,” March 2000.
  2. Jorgensen, M., (2004), “A Review of Studies on Expert Estimation of Software Development Effort,” Journal of Systems & Software, Vol 70, No 1, pp. 37-60.
  3. Software Process and Measurement Cast 84.
  4. Kemerer, C. F., (1987), “An empirical validation of software cost estimation models,” Communications of the ACM, Vol 30, No 5, pp. 416-429.
  5. Stark, G., (2011), “A Comparison of Parametric Software Estimation Models Using Real Project Data,” (last accessed 2/24/14).