Software Effort Estimation
By:
Aneesa Ehsan 2021-SE-17
Ahtsham Ul Haq 2021-SE-25
“Any fool can know. The point is to understand.”
- Albert Einstein
Software Effort Estimation
Successful software projects are delivered on time, within budget, and with the required quality. This implies that targets are set which the project manager then tries to meet. However, these targets must be reasonable: project managers cannot be expected to achieve record levels of productivity from their teams if the initial estimates were incorrect. Realistic estimates are therefore crucial for software project success.
Problems with Over- and Under-estimating
• An over-estimate may cause the project to take longer than it otherwise would. This can be explained by the application of two 'laws':
  – Parkinson's Law: 'Work expands to fill the time available.' That is, given an easy target, staff will work less hard.
  – Brooks' Law: 'Putting more people on a late job makes it later.' That is, if there is an over-estimate of the effort required, this could lead to more staff being allocated than needed, and to managerial overheads being increased.
Where Are Estimates Done?
1. Strategic Planning: Project portfolio management involves estimating the costs and benefits of new applications to allocate priorities. These estimates may also influence staffing decisions.
2. Feasibility Study: This confirms that the benefits of the potential system will justify the costs.
3. System Specification: Estimates at the design stage will confirm that the feasibility study is still valid.
4. Supplier Proposals: Potential contractors produce estimates as the basis of their bids, which can be compared to in-house development costs.
Software Estimation Methods: Bottom-Up Estimating
1. Break Down Project: The estimator breaks the project into its component tasks, decomposing each task into subtasks until they are small enough for an individual to complete in a week or two.
2. Estimate Each Task: The effort for each activity is estimated and then summed to get an overall estimate (see the sketch after this list). This approach works best in the later, more detailed stages of project planning.
3. Procedural Code Approach: This involves envisioning the software modules, estimating the source lines of code for each, and accounting for complexity and technical difficulty to calculate the overall work effort.
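As a minimal sketch of the break-down-and-sum idea, the Python below uses hypothetical task names and effort figures; none of them come from the text.

```python
# Bottom-up estimating: decompose the project into small tasks,
# estimate each one individually, and sum the estimates.
task_effort_days = {
    "design input screens": 5,
    "design reports": 8,
    "code validation module": 10,
    "write user documentation": 4,
}

overall_estimate = sum(task_effort_days.values())
print(f"Bottom-up estimate: {overall_estimate} person-days")  # 27 person-days
```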
Software Estimation Methods: Top-Down Estimating
Parametric Models: These models relate project effort to variables associated with characteristics of the final system, using formulae like "effort = (system size) x (productivity rate)". Examples include COCOMO and function point analysis.
Productivity Factors: Top-down models often focus on assessing the amount of work (e.g. KLOC) and the productivity rate (e.g. days per KLOC) to derive the overall effort estimate.
Regression Analysis: Statistical techniques like least squares regression can be used to derive effort estimation equations from historical data on project size and effort.
Top-down estimation: Algorithmic/Parametric models
• COCOMO (lines of code) and function points are examples of these.
• The problem with COCOMO etc. is that in practice the input is a guess:
  guess -> algorithm -> estimate
  but what is desired is:
  system characteristic -> algorithm -> estimate
Parametric models: the need for historical data
• A simplistic model for an estimate:
  estimated effort = (system size) / productivity
  e.g. system size = lines of code; productivity = lines of code per day
• Productivity = (system size) / effort, based on past projects (see the sketch below).
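A minimal sketch of this model in Python; the size and effort figures are hypothetical, standing in for data from past projects.

```python
# Productivity is derived from a past project: productivity = size / effort.
past_size_loc = 12_000    # lines of code delivered on a past project
past_effort_days = 400    # effort (person-days) that project actually took

productivity_loc_per_day = past_size_loc / past_effort_days  # 30 LOC/day

# The new project's effort estimate is then size / productivity.
new_size_loc = 9_000      # estimated size of the new system
estimated_effort_days = new_size_loc / productivity_loc_per_day

print(f"Estimated effort: {estimated_effort_days:.0f} person-days")  # 300
```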
Function Point Analysis
External User Types: The basis of function point analysis is that information systems comprise five major components, or 'external user types', that are of benefit to users: external inputs, external outputs, external inquiries, logical internal files, and external interface files.
Complexity Weighting: Each component is classified as having high, average, or low complexity, and the counts are multiplied by specified weights to calculate an overall function point score indicating the information processing size.
Productivity Estimation: Function point analysis can be used to estimate productivity by relating the function point count to historical effort data, providing a top-down approach to software sizing and effort estimation.
Advantages: Function point analysis is language-independent and can be applied early in the development lifecycle, making it useful for feasibility studies and supplier proposal evaluations.
Software Estimation Methods: Function Point Analysis
The analyst identifies each instance of each external user type in the application. Each component is then classified as having either high, average or low complexity. The counts of each external user type in each complexity band are multiplied by specified weights (see Table 5.2) to get FP scores, which are summed to obtain an overall FP count, which indicates the information processing size.
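Table 5.2 is not reproduced here, so the sketch below uses the complexity weights usually quoted for Albrecht function points; treat the weights, and the hypothetical component counts, as illustrative assumptions.

```python
# Assumed standard Albrecht weights per external user type, for
# (low, average, high) complexity. Table 5.2 is not reproduced in
# the text, so these figures are an assumption.
WEIGHTS = {
    "external input":          {"low": 3, "average": 4,  "high": 6},
    "external output":         {"low": 4, "average": 5,  "high": 7},
    "external inquiry":        {"low": 3, "average": 4,  "high": 6},
    "logical internal file":   {"low": 7, "average": 10, "high": 15},
    "external interface file": {"low": 5, "average": 7,  "high": 10},
}

# Hypothetical counts: (external user type, complexity band, count).
counts = [
    ("external input", "low", 4),
    ("external output", "average", 3),
    ("logical internal file", "high", 2),
]

fp_count = sum(WEIGHTS[utype][band] * n for utype, band, n in counts)
print(f"Overall FP count: {fp_count}")  # 4*3 + 3*5 + 2*15 = 57
```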
Software Estimation Methods: Function Points Mark II
Mark II function points measure the size and complexity of an information system. The key steps involve calculating Unadjusted Function Points (UFPs) based on the number of input/output data elements and entity types referenced, then applying a Technical Complexity Adjustment (TCA) to account for additional implementation factors.
Calculating UFPs: The UFPs are calculated using a formula that weights the number of input data element types (Wi), entity types referenced (We), and output data element types (Wo). Weightings are applied to each component based on the relative effort required for inputs, data access, and outputs.
Technical Complexity Adjustment: The TCA recognizes that two systems with the same functionality may have different implementation complexities, such as additional security measures. Identifying further factors to suit local circumstances is encouraged to refine the effort estimation.
Function points Mk II: UFPs
• For each transaction, count:
  – data items input (Ni)
  – data items output (No)
  – entity types accessed (Ne)
[Figure: a transaction, with the number of input items flowing in, the number of entities accessed in the middle, and the number of output items flowing out.]
FP count (unadjusted function points) = Ni x 0.58 + Ne x 1.66 + No x 0.26
The Function Point Analysis technique is used to assess the functionality delivered by software, and a 'function point' is the unit of measurement.
Exercise
• A cash receipt transaction in the IOE maintenance accounts subsystem accesses two entity types: INVOICE and CASH-RECEIPT.
• The data inputs are: invoice number, date received, cash received.
• If an INVOICE record is not found for the invoice number, an error message is issued. If the invoice number is found, a CASH-RECEIPT record is created. The error message is the only output of the transaction. Calculate the unadjusted FP count for this transaction.
The unadjusted function points, using the industry average weightings, would therefore be:
(0.58 x 3) + (1.66 x 2) + (0.26 x 1) = 5.32
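The same calculation expressed as a small Python function, with the industry average weightings from the formula above:

```python
def mk2_ufp(n_inputs: int, n_entities: int, n_outputs: int) -> float:
    """Mark II unadjusted function points, industry average weightings."""
    return 0.58 * n_inputs + 1.66 * n_entities + 0.26 * n_outputs

# Cash receipt transaction: 3 inputs, 2 entity types, 1 output (error message).
print(mk2_ufp(n_inputs=3, n_entities=2, n_outputs=1))  # 5.32
```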
Software Estimation Methods: COCOMO II, a Parametric Productivity Model
COCOMO (Constructive Cost Model), developed by Barry Boehm, is a widely recognized family of software cost estimation models. The original COCOMO model was based on a study of 63 projects, of which only 7 were business systems. The basic model used an equation relating effort to system size, with constants adjusted to the project's technical nature and development environment. It distinguished three modes:
1. Organic Mode
2. Embedded Mode
3. Semi-Detached Mode
COCOMO II: A Parametric Productivity Model
• The term COCOMO (Constructive Cost Model) really refers to a group of models.
• It allows an organization to benchmark its software development productivity.
• Basic model: effort = c x size^k
  – effort is measured in pm, 'person-months', also called a man-month or staff-month (one month of effort by one person, consisting of 152 working hours)
  – c and k depend on the type of system: organic, semi-detached, embedded
  – size is measured in kloc, i.e. thousands of lines of code
The COCOMO constants

System type                               c     k
Organic (broadly, information systems)    2.4   1.05
Semi-detached                             3.0   1.12
Embedded (broadly, real-time)             3.6   1.20

k is an exponent ('to the power of...'): it adds disproportionately more effort to larger projects, taking account of the bigger management overheads.
COCOMO System types
• Organic mode: This would typically be the case when relatively small teams developed software in a highly familiar in-house environment and when the system being developed was small and the interface requirements were flexible.
• Embedded mode: This meant that the product being developed had to operate within very tight constraints and changes to the system were very costly.
• Semi-detached mode: This combined elements of the organic and the embedded modes or had characteristics that came between the two.
EXERCISE
• Problem: Assume that the size of an organic type software product has been estimated to be 32,000 lines of source code, and that the average salary of a software engineer is 15,000/- per month.
• Determine (see the worked sketch below):
  – the effort required to develop the software product
  – the cost required to develop the product
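A worked sketch of this exercise in Python, using the basic COCOMO constants from the table above. The assumption that total cost is simply effort in person-months multiplied by the monthly salary is mine.

```python
# Basic COCOMO: effort = c * size^k, with size in KLOC.
COCOMO_CONSTANTS = {
    "organic":       (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded":      (3.6, 1.20),
}

size_kloc = 32             # 32,000 lines of source code
salary_per_month = 15_000  # average engineer salary per month

c, k = COCOMO_CONSTANTS["organic"]
effort_pm = c * size_kloc ** k        # person-months of effort
cost = effort_pm * salary_per_month   # assumed: cost = effort x monthly salary

print(f"Effort: {effort_pm:.0f} person-months")  # ~91 person-months
print(f"Cost: {cost:,.0f}")                      # ~1,370,000
```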
COCOMO II
• The following model can then be used to calculate an estimate of person-months: pm = 2.94 x (size)^sf, with size in kloc.
• Each of the scale factors for a project is rated according to a range of judgements: very low, low, nominal, high, very high, extra high.
• There is a number related to each rating of the individual scale factors (see Table 5.5).
• These are summed, then multiplied by 0.01 and added to the constant (B = 0.91) to get the overall exponent scale factor: sf = B + 0.01 x (sum of the scale factor values).
COCOMO II scale factor (SF) values
• Precedentedness (PREC) This quality is the degree to which there are precedents or similar
past cases for the current project.
• Development flexibility (FLEX) This reflects the number of different ways there are of
meeting the requirements.
• Architecture/risk resolution (RESL) This reflects the degree of uncertainty about the
requirements.
• Team cohesion (TEAM) This reflects the degree to which there is a large dispersed team
(perhaps in several countries) as opposed to there being a small tightly knit team.
• Process maturity (PMAT) The more structured and organized the way the software is
produced, the lower the uncertainty.
COCOMO II EXERCISE
• A new project has 'average' novelty for the software supplier that is going to execute it, and is thus given a nominal rating for precedentedness. Development flexibility is high, but requirements may change radically, so the risk resolution exponent is rated very low. The development team are all located in the same office, which leads to team cohesion being rated very high, but the software house as a whole tends to be very informal in its standards and procedures, so the process maturity driver has been given a rating of 'low'.
• (i) What would be the scale factor (sf) in this case?
• (ii) What would the estimate of effort be if the size of the application was estimated at around 2,000 lines of code?
COCOMO II EXERCISE solution
• (i) The overall scale factor would be:
  sf = 0.91 + 0.01 x (3.72 + 2.03 + 7.07 + 1.10 + 6.24) = 1.112
• (ii) The estimated effort (see the sketch below) would be:
  2.94 x (size)^sf = 2.94 x (2)^1.112 = 6.35 staff-months (pm)
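The same calculation as a Python sketch. The five scale-factor values are taken from the solution above (Table 5.5 itself is not reproduced in the text); A = 2.94 and B = 0.91 as given.

```python
# COCOMO II: pm = A * size^sf, where sf = B + 0.01 * (sum of scale factors).
A, B = 2.94, 0.91

scale_factor_values = {
    "PREC (nominal)":   3.72,
    "FLEX (high)":      2.03,
    "RESL (very low)":  7.07,
    "TEAM (very high)": 1.10,
    "PMAT (low)":       6.24,
}

sf = B + 0.01 * sum(scale_factor_values.values())
size_kloc = 2  # 2,000 lines of code
effort_pm = A * size_kloc ** sf

print(f"sf = {sf:.3f}")               # 1.112
print(f"effort = {effort_pm:.2f} pm") # ~6.35 person-months
```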
Software Estimation Methods: Estimating by Analogy
Estimation by analogy, also known as case-based reasoning, is a technique that leverages information about completed projects to estimate the effort for a new project. The estimator identifies past projects (source cases) with similar characteristics to the new project (the target case), and then adjusts the recorded effort from the source case to produce an estimate for the target.
Identifying Similarities: The key challenge is accurately identifying the similarities and differences between the target project and the available source cases, especially when dealing with a large number of past projects.
Automated Assistance: Tools like ANGEL have been developed to automate the process of selecting the most relevant source case by measuring the Euclidean distance between the target and source project parameters.
Advantages and Limitations: Estimation by analogy can be effective when limited historical data is available, but it requires careful analysis to ensure the source cases are truly representative of the target project.
Estimating by analogy
• This is also called case-based reasoning.
• The estimator identifies completed projects (source cases) with similar
characteristics to the new project (the target case).
• The effort recorded for the matching source case is then used as a base
estimate for the target.
• The estimator then identifies differences between the target and the source
and adjusts the base estimate to produce an estimate for the new project.
Estimating by analogy
• A problem is identifying the similarities and differences between applications when you have a large number of past projects to analyse.
• One attempt to automate this selection process is the ANGEL software tool.
• This identifies the source case that is nearest the target by measuring the Euclidean distance between cases.
• The Euclidean distance is calculated as:
  distance = sqrt((target_parameter1 - source_parameter1)^2 + ... + (target_parameterN - source_parameterN)^2)
EXERCISE
• Say that the cases are being matched on the basis of two parameters: the number of inputs to, and the number of outputs from, the application to be built. The new project is known to require 7 inputs and 15 outputs. One of the past cases, project A, has 8 inputs and 17 outputs.
• Calculate the Euclidean distance between the source and the target (see the sketch below).
• Project B has 5 inputs and 10 outputs. What would be the Euclidean distance between this project and the target new project being considered above?
• Is project B a better analogy with the target than project A?
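A sketch of the distance calculations for this exercise; the input/output counts come from the exercise text.

```python
from math import sqrt

def euclidean_distance(target, source):
    """Euclidean distance over the matched parameters."""
    return sqrt(sum((t - s) ** 2 for t, s in zip(target, source)))

target    = (7, 15)   # (inputs, outputs) of the new project
project_a = (8, 17)
project_b = (5, 10)

print(f"A: {euclidean_distance(target, project_a):.3f}")  # 2.236
print(f"B: {euclidean_distance(target, project_b):.3f}")  # 5.385
# Project A is the better analogy: its distance to the target is smaller.
```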
Estimating by analogy
[Figure: a set of source cases, each with attribute values and a recorded effort, alongside the target case, whose attribute values are known but whose effort is not. Select the source case with the closest attribute values, and use the effort from that source as the estimate for the target.]
Stages: identify
• significant features of the current project
• previous project(s) with similar features
• differences between the current and previous projects
• possible reasons for error (risk)
• measures to reduce uncertainty
Machine assistance for source selection (ANGEL)
[Figure: a plot of number of inputs against number of outputs, showing the target project and two candidate source projects, Source A and Source B; the separations It - Is and Ot - Os form the two sides of the distance calculation.]
Euclidean distance = sqrt((It - Is)^2 + (Ot - Os)^2)
Some conclusions: how to review estimates
Ask the following questions about an estimate
• What are the task size drivers?
• What productivity rates have been used?
• Is there an example of a previous project of
about the same size?
• Are there examples where the productivity rates used have actually been found?

Editor's Notes

  • #19 Effort estimation. Size: 32,000 lines of code = 32 KLOC (KLOC stands for Kilo Lines Of Code, i.e. 1,000 lines). Basic COCOMO formula for organic projects: effort (person-months) = 2.4 x (KLOC)^1.05. Calculation: effort = 2.4 x 32^1.05 ≈ 91 person-months. At 15,000/- per engineer-month, cost ≈ 91 x 15,000 ≈ 1,365,000/-.
  • #27 Distance between target and project A: Euclidean distance formula sqrt((x1 - x2)^2 + (y1 - y2)^2). Calculation: sqrt((7 - 8)^2 + (15 - 17)^2) ≈ 2.236. Distance between target and project B: sqrt((7 - 5)^2 + (15 - 10)^2) ≈ 5.385. Project A, with the smaller distance, is the better analogy.