Leiden Institute of Advanced Computer Science




            System's Development and Project Management –
            Software effort estimation


            Prof. Dr. Thomas Bäck




Dates

Feb. 1               14:45 – 17:30           Introduction, Project Description
Feb. 2               13:45 – 16:30           STEP WISE Approach to Project Planning
Feb. 9               13:10 – 15:45           Selecting an Appropriate Software Development Approach
Feb. 15              14:45 – 17:30           Activity Planning and Resource Allocation
Feb. 16              15:15 – 18:00           Software Effort Estimation
Feb. 22              14:45 – 17:30           Risk management, project escalation
Feb. 23              13:45 – 16:30           Project monitoring and control
Mar. 1               14:45 – 17:00           Exam
Mar. 2               13:45 – 16:30           Software Quality Assurance
Mar. 8               14:45 – 17:30           Managing People; Contract Management
Mar. 9               13:45 – 16:30           Various
Mar. 15              14:45 – 17:30           Trade Fair


STEP WISE overview
  0. Select project
  1. Identify project objectives
  2. Identify project infrastructure
  3. Analyze project characteristics
  4. Identify products and activities
  5. Estimate effort for activity
  6. Identify activity risks
  7. Allocate resources
  8. Review / publicize plan
  9. Execute plan
  10. Lower level planning

  Annotations in the diagram: steps 5 and 6 are carried out for each activity;
  a "Review lower level detail" loop leads back to step 4.



What makes a successful project?

  Delivering:
  •   Agreed functionality
  •   On time
  •   At the agreed cost
  •   With the required quality

  Stages:
  1. Set targets
  2. Attempt to achieve targets

  But what if the targets are not achievable?


Software effort estimation:
Notoriously difficult …
  •       Complexity and invisibility of software
  •       Intensely human activities of system development
  •       Cannot be treated in a purely mechanistic way
  •       Novel applications of software
  •       Changing technology
  •       Lack of homogeneity of project experience




The Cone of Uncertainty
[Figure: range of estimates (vertical axis: 0.25x, 0.5x, 0.67x, 0.8x, 1.0x, 1.25x, 1.5x, 2x, 4x)
narrowing over time (horizontal axis) across the milestones: Initial concept,
Approved product definition, Requirements complete, User Interface Design complete,
Detailed Design complete, Software complete.]






TAXONOMY OF METHODS


Over and under-estimating
   •   Parkinson's Law: Work expands to fill the time
       available
   •   An over-estimate is likely to cause the project to take
       longer than it would otherwise
   •   Brooks' Law: Putting more people on a late job
       makes it later.
   •   Weinberg's Zeroth Law of reliability: a software
       project that does not have to meet a reliability
       requirement can meet any other requirement
   •   If you don't care about quality, you can meet any
       other requirement
Effort estimation: Taxonomy

  Effort estimation methods
  •   Algorithmic/Parametric
     •    Function Points
        •    Data Points
        •    Object Points
        •    Web Points
        •    Use Case Points
     •    COCOMO
        •    COCOMO II
  •   Empirical
     •    Price-to-win
     •    Parkinson
     •    Analogy-based
     •    Percentage-based
     •    Expert-estimation
        •    Delphi
        •    PERT
        •    Single person estimate

  Why do we need them?
  •    Complexity and invisibility of software
  •    Intensely human activities of system development
  •    Cannot be treated in a purely mechanistic way
  •    Novel applications of software
  •    Changing technology

  Methods used in practice (Heemstra & Kusters):
  Expert-estimation       25.5%
  Analogy                 60.8%
  "Capacity problem"      20.8%
  Price-to-win            8.9%
  Parametric models       13.7%
  •    Only 50% kept project data on past projects, but 60.8% used analogy!
  •    35% did not produce estimates
  •    62% used methods based on intuition; only 16% used formalized methods
  •    Function point users produced worse estimates!

Overview
Method              Complexity          Pros                                   Cons
Function Points     High                Standardized, transparent              Requires a data foundation, partially subjective
COCOMO II           Average to high     Good for a first rough estimate,       System size estimate required
                                        tool support available
Expert-estimation   Small to average    Quick, flexible                        Subjective, lacks transparency
Analogy-based       Small to average    Quick for similar projects             Not applicable to new projects, data foundation required
Percentage-based    Small               Applicable early on, after the         Requires a data foundation, variance for new projects
                                        initial phase

A taxonomy of estimating methods

  •   Expert opinion - just guessing?
  •   Bottom-up: activity based
     •    Components are identified and sized
     •    Estimates are aggregated
     •    More appropriate at later, more detailed stages of
          project planning
  •   Parametric: e.g. function points
     •    Use effort drivers representing characteristics of the
          target system and the implementation environment to
          predict effort

A taxonomy of estimating methods (cont'd)
  •   Analogy
     •    A similar, completed project is identified, and its actual
          effort is used as the basis for the new project
  •   Artificial neural networks - a view of the future?
  •   Parkinson: based on staff-effort available (not recommended)
     •    Cf. Parkinson's law: use staff effort available
  •   Price to win: a figure sufficiently low to win the
      contract (not recommended)
     •    Estimate what the client's budget is
Top-down vs. bottom-up

  •   Top-down (usually parametric models)
    •    Produce overall estimate based on project cost
         drivers
    •    Based on past project data
    •    Normally formulas such as
         effort = system size x productivity rate
         (productivity rate in hours per KLOC)
    •    FP focuses on system size
    •    COCOMO focuses on productivity factors
  •   Bottom-up
    •    Use when no past project data is available
Top-down estimating

  •   Produce overall estimate using effort driver(s)
  •   Distribute proportions of overall estimate to components

  Example: overall project estimate of 100 days
  Component    Share    Effort
  Design       30%      30 days
  Code         30%      30 days
  Test         40%      40 days
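The distribution step above can be sketched in a few lines of Python. This is only an illustration: the function name `distribute_top_down` is an assumption, and the shares are the ones from the example (30% / 30% / 40% of 100 days).

```python
def distribute_top_down(total_days, shares):
    """Split an overall estimate across components by percentage share."""
    if abs(sum(shares.values()) - 100) > 1e-9:
        raise ValueError("shares must add up to 100%")
    return {name: total_days * pct / 100 for name, pct in shares.items()}


# Overall estimate of 100 days, split as on the slide.
breakdown = distribute_top_down(100, {"Design": 30, "Code": 30, "Test": 40})
print(breakdown)  # {'Design': 30.0, 'Code': 30.0, 'Test': 40.0}
```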




Bottom-up estimating

  1. Break project into smaller and smaller
    components
  2. [Stop when you get to what one person can
    do in one/two weeks]
  3. Estimate costs for the lowest level activities
  4. At each higher level calculate estimate by
    adding estimates for lower levels
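The aggregation in steps 3 and 4 amounts to summing a tree. A minimal sketch, assuming a work breakdown held as nested dictionaries; the project structure and the person-day figures are hypothetical:

```python
def bottom_up(node):
    """Return the estimate (in days) for a work-breakdown node.

    A leaf is a number (the estimate for a lowest-level activity);
    a composite node is a dict mapping names to child nodes.
    """
    if isinstance(node, dict):
        return sum(bottom_up(child) for child in node.values())
    return node


# Hypothetical work breakdown; leaf values are illustrative person-days.
project = {
    "Design": {"Data model": 5, "Screens": 8},
    "Code": {"Module A": 10, "Module B": 7},
    "Test": 6,
}
total = bottom_up(project)
print(total)  # 36
```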


Empirical Methods

  Price-to-win
  •  Figure sufficiently low to win contract

  Parkinson
  •  Based on staff-effort available
  •  Parkinson's Law: "Work expands to fill the time available"

  Analogy-based
  •  A similar, completed project is identified, and its actual effort is
     used as the basis for the new project

  Percentage-based
  •  Top-down approach, based on overall effort estimate
  •  Distribute percentages of effort to components in work
     breakdown structure

  Expert-estimation
  •  Delphi: Multiple estimates
  •  Single expert estimate
  •  PERT: Schedule risk estimation approach

Algorithmic/parametric methods

  Function points
  •  Focus on system size/functionality

  COCOMO
  •  Focus on productivity factors

  •   Parametric = top-down
    •    Produce overall estimate based on project cost
         drivers
    •    Based on past project data
    •    Normally formulas such as
         effort = system size / productivity rate (KLOC/h)




PARAMETRIC MODELS


Parametric Models

  •   Simplistic model:
     estimated effort = (system size) / productivity rate

  •   E.g.:
     •    System size = lines of code (KLOC)
     •    Productivity rate = KLOC/day
  •   How to derive productivity rate?
      productivity rate = (system size) / effort
     •    Based on data from past projects
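As a rough sketch of the two formulas above: derive a productivity rate from past projects, then apply it to a new system size. The past-project figures and the function names are hypothetical, and pooling all past projects into one rate is just one possible choice.

```python
# Hypothetical past projects: (size in KLOC, effort in days).
past_projects = [(10.0, 200.0), (25.0, 500.0), (5.0, 100.0)]


def productivity_rate(projects):
    """productivity rate = (system size) / effort, pooled over past projects."""
    total_size = sum(size for size, _ in projects)
    total_effort = sum(effort for _, effort in projects)
    return total_size / total_effort  # KLOC per day


def estimate_effort(size_kloc, rate):
    """estimated effort = (system size) / productivity rate."""
    return size_kloc / rate  # days


rate = productivity_rate(past_projects)   # 40 KLOC / 800 days
effort = estimate_effort(8.0, rate)       # effort in days for an 8 KLOC system
print(round(rate, 3), round(effort, 1))
```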

Basic approach

  Function points
  •  Used to estimate system size (SLOC, source lines of code)
  •  Number of I/O transaction types, number of file types -> Model -> System size

  COCOMO
  •  Focus on productivity
  •  Lines of code as input
  •  System size, productivity factors -> Model -> Effort

Parametric models

  •   COCOMO (lines of code) and function points are
      examples of these
  •   Problem with COCOMO etc:

             Guess -> Algorithm -> Estimate

       but what is desired is…

             System characteristic -> Algorithm -> Estimate

Parametric models (cont'd)

  •   Examples of system characteristics
    •    No. of screens x 4 hours
    •    No. of reports x 2 days
    •    No. of entity types x 2 days
  •   The quantitative relationship between the
      input and output products of a process can be
      used as the basis of a parametric model
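The example characteristics above can be turned into a tiny parametric model. A sketch only: the counts are hypothetical, and the 8-hour working day used to convert days to hours is an assumption, not something the slide states.

```python
HOURS_PER_DAY = 8  # assumption: length of a working day

# Effort per unit, in hours, from the examples above.
EFFORT_PER_UNIT_HOURS = {
    "screens": 4,                       # 4 hours per screen
    "reports": 2 * HOURS_PER_DAY,       # 2 days per report
    "entity_types": 2 * HOURS_PER_DAY,  # 2 days per entity type
}


def estimate_hours(counts):
    """Sum (count x effort per unit) over all system characteristics."""
    return sum(EFFORT_PER_UNIT_HOURS[name] * n for name, n in counts.items())


# Hypothetical system: 10 screens, 5 reports, 8 entity types.
hours = estimate_hours({"screens": 10, "reports": 5, "entity_types": 8})
print(hours)  # 248
```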



Parametric models (cont'd)

  •   Simplistic model for an estimate
      estimated effort = (system size) / productivity rate
  •   E.g.
     •    System size = lines of code
     •    Productivity = lines of code per day
  •   Productivity rate = (system size) / effort (e.g. KLOC per hour)
     •    Based on past projects





Productivity Focus:

COCOMO


COCOMO: Constructive Cost Model

  •   Based on industry productivity standards -
      database is constantly updated
  •   Allows an organization to benchmark its
      software development productivity
  •   Basic equation: $\text{effort}_{nom} = m \cdot \text{size}^{n}$
     •    Effort in person-months
     •    Size in thousands of delivered source code instructions
          (kdsi)
  •   Refers to a group of models

COCOMO: Constructive Cost Model

  •   System types:
     •    Organic (small teams, in-house software
          development, small and flexible systems)
     •    Embedded (tight product operation constraints; costly
          changes)
     •    Semi-detached (combines elements of the above;
          intermediate)




COCOMO: Example

 •   System types:
    •    Organic:                           m=2.4, n=1.05, o=2.5, p=0.38
    •    Semi-detached:                     m=3.0, n=1.12, o=2.5, p=0.35
    •    Embedded:                          m=3.6, n=1.20, o=2.5, p=0.32
 •   $\text{Effort} = m \cdot \text{size}^{n}$ (size in kdsi)
 •   $\text{Time} = o \cdot \text{Effort}^{p} = o \cdot (m \cdot \text{size}^{n})^{p}$
 •   Example: Organic project, 1,500 dsi (size = 1.5 kdsi)
    •    $\text{Effort} = 2.4 \cdot 1.5^{1.05} = 3.67$ person-months
    •    $\text{Time} = 2.5 \cdot 3.67^{0.38} = 4.1$ months
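The example can be checked with a short script. This is a sketch of the basic equations only, using the coefficients listed above; the function and table names are assumptions made for illustration.

```python
# (m, n, o, p) per system type, as listed on the slide.
COCOMO_MODES = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(size_kdsi, mode):
    """Return (effort in person-months, time in months)."""
    m, n, o, p = COCOMO_MODES[mode]
    effort = m * size_kdsi ** n
    time = o * effort ** p
    return effort, time


# Organic project of 1,500 dsi = 1.5 kdsi.
effort, time = basic_cocomo(1.5, "organic")
print(f"{effort:.2f} person-months, {time:.1f} months")  # 3.67 person-months, 4.1 months
```

Note that the exponent n is greater than 1 for every system type, so effort grows faster than size.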

COCOMO (cont'd)
 •   Intermediate version of COCOMO incorporates 15
     cost drivers:
    •     Product attributes:
            •  Required software reliability, database size used, product
               complexity.
    •     Computer attributes:
            •  Execution time constraint, main storage constraint, virtual
               machine volatility, computer turnaround time.
    •     Personnel attributes:
            •  Analyst capability, application experience, programmer
               capability, virtual machine experience, language experience.
    •     Project factors:
            •  Modern programming practice, software tools, development
               schedule.

COCOMO (cont'd)

 •   Complementary equation:
         $\text{Effort}_{est} = \text{Effort}_{nom} \cdot dem_1 \cdot \ldots \cdot dem_{15}$
     ($dem_i$: development effort multiplier)
 •   $\text{Effort}_{nom}$ as before (i.e., a power function of size)
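A minimal sketch of the complementary equation: the nominal effort is scaled by the product of the 15 multipliers. The multiplier values below are hypothetical and chosen only for illustration; they are not taken from the COCOMO tables.

```python
import math


def adjusted_effort(effort_nom, multipliers):
    """Effort_est = Effort_nom * dem_1 * ... * dem_15."""
    return effort_nom * math.prod(multipliers)


# Hypothetical ratings: three drivers deviate from nominal, twelve stay at 1.0.
multipliers = [1.15, 0.91, 1.08] + [1.0] * 12
effort_est = adjusted_effort(3.67, multipliers)
print(round(effort_est, 2))
```

A multiplier above 1.0 raises the estimate, one below 1.0 lowers it, and a driver rated nominal (1.0) leaves it unchanged.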








System Size Focus:

FUNCTION POINTS


System size: Function Points (FP)

  •   Based on work at IBM from 1979 onwards
     •    Albrecht and Gaffney wanted to measure
          productivity independently of lines of code
     •    Has since been developed further by the International FP
          User Group (which is US based)
     •    Mark II FPs, developed by Symons, are mainly used in
          the UK
  •   Based on functionality of the program


Albrecht function points

 •    Based on program functionality
 •    2 data function types: internal logical files, external interface files
 •    3 transactional function types: external inputs, external outputs,
      external inquiries
 •    Each occurrence is judged simple, average, or complex

  External input types
  •  Input transaction through screens, forms, dialog boxes
  •  Updates internal files

  External output types
  •  Data is output to user by screens, reports, graphs, control signals

  Logical internal files
  •  Standing files used by system
  •  One or more record types (group of data that is usually accessed together)

  External interface files
  •  Input and output passing to and from other computer applications

  External inquiry types
  •  Transactions initiated by user
  •  Provide information, but do not update internal files
  •  User inputs some information that directs system to details required

Albrecht FP weightings
  If judged simple, average, or complex, then assign the following points:

      Type            Simple              Average       Complex
      ILF                  7                   10           15
      EIF                  5                    7           10
      EI                   3                    4           6
      EO                   4                    5           7
      EQ                   3                    4           6
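With the weightings above, a function point count is a weighted sum over all occurrences. A small sketch; the list of occurrences is hypothetical and the names `WEIGHTS` and `function_points` are assumptions.

```python
# Albrecht weightings from the table above.
WEIGHTS = {
    "ILF": {"simple": 7, "average": 10, "complex": 15},
    "EIF": {"simple": 5, "average": 7, "complex": 10},
    "EI": {"simple": 3, "average": 4, "complex": 6},
    "EO": {"simple": 4, "average": 5, "complex": 7},
    "EQ": {"simple": 3, "average": 4, "complex": 6},
}


def function_points(occurrences):
    """Sum the weight of each (type, judged complexity) occurrence."""
    return sum(WEIGHTS[ftype][rating] for ftype, rating in occurrences)


# Hypothetical system: four simple ILFs, one complex EI, one simple EQ.
occurrences = [("ILF", "simple")] * 4 + [("EI", "complex"), ("EQ", "simple")]
fp = function_points(occurrences)
print(fp)  # 37
```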


IFPUG developments
Functional complexity was later defined by rules, e.g. for
internal logical files and external interface files as below:

    Record types     Data types
                     1-19        20-50       > 50
    1                simple      simple      average
    2-5              simple      average     complex
    >5               average     complex     complex
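The table above is a two-way lookup, which might be coded as follows. A sketch under the thresholds shown for internal logical files and external interface files; the helper names `band` and `file_complexity` are made up for this example.

```python
def band(value, upper_bounds):
    """Return the index of the first band whose upper bound is >= value."""
    for i, bound in enumerate(upper_bounds):
        if value <= bound:
            return i
    return len(upper_bounds)


# Rows: record types 1, 2-5, >5. Columns: data types 1-19, 20-50, >50.
MATRIX = [
    ["simple", "simple", "average"],
    ["simple", "average", "complex"],
    ["average", "complex", "complex"],
]


def file_complexity(record_types, data_types):
    row = band(record_types, [1, 5])
    col = band(data_types, [19, 50])
    return MATRIX[row][col]


print(file_complexity(1, 12))  # simple
print(file_complexity(3, 30))  # average
print(file_complexity(6, 60))  # complex
```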

IFPUG external input complexity

   File types       Data types
                    <5          5-15        > 15
   0 or 1           simple      simple      average
   2                simple      average     complex
   >2               average     complex     complex




IFPUG external output complexity

   File types       Data types
                    <6          6-19        > 19
   0 or 1           simple      simple      average
   2-3              simple      average     complex
   >3               average     complex     complex




IFPUG external inquiry complexity

  •   External inquiries are counted both as if they
      were an external input and an external output.
  •   Use higher score of the two.
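One way to read this rule, sketched below: rate the inquiry once as an input and once as an output, keep the higher of the two complexity ratings, and apply the EQ weighting. This reading is an assumption, and the function name `inquiry_points` is hypothetical.

```python
RANK = {"simple": 0, "average": 1, "complex": 2}

# EQ row of the Albrecht weightings.
EQ_WEIGHT = {"simple": 3, "average": 4, "complex": 6}


def inquiry_points(rating_as_input, rating_as_output):
    """Take the higher of the two ratings and return the EQ points for it."""
    rating = max(rating_as_input, rating_as_output, key=RANK.get)
    return EQ_WEIGHT[rating]


print(inquiry_points("simple", "average"))  # 4
```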




Assignment 3

  Common to all five cases
  •   EO: 0
  •   ILF: 4 (all 1 RET, < 20 data types), 4 x Low: 7 = 28
  •   EIF: 0

  1   EI: 2 FTR (LECTURER, DEP), 4 DETs; Low: 3
      EQ: 0
      31 : 3 = 10.3
  2   EI: 1 FTR (LECTURER), 3 DETs; Low: 3
      EQ: 1 FTR (LECTURER), 3 DETs; Low: 3
      34 : 5 = 6.8
  3   EI: 3 FTR (TEACHING, LECTURER, COURSE), 11 DETs
          (do not count activity_ref, staff_id, course_code); High: 6
      EQ: 0
      34 : 6 = 5.7
  4   EI: 3 FTR (LECTURER, COURSE, TEACHING_ACT), 11 DETs; High: 6
      EQ: 1 FTR (TEACHING_ACT), 7 DETs; Low: 3
      37 : 4.5 = 8.2
  5   EI: 4 FTR, 12 DETs; High: 6
      EQ: 2 FTR (LECTURER, COURSE), 2 DETs; Low: 3
      37 : 7.72 = 4.8



From FPs to LoCs
  Lines of code per function point:

  Language           Minimum              Mode             Maximum
                     (minus 1 s.-dev.)    (most common)    (plus 1 s.-dev.)
  C                  60                   128              170
  C#                 40                   55               80
  C++                40                   55               140
  Cobol              65                   107              150
  Fortran 90         45                   80               125
  Fortran 95         30                   71               100
  Java               40                   55               80
  SQL                7                    13               15
  MS Visual Basic    15                   32               41

 After: McConnell, Steve: Software Estimation, Microsoft Press, Redmond, WA, 2006, p. 202.
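Using the table, a function point count converts into a range of lines of code rather than a single number. A sketch with the values above; the 37 FP input is a hypothetical count and `loc_range` is an assumed helper name.

```python
# (minimum, mode, maximum) lines of code per function point, from the table.
LOC_PER_FP = {
    "C": (60, 128, 170),
    "C#": (40, 55, 80),
    "C++": (40, 55, 140),
    "Cobol": (65, 107, 150),
    "Fortran 90": (45, 80, 125),
    "Fortran 95": (30, 71, 100),
    "Java": (40, 55, 80),
    "SQL": (7, 13, 15),
    "MS Visual Basic": (15, 32, 41),
}


def loc_range(fp, language):
    """Return (low, most likely, high) lines of code for a function point count."""
    low, mode, high = LOC_PER_FP[language]
    return fp * low, fp * mode, fp * high


print(loc_range(37, "Java"))  # (1480, 2035, 2960)
```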

Mark II function points

  •   Developed by Charles R. Symons
  •   "Software sizing and estimating - Mark II
      FPA", Wiley & Sons, 1991.
  •   Builds on work by Albrecht
  •   Work originally for CCTA:
     •    Should be compatible with SSADM; mainly used in
          the UK
  •   Has developed in parallel to IFPUG FPs

Mark II function points (cont'd)

  For each transaction (TA) count:
  •   Data types input (Ni): no. of input data items
  •   Data types output (No): no. of output data items
  •   Entity types accessed (Ne): no. of entities accessed

  Technical complexity adjustments (TCA)

        FP count = Ni * 0.58 + Ne * 1.66 + No * 0.26
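The count can be sketched per transaction and then totalled for the system. The weights are the ones in the formula above; the two transactions are hypothetical.

```python
W_INPUT, W_ENTITY, W_OUTPUT = 0.58, 1.66, 0.26


def mark2_fp(n_input, n_entity, n_output):
    """FP count = Ni * 0.58 + Ne * 1.66 + No * 0.26 for one transaction."""
    return n_input * W_INPUT + n_entity * W_ENTITY + n_output * W_OUTPUT


# Hypothetical transactions as (Ni, Ne, No).
transactions = [(10, 2, 5), (4, 1, 20)]
per_transaction = [mark2_fp(*t) for t in transactions]
total_fp = sum(per_transaction)
print([round(fp, 2) for fp in per_transaction], round(total_fp, 2))
```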
Mark II function points (cont'd)
  •   Weightings derived by asking developers
     •    How much effort was spent in previous projects
     •    … on processing inputs
     •    … on processing outputs
     •    … on accessing and modifying stored data
  •   Work out average hours of work
  •   Normalize averages into ratios or weightings
      which add up to 2.5 (adjustment to Albrecht FPs)
  •   … or: use industry averages.
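The normalization step could look like this. A sketch only: the average hours are hypothetical figures, picked so that the result matches the industry-average weights 0.58, 1.66 and 0.26, which themselves add up to 2.5.

```python
def normalize_weights(avg_hours, target_sum=2.5):
    """Scale average hours per element so the weights add up to target_sum."""
    total = sum(avg_hours.values())
    return {name: hours * target_sum / total for name, hours in avg_hours.items()}


# Hypothetical average hours of work per input, entity access and output.
avg_hours = {"input": 1.16, "entity": 3.32, "output": 0.52}
weights = normalize_weights(avg_hours)
print({name: round(w, 2) for name, w in weights.items()})
```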

Using Mark II function points

  •   Calculate FPs for each transaction in a
      system
  •   Total transaction counts to get a count for the
      system
  •   Recall that estimated effort = size (FPs) x
      productivity rate (effort per FP)
  •   Productivity rate obtained from past projects






ANALOGY BASED


Estimating by analogy: case-based reasoning

  •   Source cases: each has attribute values and a known effort
  •   Target case: attribute values are known, effort is unknown
  •   Select case with closest attribute values
  •   Use effort from source as estimate
Estimating by analogy (cont'd)

  •   Identify significant attributes ("drivers")
  •   Locate closest match amongst source cases
      for target
  •   Adjust for differences between source and
      target




Code-Oriented Approach

  •   Envisage number / types of modules in final
      system.
  •   Estimate SLOC of each identified program
     •    Implementation language specific
  •   Estimate work
     •    Take into account complexity and technical
          difficulty.
  •   Calculate work-days effort

Anchor and adjust
[Figure: map with forests, a church with steeple, and a north arrow.
You are here: how do you get to the red cross?
Go to the church by line of sight, then pace the distance on a bearing.]
Machine assistance for source selection (ANGEL)
[Figure: Source A, Source B and the Target plotted by number of inputs (vertical axis)
and number of outputs (horizontal axis); the distance between target and source
is made up of the differences It - Is and Ot - Os.]

     $\text{Euclidean distance} = \sqrt{(I_t - I_s)^2 + (O_t - O_s)^2}$
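Source selection by Euclidean distance can be sketched as a nearest-neighbour search. This is not the ANGEL tool itself, only an illustration of the distance formula; the source cases, their effort values and the target are hypothetical.

```python
import math


def euclidean(target, source):
    """sqrt((It - Is)^2 + (Ot - Os)^2), generalised to any number of attributes."""
    return math.sqrt(sum((t - s) ** 2 for t, s in zip(target, source)))


# Hypothetical source cases: name -> ((inputs, outputs), actual effort in days).
sources = {
    "A": ((7, 15), 120),
    "B": ((9, 10), 90),
}
target = (8, 11)  # (number of inputs, number of outputs)

# Select the source case closest to the target and reuse its effort.
closest = min(sources, key=lambda name: euclidean(target, sources[name][0]))
estimate = sources[closest][1]
print(closest, estimate)  # B 90
```

In practice the attributes would usually be scaled to comparable ranges first, so that no single attribute dominates the distance.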
Stages: identify

  •   Significant features of the current project
  •   Previous project(s) with similar features
  •   Differences between the current and previous
      projects
  •   Possible reasons for error (risk)
  •   Measures to reduce uncertainty







CONCLUSIONS





Some conclusions: how to review
estimates
  Ask the following questions about an estimate
  !   What are the task size drivers?
  !   What productivity rates have been used?
  !   Is there an example of a previous project of
      about the same size?
  !   Are there examples of where the productivity
      rates used have actually been found?
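
Two of these questions can be checked mechanically against past project data. The Python sketch below is one possible way to do so, not a prescribed procedure: the history records, the 25% size tolerance, and the function names are all hypothetical.

```python
def implied_rate(size, effort_days):
    """Productivity rate implied by an estimate, in size units per day."""
    return size / effort_days


def review(estimate, history, size_tolerance=0.25):
    """Compare an estimate (size, effort_days) with past projects.

    Reports the implied productivity rate, whether that rate lies within
    the range actually observed, and which past projects are of about
    the same size (within size_tolerance, a hypothetical threshold).
    """
    size, _ = estimate
    rate = implied_rate(*estimate)
    past_rates = [implied_rate(s, e) for s, e in history]
    similar = [(s, e) for s, e in history if abs(s - size) <= size_tolerance * size]
    return {
        "implied_rate": rate,
        "rate_seen_before": min(past_rates) <= rate <= max(past_rates),
        "similar_sized_projects": similar,
    }


# Hypothetical past projects as (SLOC, work-days); rates are 40, 50, 60.
history = [(2000, 50), (5000, 100), (9000, 150)]

result = review((4500, 50), history)
print(result["implied_rate"])            # 90.0
print(result["rate_seen_before"])        # False
print(result["similar_sized_projects"])  # [(5000, 100)]
```

Here the estimate assumes 90 SLOC per day although no past project exceeded 60, which is the kind of discrepancy the review questions are meant to expose.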






Strengths and Weaknesses

  !   Expert judgement:
    !    Expert with relevant experience can provide good
         estimation.
    !    Fast estimation.
    !    Dependent on the „expert“.
    !    May be biased.
    !    Suffers from incomplete recall.








Strengths and Weaknesses

  !   Analogy:
     !    Based on actual project data and past experience.
     !    Similar projects may not exist.
     !    Historical data may not be accurate.


  !   Parkinson, price-to-win:
     !    Often win the contract.
     !    Poor practice.
     !    May have large overruns.




Strengths and Weaknesses

  !   Top-Down (parametric, FP, COCOMO):
    !    System level focus
    !    Faster and easier than bottom-up
    !    Require minimal project detail.
    !    Provide little detail for justifying estimates.
    !    Less accurate than the other methods.








Strengths and Weaknesses

  !   Bottom-Up:
    !    Based on detailed analysis.
    !    Supports project tracking better than other methods.
    !    Estimates address lower level tasks.
    !    May overlook system level cost factors.
    !    Requires more estimation effort than top-down.
    !    Difficult to perform the estimate early in the life cycle.







Strengths and Weaknesses

  !   Algorithmic (FP, COCOMO):
    !    Objective, repeatable results.
    !    Gain a better understanding of the estimation method.
    !    Subjective inputs.
    !    Calibrated to past projects and may not reflect the
         current environment.
    !    Algorithms may be company-specific and not
         suitable for software development in general.







Heemstra and Kuster's survey (cont'd)

  !   Only 50% kept project data on past projects -
      but 60.8% used analogy!
  !   35% did not produce estimates
  !   62% used methods based on intuition - only
      16% used formalized methods
  !   Function point users produced worse
      estimates!


