On the computation of Truck Factor

Marco Torchiano, Associate Professor
Is My Project’s Truck Factor Low?
Theoretical and Empirical Considerations
About the Truck Factor Threshold

M. Torchiano, F. Ricca, A. Marchetto
Presenter: M. Morisio
Agenda
• What is the Truck-Factor?

• What happens to TF in OSS projects?

• When is TF low?
Truck Factor
[Figure: developers Bob, Alice, and Joe; files File 1, File 2, File 3, and File 4]
Truck Factor
[Figure: developers Bob, Alice, and Joe; files File 1, File 2, File 3, and File 4]
Truck Factor
[Figure: developers Bob, Alice, and Joe; files File 1, File 2, File 3, and File 4]

The remaining developers know 75% of the system (3 out of 4)
Truck Factor
[Figure: developers Bob, Alice, and Joe; files File 1, File 2, File 3, and File 4]

The remaining developer knows just 50% of the system (2 out of 4)
Truck Factor
• the number of developers on a team who
  have to be hit with a truck (i.e., to go on
  vacation, to become ill, or to leave the
  company for another) before the project is in
  serious trouble
                        i.e.
• before the remaining developers know less
  than T% of the modules
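The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tool: the knowledge map is a hypothetical assignment chosen only to be consistent with the 75% and 50% captions of the earlier slides, the function names are invented for this example, and TF is read here as the smallest number of lost developers that can push the known share of files below T.

```python
from itertools import combinations

# Hypothetical knowledge map (an assumption, not read off the slides):
# each developer maps to the set of files he or she knows.
KNOWLEDGE = {
    "Bob": {"File 1"},
    "Alice": {"File 2", "File 3"},
    "Joe": {"File 3", "File 4"},
}


def coverage(knowledge, lost):
    """Fraction of all files still known once the `lost` developers are gone."""
    all_files = set().union(*knowledge.values())
    known = set()
    for dev, files in knowledge.items():
        if dev not in lost:
            known |= files
    return len(known) / len(all_files)


def truck_factor(knowledge, threshold):
    """Smallest number of lost developers that can push coverage below `threshold`.

    Brute force over all subsets, so only suitable for small teams.
    """
    devs = list(knowledge)
    for k in range(len(devs) + 1):
        # Worst case over every group of k developers that could be lost.
        worst = min(coverage(knowledge, set(group)) for group in combinations(devs, k))
        if worst < threshold:
            return k
    return len(devs)  # only reached when threshold <= 0


print(coverage(KNOWLEDGE, {"Bob"}))           # share known after losing Bob
print(coverage(KNOWLEDGE, {"Bob", "Alice"}))  # share known after losing Bob and Alice
print(truck_factor(KNOWLEDGE, 0.6))           # TF for T = 60%
```

With this map, losing Bob leaves 3 of 4 files known and losing Bob and Alice leaves 2 of 4, matching the two captions. The exhaustive search grows exponentially with team size, so it is meant only to make the definition concrete.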
Which factors influence TF?
• Team size (n)
• Residual knowledge threshold (T)
  – Minimum proportion of known files below which
    the TF is reached
• Knowledge ratio (KR)
  – The average proportion of files known by each
    developer
• Knowledge dispersion (σKR)
  – Standard deviation of KR
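As a rough illustration of these factors, the sketch below computes n, KR, and σKR for a small team. The knowledge map and the function name are hypothetical, and using the population standard deviation (rather than the sample one) is an assumption, since the slides do not say which is meant.

```python
from statistics import mean, pstdev

# Hypothetical knowledge map (an assumption for illustration only):
# each developer maps to the set of files he or she knows.
KNOWLEDGE = {
    "Bob": {"File 1"},
    "Alice": {"File 2", "File 3"},
    "Joe": {"File 3", "File 4"},
}


def knowledge_ratios(knowledge):
    """Proportion of the project's files known by each developer."""
    total_files = len(set().union(*knowledge.values()))
    return {dev: len(files) / total_files for dev, files in knowledge.items()}


ratios = knowledge_ratios(KNOWLEDGE)
n = len(ratios)                          # team size
kr = mean(list(ratios.values()))         # knowledge ratio (KR)
sigma_kr = pstdev(list(ratios.values())) # knowledge dispersion (sigma KR), population form

print(n, round(kr, 3), round(sigma_kr, 3))
```

Here Bob knows one file out of four and Alice and Joe know two each, so KR is the mean of 0.25, 0.5, and 0.5.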
Sample projects
10 from Google Code and 10 from SourceForge
Project Fragility Threshold
• A project is considered fragile when its TF is
  below a minimum threshold
• Conjecture (Govindaray):
  – Small teams (n<10): 40% of team size
  – Large teams (n≥10): 20% of team size
• In practice:
  – 16 out of 20 OSS projects are fragile
  – Only 4 are exactly at the threshold level
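The conjectured threshold is simple enough to write down directly. The sketch below is one possible encoding, with hypothetical function names; rounding the percentage down to a whole number of developers is an assumption, chosen because it yields a threshold of 1 for teams of 3 or 4, as the speaker notes state.

```python
def fragility_threshold(team_size):
    """Minimum acceptable TF under the conjecture: 40% of n if n < 10, else 20% of n.

    Rounded down to a whole number of developers (an assumption).
    """
    percent = 40 if team_size < 10 else 20
    return team_size * percent // 100  # integer arithmetic avoids float rounding


def is_fragile(tf, team_size):
    """A project is considered fragile when its TF is below the threshold."""
    return tf < fragility_threshold(team_size)


for n in (3, 4, 9, 10, 50):
    print(n, fragility_threshold(n))

print(is_fragile(1, 4))   # TF equal to the threshold: not fragile
print(is_fragile(1, 10))  # TF below the threshold: fragile
```

Under this encoding a project whose TF is exactly at the threshold is not counted as fragile, which is how the slide treats the four projects at the threshold level.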
Maximum TF
• Ideal condition:
  – Developers are split into two groups knowing
    KR ± σ of the system
  – Such knowledge is uniformly distributed among
    files
• The maximum achievable TF is:
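Maximum TF vs. Fragility Threshold
[Figure: maximum TF as a function of team size for different KR values at a fixed σKR = 0.1; the continuous black line is the fragility threshold]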
Conclusion
• We compared the Govindaray threshold to the
  theoretical maximum TF:
  – the threshold appears either above the maximum
    or just barely below, i.e. practically unreachable
• Real and healthy projects, when measured
  against such a metric, appear fragile
• Further work, both empirical and theoretical, is
  needed to define a fragility threshold
  applicable to real projects


Editor's Notes

  1. Good morning everybody. This work concerns theoretical and empirical considerations on the Truck Factor and was performed by Marco Torchiano, Filippo Ricca, and Alessandro Marchetto. Unfortunately they had to stay at home: one babysitting, one hard at work, and one fishing.
  2. In this short presentation I will try to answer three main questions:…
  3. To understand what the Truck Factor is, we need to consider a typical software project where we find developers and files. Each file is known to a subset of developers, typically just a few.
  4. What happens if one of the developers leaves the project or has an accident (he is hit by a truck)? Some files, known only to that unfortunate developer, become unknown to the project.
  5. The remaining developers collectively know just a fraction of the files. (A truck hits another developer.) Some other files, known only to the second unfortunate developer, become unknown.
  6. At this point the remaining team has lost knowledge of most of the system.
  7. [This slide could be skipped, giving the definition while the previous slide is shown.]
  8. Small projects have a large KR (required to know all the files) and tend to have more variability. Large projects tend to have a small KR and smaller variability.
  9. The conjecture was presented in a blog post; there is no conventional publication reporting it. The four non-fragile projects have small teams (n = 3 or 4), so 40% of the team size is 1, and the actual TF is exactly equal to the threshold.
  10. [The formula does not need to be commented on; if need be, the slide could be skipped.]
  11. The maximum TF as a function of team size for different KR values at a fixed σKR = 0.1 (smaller than any real value). It is important to remember that the maximum TF can be obtained only with a perfectly uniform distribution of file knowledge among developers; actual projects are far from this condition. The continuous black line is the fragility threshold as proposed by Govindaray. We can observe that for large projects (n > 10) with a small KR (the typical case) it is quite possible that the project never passes the fragility threshold.