A Master Thesis Presentation

(Dartington Pottery Training Workshop, 1978)

Situated Learning in Open Source Software Developers:
The Case of Google Chrome Project

Author: Josef Hardi, European Master in Software Engineering
Supervisors: Prof. Barbara Russo, Dr. Richard Torkar
Thursday, August 4, 2011
Introduction

• Situated learning is learning that occurs in the workplace [Brown et al., 1989].
• There is no separation between ‘knowing’ and ‘doing’.
• Situated learning is practiced primarily within a community of practitioners.



Existing Findings
• Learning curve effect.
  • “That the more times a task has been performed, the less time will be required on each subsequent iteration.” [T.P. Wright, 1936]
• [Huntley, 2003]: Mozilla is reported to exhibit a strong learning curve compared to Apache.
• [Au et al., 2009]: Learning is universally present in OSS projects.
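
As a rough illustration of the learning curve effect, the sketch below uses the power-law form commonly associated with Wright's learning curve: each time the number of repetitions doubles, the time per task falls to a fixed fraction of its previous value. The function name `time_per_task` and the 80% learning rate are illustrative assumptions, not values taken from the cited studies.

```python
import math


def time_per_task(first_task_time, repetitions, learning_rate=0.8):
    """Estimate the time per task after a number of repetitions.

    learning_rate is the assumed fraction of time still needed each time
    the repetition count doubles (0.8 is an illustrative value only).
    """
    if repetitions < 1:
        raise ValueError("repetitions must be at least 1")
    exponent = math.log2(learning_rate)  # negative when learning_rate < 1
    return first_task_time * repetitions ** exponent


# Hypothetical example: the first issue takes 10 days to resolve.
for n in (1, 2, 4, 8):
    print(n, round(time_per_task(10.0, n), 2))
```

With a learning rate below 1 the exponent is negative, so the estimated time shrinks as repetitions accumulate, which is the pattern the quoted statement describes.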

Distinctions in this Thesis

• Data are taken from each individual rather than from an aggregation of individuals.
• More insight into individual characteristics,
  • i.e., knowledge depreciation and team roles as factors that affect the learning process.




Research Question 1: Is learning present in OSS developers?
• Hypothesis 1: There is a relation between the accumulated experience and the performance.

Research Question 2: What are the factors that affect learning?
• Hypothesis 2: Knowledge depreciates over time among the OSS developers.
• Hypothesis 3: Core developers resolve issues faster.


Case Study


• Google Chrome Project.
• Duration: 10 months ~ 10 releases (December 2008 - October 2009).




Research Methodology
1. Data Collection: Issue Report Data, Review Interaction Data
2. Data Exploration: Performance, Experience, Team Role
3. Construct Input Data
4. Identification of Learning Curve Models and Data Fitting




Research Methodology (step 1 of 4): Data Collection

Issue Report = [ID, Type, Area, Status, Owner, Open date, Assigned date, Started date, Close date]

Reports excluded from the data set:
1. Unrelated project areas,
2. Invalid issue status,
3. Empty owner name.

Issue Report Data (5,160 entries)
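
A minimal sketch of this cleaning step, assuming the three listed conditions are exclusion criteria. The record type mirrors the Issue Report fields; the area names, status names, and timestamp representation are hypothetical placeholders, since the slide does not list the actual values.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values: the slide does not say which areas or statuses count.
RELATED_AREAS = {"BrowserUI", "WebKit"}
VALID_STATUSES = {"Fixed", "Verified"}


@dataclass
class IssueReport:
    id: int
    type: str
    area: str
    status: str
    owner: Optional[str]
    open_date: int                 # assumed Unix timestamps in seconds
    assigned_date: Optional[int]
    started_date: Optional[int]
    close_date: Optional[int]


def keep_report(report):
    """Return True when none of the three listed conditions applies."""
    if report.area not in RELATED_AREAS:
        return False  # 1. unrelated project area
    if report.status not in VALID_STATUSES:
        return False  # 2. invalid issue status
    if not report.owner:
        return False  # 3. empty owner name
    return True


def clean(reports):
    """Keep only the reports that pass all three checks."""
    return [r for r in reports if keep_report(r)]
```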
Research Methodology (step 1 of 4): Data Collection

Interaction = [Owner, Reviewer, Comment date]

 "ben","sky",1226700214
 "ben","sky",1226706864
 "ben","pkasting",1226707765
 "mal","tony",1226809276
 "sgk","tony",1226874776
 "phajdan.jr","deanm",1227808551
 "phajdan.jr","deanm",1227809341
 "phajdan.jr","mark",1228496086
 ...

Review Interaction Data (12,037 entries)
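
The interaction rows can be read with a few lines of code. The sketch below assumes the third field is a Unix timestamp in seconds (the sample values fall in November and December 2008 under that reading); the names `Interaction` and `parse_interactions` are hypothetical.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timezone
from typing import NamedTuple


class Interaction(NamedTuple):
    owner: str
    reviewer: str
    commented_at: datetime


def parse_interactions(text):
    """Parse owner, reviewer, timestamp rows into Interaction records."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 3:
            continue  # skip blank or elided lines such as "..."
        owner, reviewer, stamp = (field.strip() for field in row)
        # Assumption: the third field is a Unix timestamp in seconds (UTC).
        when = datetime.fromtimestamp(int(stamp), tz=timezone.utc)
        records.append(Interaction(owner, reviewer, when))
    return records


SAMPLE = '''"ben","sky",1226700214
"ben","sky",1226706864
"ben","pkasting",1226707765
"mal","tony",1226809276
'''

interactions = parse_interactions(SAMPLE)
# Count how often each owner was reviewed by each reviewer.
pair_counts = Counter((i.owner, i.reviewer) for i in interactions)
```

Counting owner and reviewer pairs in this way gives the kind of tie strengths a network analysis of review interactions would start from.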
Research Methodology (step 2 of 4): Data Exploration

• Measure Performance: average issue resolution time, derived from the Issue Report Data for each developer in each release.
• Measure Experience: number of resolved issues, derived from the Issue Report Data for each developer in each release.

Sample = 274 developers
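
The two measures can be sketched as one pass over the issue reports. This is an illustration under assumptions: resolution time is taken as close date minus open date in days, each issue is already tagged with a release index, and the tuple layout and the name `measure` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

SECONDS_PER_DAY = 86400


def measure(issues):
    """Build per-developer, per-release performance and experience tables.

    issues: iterable of (owner, release, open_ts, close_ts) tuples, where
    release is a hypothetical release index and timestamps are in seconds.
    """
    durations = defaultdict(list)
    for owner, release, open_ts, close_ts in issues:
        days = (close_ts - open_ts) / SECONDS_PER_DAY
        durations[(owner, release)].append(days)
    # Performance: average resolution time; experience: resolved-issue count.
    performance = {key: mean(days) for key, days in durations.items()}
    experience = {key: len(days) for key, days in durations.items()}
    return performance, experience
```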
Research Methodology (step 2 of 4): Data Exploration

• Estimate Team Role: core and periphery structure model [Borgatti, 1999], derived from the Review Interaction Data for each developer in each release.
• Core entails a dense, cohesive structure and periphery entails a sparse, loose structure.
• The estimation is performed by using UCINET.

Sample = 274 developers
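
To make the core and periphery idea concrete, the sketch below scores one candidate partition by correlating the observed ties with an idealized pattern in which every pair involving a core member is connected and periphery members are not connected to each other. This is a simplified illustration, not UCINET's procedure, which also searches for the best partition; the function names and the toy network are hypothetical.

```python
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / (vx * vy) ** 0.5


def core_periphery_fit(nodes, ties, core):
    """Correlate observed ties with the ideal pattern for a given core set.

    ties: set of frozenset({a, b}) pairs, e.g. owner and reviewer pairs.
    """
    observed, ideal = [], []
    for a, b in combinations(sorted(nodes), 2):
        observed.append(1.0 if frozenset((a, b)) in ties else 0.0)
        ideal.append(1.0 if (a in core or b in core) else 0.0)
    return pearson(observed, ideal)


# Toy network: a and b review everyone; c and d never interact directly.
nodes = {"a", "b", "c", "d"}
ties = {frozenset(p) for p in [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")]}
good_fit = core_periphery_fit(nodes, ties, core={"a", "b"})
poor_fit = core_periphery_fit(nodes, ties, core={"c", "d"})
```

A partition that places the densely connected developers in the core scores higher than one that does not, which is the intuition behind the dense core and sparse periphery described above.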
Research Methodology (step 3 of 4): Construct Input Data

• 274 developers: not all of them contribute over the long term.
• Refine: keep the developers who participate in at least 8 releases.
• Result: 38 long-term contributors, forming new longitudinal data sets.
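
The refinement step amounts to a threshold filter. The sketch below assumes that "participating" in a release means having at least one recorded activity in it; the input layout and the name `long_term_contributors` are hypothetical.

```python
from collections import defaultdict

MIN_RELEASES = 8  # threshold stated on the slide


def long_term_contributors(activity, min_releases=MIN_RELEASES):
    """Return developers active in at least min_releases distinct releases.

    activity: iterable of (developer, release) pairs, one per recorded
    activity (assumed here to be a resolved issue).
    """
    releases_by_dev = defaultdict(set)
    for developer, release in activity:
        releases_by_dev[developer].add(release)
    return sorted(
        dev for dev, releases in releases_by_dev.items()
        if len(releases) >= min_releases
    )
```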

Input data set: Performance
[Figure: the data distribution in the group of long-term developers. Vertical axis: average time of resolving issues (log days). Horizontal axis: releases.]
Input data set: Experience
[Figure: the data distribution in the group of long-term developers. Vertical axis: number of resolved issues (N). Horizontal axis: releases.]
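Input data set: Team Role
The team composition in the group of long-term developers, per release (pie charts; the legend identifying the core and periphery shares is not preserved):

Release   Shares
R1        46% / 54%
R2        39% / 61%
R3        39% / 61%
R4        45% / 55%
R5        47% / 53%
R6        47% / 53%
R7        47% / 53%
R8        42% / 58%
R9        42% / 58%
R10       39% / 61%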




Research Methodology (step 4 of 4): Identification of Learning Curve Models and Data Fitting

Model 1: [equation not preserved]
Model 2: [equation not preserved]
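
The model equations are not reproduced in this text. As a rough illustration only, one formulation often used in the organizational-learning literature accumulates experience into a knowledge stock that is discounted by a retention factor each period, K_t = lambda * K_(t-1) + q_t. Whether the thesis uses exactly this form is an assumption, and the name `knowledge_stock` is hypothetical.

```python
def knowledge_stock(resolved_per_release, retention):
    """Accumulate experience into a depreciating knowledge stock.

    resolved_per_release: number of issues resolved in each release (q_t).
    retention: assumed fraction of the previous stock carried over (lambda);
    1.0 means no depreciation, smaller values mean faster forgetting.
    """
    stock = 0.0
    stocks = []
    for resolved in resolved_per_release:
        stock = retention * stock + resolved
        stocks.append(stock)
    return stocks


# Hypothetical developer resolving 10 issues in each of three releases.
no_forgetting = knowledge_stock([10, 10, 10], retention=1.0)
with_forgetting = knowledge_stock([10, 10, 10], retention=0.5)
```

With a retention factor of 1.0 the stock equals cumulative experience; any value below 1.0 makes older experience count for less than recent experience.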


Result Summary

Hypothesis   Variable         Model 1    Model 2    Supported?
H1           KnowledgeStock   -0.01***   -0.01***   Yes
H2           Lambda           0.94***    0.94***    Yes
H3           TeamRole         NA         0.18       No

*** Statistically significant, p < 0.001
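
One way to read the Lambda estimate, assuming it is the fraction of the knowledge stock carried over from one release to the next (the slides do not define it), is to compound it over several releases. The helper name `retained_share` is hypothetical.

```python
def retained_share(retention, releases):
    """Fraction of an initial knowledge stock left after a number of releases."""
    if releases < 0:
        raise ValueError("releases must not be negative")
    return retention ** releases


LAMBDA = 0.94  # estimate reported in the result summary

for n in (1, 5, 10):
    print(n, round(retained_share(LAMBDA, n), 3))
```

Under this reading, a little over half of the knowledge gained in the first release would still count after ten releases, so depreciation is present but slow.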

Threats to Validity
Internal Validity
• The improvement in issue resolution might be caused by improvements in the system design.
• Some of the issue data are incomplete.

External Validity
• Both models have very low statistical predictive power (less than 5%).

Construct Validity
• The estimated core and periphery structure might not reflect the real situation. However, the communication pattern is the best indicator available.

Conclusion
• The results confirm that learning is present among open source software developers.
• Knowledge does not depreciate substantially in the Google Chrome team (Lambda = 0.94).
• Whether core developers work faster than those in the periphery remains inconclusive.
• Methodological contribution: a method to harvest and analyze data from code reviews.

Thank you!



Bolzano, 8 October 2010