Findings from GitHub. Methods, Datasets and Limitations

Slides from my presentation at the Mining Software Repositories conference (MSR'16) on the paper "Findings from GitHub. Methods, Datasets and Limitations", co-authored with Valerio Cosentino and Jordi Cabot.

Published in: Software

  1. Valerio Cosentino, Javier L. Cánovas Izquierdo, Jordi Cabot Flickr/BenNuttall
  2. Motivation
  3. Motivation
  4. Motivation Empirical Methods Employed Dataset Used Limitations Reported Methodology
  5. Motivation Empirical Methods Employed Dataset Used Limitations Reported Methodology Discussion
  6. Methodology
  7. Methodology Title || Abstract || Keywords || IndexTerms INCLUDES “GitHub” OR “Git hub” OR “github”
  8. Results Flickr/mararle
  9. Empirical Methods Employed
  10. Datasets Used
  11. Datasets Used
  12. Datasets Used
  13. Limitations Reported
  14. Limitations Reported
  15. Discussion Flickr/KristinaAlexanderson
  16. Discussion Data Collection Dataset Size Replicability Sampling Longitudinal Studies Variety of methodologies
  17. Discussion Data Collection: Freshness vs. Curation. Dataset Size: Small-medium size. Replicability: > 2/3 not providing dataset access. Sampling: Most use non-probability sampling. Longitudinal Studies: Scarcely used. Variety of methodologies: Replication? Comparisons?
  18. Flickr/JimRafferty Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution 4.0 International license. Thanks! http://tinyurl.com/GitHub-SystRev-Papers Some works might have been ignored Subjectivity issues
  19. Discussion Data Collection Dataset Size Replicability Sampling Longitudinal Studies Variety of methodologies Thanks! http://tinyurl.com/GitHub-SystRev-Papers
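
The search string on slide 7 (Title || Abstract || Keywords || IndexTerms INCLUDES “GitHub” OR “Git hub” OR “github”) can be read as a simple inclusion filter over paper metadata. The snippet below is only a rough sketch of that reading, not the authors' actual tooling: the dictionary keys, the `mentions_github` helper, the case-insensitive matching, and the sample records are all assumptions made up for illustration.

```python
import re

# Hypothetical field names standing in for Title, Abstract, Keywords, IndexTerms.
FIELDS = ("title", "abstract", "keywords", "index_terms")

# Matches "GitHub", "github" and "Git hub"; ignoring case is an assumption
# that is slightly broader than the three literal terms on the slide.
PATTERN = re.compile(r"git\s?hub", re.IGNORECASE)


def mentions_github(paper: dict) -> bool:
    """Return True if any searched field contains one of the query terms."""
    return any(PATTERN.search(paper.get(field) or "") for field in FIELDS)


# Invented sample records, used only to exercise the filter.
papers = [
    {"title": "Mining GitHub issues", "abstract": "..."},
    {"title": "A study of code review",
     "abstract": "We sample projects hosted on Git hub."},
    {"title": "Refactoring in Eclipse", "keywords": "refactoring; IDE"},
]

selected = [p["title"] for p in papers if mentions_github(p)]
print(selected)  # ['Mining GitHub issues', 'A study of code review']
```

A filter like this only yields candidate papers; any later manual screening is outside what the sketch covers.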