Code sharing and review in the open with rOpenSci

Talk given at Open Con Switzerland 2018 in Bern about code review and sharing with rOpenSci

Published in: Data & Analytics

  1. Code sharing and review in the open with rOpenSci. Julia Gustavsen, PhD
  2. About me • Background: PhD from University of British Columbia (Vancouver, Canada) in marine microbial ecology • Current job: Bioinformatician at health tech company SOPHiA GENETICS (St-Sulpice, VD, CH) • Other activities: rOpenSci reviewer, Software Carpentry instructor
  3. rOpenSci • Scientific data access and analysis via R packages • Tutorials • UnConferences • Supporting post-docs and other researchers • Promoting tools and best practices • https://ropensci.org
  4. The issue of reproducibility in science • The availability of data (and code) from analyses can be problematic for the reproducibility of results. • [Figure: The Effect of Article Age to Receiving Data from the Authors. x-axis: Age of paper (years). y-axis: Probability that data are available or can be shared.] • Source: Vines et al. The availability of research data declines rapidly with article age. Current Biology, 2014. 21:1-4
  5. rOpenSci • Scientific data access and analysis via R packages • Tutorials • UnConferences • Supporting post-docs and other researchers • Promoting tools and best practices • https://ropensci.org
  6. rOpenSci: enabling access to scientific data and reproducibility in analyses • R packages = software written in the R programming language • rOpenSci has >307 R packages available • What kinds of packages and data? Altmetrics, Databases, Geospatial, Image Processing, Text mining and language processing, Computing Infrastructure, Security, Taxonomy • https://ropensci.org/packages/
  7. Example of facilitating database access: NCBI • The package “rentrez” is used to access NCBI’s databases, which hold a large amount of publication and biological data. • NCBI can be accessed through the web interface (“Entrez”), through the FTP site, or through the application programming interface (API). rentrez uses the API to make it easier to get data. • Many tutorials at: ropensci.org/tutorials
  8. R packages submitted to rOpenSci • Submit: author submits R package to rOpenSci • Code review: editor assigns two reviewers • Revisions: rounds of revisions with author and reviewers • Decision: decision by editor, and R package included as part of rOpenSci
  9. Code review process • Aspects of the package to be reviewed: quality, fit, documentation, clarity
  10. Why does this code review work well? • Code review is done in the open. • Happens on GitHub • Info available to other authors and interested parties • Less fear of unfairly negative review for authors • Code of conduct and reviewing guide • Overall benefits of the code review: improved code; author and reviewer learn something (about code, about data accessibility); warm fuzzy feeling of helping another researcher; networking
  11. Get involved with rOpenSci: • Use their packages and discuss! https://discuss.ropensci.org/ • Get involved with onboarding (submit a package, review a package) https://github.com/ropensci/onboarding • More ways to get involved https://ropensci.org/community/ • @rOpenSci ropensci
  12. Thanks! rOpenSci: Scott Chamberlain, Stefanie Butland • Follow-up questions or comments: @JuliaGustavsen, j.gustavsen@gmail.com
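
Slide 7 notes that NCBI can be reached through an API and that rentrez wraps that API to make data easier to get. As a rough illustration of the kind of request such a wrapper assembles, here is a minimal sketch in Python (the talk itself concerns R). It only builds a search URL for NCBI's E-utilities and makes no network call. The function name `build_esearch_url` and the example search term are hypothetical, and this is not how rentrez is implemented.

```python
from urllib.parse import urlencode

# Base URL of NCBI's E-utilities, the programmatic interface to Entrez.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch request URL (hypothetical helper; no request is sent)."""
    params = {
        "db": db,           # Entrez database to search, e.g. "pubmed"
        "term": term,       # free-text search query
        "retmax": retmax,   # maximum number of record IDs to return
        "retmode": "json",  # ask for a JSON response
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative query only; the search term is an assumption.
    print(build_esearch_url("pubmed", "marine microbial ecology", retmax=5))
```

In R, rentrez exposes a comparable search through `entrez_search(db = "pubmed", term = "...")`, so users do not need to assemble such URLs by hand.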
