TOWARDS AN
EMPIRICAL ANALYSIS
OF THE
MAINTAINABILITY OF
CRAN PACKAGES
Maëlick Claes
Tom Mens & Philippe Grosjean
Software ...
INTRODUCTION
R & CRAN
http://www.r-­project.org
Statistical environment based on the S language
Package system
Packages contain code (R, C, Fort...
CRAN
http://cran.r-­project.org
"Official" repository containing more than 5000 packages
Strict policy for package accepta...
"DESCRIPTION" FILE
PACKAGE DEPENDENCIES
HAS GROWTH BECOME LINEAR?
CRAN CHECKS
R CMD check command tool for checking package correctness and quality
Package status: OK, NOTE, WARNING or ERR...
EXAMPLES
Package listing
Package with ERROR status
NUMBER OF PACKAGES FOR EACH
FLAVOR
HOW DO FLAVORS
IMPACT ERRORS?
EVOLUTION OF ERRORS FOR
EACH FLAVOR
NUMBER OF ERRORS EXTRACTED
WHAT IS THE
CAUSE OF ERRORS
IN CRAN AND
HOW ARE THEY
FIXED?
SOURCES OF STATUS CHANGES
Package Update (PU)
ERROR status change coincides with the release of a new package
version
Depe...
SOURCES OF NEW ERRORS AND
ERROR FIXES
ARE ERRORS CAUSED AND FIXED
BY DEPENDENCY UPDATE REALLY
CAUSED AND FIXED BY IT?
WHAT IS THE SOURCE OF ERRORS
FIXED BY PACKAGE UPDATE?
HOW LONG DOES
IT TAKE TO FIX AN
ERROR?
NUMBER OF DAYS NEEDED TO FIX
ERRORS
NUMBER OF DAYS NEEDED TO FIX
ERRORS WITH PACKAGE UPDATE
CONCLUSION
CONCLUSION
Some flavors are more error prone than others
High amount of errors caused by external factors
Most errors fixe...
FUTURE WORK
Refine errors by looking at package content
Static analysis of the R code
Function dependency graph
Rerun some...
THANKS FOR YOUR ATTENTION

QUESTIONS?
Upcoming SlideShare
Loading in...5
×

Towards an empirical analysis of the maintainability of CRAN packages

201

Published on

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
201
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
3
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Towards an empirical analysis of the maintainability of CRAN packages

  1. 1. TOWARDS AN EMPIRICAL ANALYSIS OF THE MAINTAINABILITY OF CRAN PACKAGES Maëlick Claes Tom Mens & Philippe Grosjean Software Engineering Lab & Numerical Ecology of Aquatic Systems Lab 17th December 2013, BENEVOL 0
  2. 2. INTRODUCTION R & CRAN
  3. 3. http://www.r-­project.org Statistical environment based on the S language Package system Packages contain code (R, C, Fortran,...), documentation, examples, tests, datasets,... Dependency relations between packages
  4. 4. CRAN http://cran.r-­project.org "Official" repository containing more than 5000 packages Strict policy for package acceptance Package quality regularly checked & archive process. Complaints in the community Hornik 2012, Are there too many R packages?
  5. 5. "DESCRIPTION" FILE
  6. 6. PACKAGE DEPENDENCIES
  7. 7. HAS GROWTH BECOME LINEAR?
  8. 8. CRAN CHECKS R CMD check command tool for checking package correctness and quality Package status: OK, NOTE, WARNING or ERROR Every package release must pass the test on two OS to be accepted on CRAN R CMD check rerun daily on the whole set of packages Packages with inactive maintainer and ERROR status are archived when the next non-­minor version of R is released A combination of R version, OS, architecture and C compiler forms a flavor
  9. 9. EXAMPLES Package listing Package with ERROR status
  10. 10. NUMBER OF PACKAGES FOR EACH FLAVOR
  11. 11. HOW DO FLAVORS IMPACT ERRORS?
  12. 12. EVOLUTION OF ERRORS FOR EACH FLAVOR
  13. 13. NUMBER OF ERRORS EXTRACTED
  14. 14. WHAT IS THE CAUSE OF ERRORS IN CRAN AND HOW ARE THEY FIXED?
  15. 15. SOURCES OF STATUS CHANGES Package Update (PU) ERROR status change coincides with the release of a new package version Dependency Update (DU) ERROR status change coincides with the release of a new package version of a dependency External Factors (EF) ERROR status change without a new package version or a new version of any dependency
  16. 16. SOURCES OF NEW ERRORS AND ERROR FIXES
  17. 17. ARE ERRORS CAUSED AND FIXED BY DEPENDENCY UPDATE REALLY CAUSED AND FIXED BY IT?
  18. 18. WHAT IS THE SOURCE OF ERRORS FIXED BY PACKAGE UPDATE?
  19. 19. HOW LONG DOES IT TAKE TO FIX AN ERROR?
  20. 20. NUMBER OF DAYS NEEDED TO FIX ERRORS
  21. 21. NUMBER OF DAYS NEEDED TO FIX ERRORS WITH PACKAGE UPDATE
  22. 22. CONCLUSION
  23. 23. CONCLUSION Some flavors are more error prone than others High amount of errors caused by external factors Most errors fixed quickly without developer intervention Maintenance effort needs to be focused on fixing errors caused by others Need for a more specific tool for problems related to dependency changes
  24. 24. FUTURE WORK Refine errors by looking at package content Static analysis of the R code Function dependency graph Rerun some part of R CMD check Impact of the maintainers on the time required to fix packages and amount of errors for each flavor Do packages containing tests are more error prone? How does the dependency network evolve over time?
  25. 25. THANKS FOR YOUR ATTENTION QUESTIONS?
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×