Where does it go from here? The role of software in digital repositories

The open repositories community has made great strides in recent years in addressing interoperability and policy, and in providing the arguments for open access and sharing. One aspect of open research that has come to prominence is the importance of software as a fundamental part of reproducible research, which in turn raises issues around the preservation of software.

In this short presentation, I will describe some of the work that the Software Sustainability Institute (SSI) has been doing to address the structural and policy issues which currently present a barrier to the deposit and use of software in open repositories.

  1. Where does it go from here? The Place of Software in Digital Repositories. 12 July 2012, OR2012, Edinburgh. Neil Chue Hong (@npch), Software Sustainability Institute
  2. Software is pervasive in research
  3. The Software Sustainability Institute: national facility for building better software
     • Better software enables better research
     • Software reaches boundaries in its development cycle that prevent improvement, growth and adoption
     • Providing the expertise and services needed to negotiate to the next stage
        • Software reviews and refactoring, collaborations to develop your project, guidance and best practice on software development, project management, community building, publicity and more…
     Supported by EPSRC Grant EP/H043160/1
  4. Software Sustainability: preservation vs sustainability. Sustainability? Preservation? Images courtesy of London Permaculture (CC-by-nc-sa license) and Mortati (CC-by-nc-nd).
  5. Why are you considering software sustainability? Purpose:
     • Achieve legal compliance
     • Create heritage value
     • Enable continued access to data
     • Encourage software reuse
     JISC-funded, with Curtis+Cartwright
  6. How are you going to choose the right approach? Approach:
     • Preservation (techno-centric)
     • Emulation (data-centric)
     • Migration (functionality-centric)
     • Transition (process-centric)
     • Hibernation (knowledge-centric)
     • Deprecation
  7. Software Carpentry
     • Helping scientists be more productive by teaching them basic computing skills
     • How to use repositories properly is a key skill
  8. Just the Nature of the problem? Hacking is fun. Maintenance is not fun. Courtesy of Greg Wilson, Software Carpentry, from Nature article: Published online 13 October 2010 | Nature 467, 775-777 (2010) doi:10.1038/467775a
  9. “Re-” is the new black
  10. Slide from Carole Goble, JCDL 2012
      [Diagram of “Re-” terms: Reuse, Review, Refresh, Rerun, Repeat, Reproduce (with new data; with new method), Replay, Repurpose, Recover, Reconstruct, Repair. Other labels: new state, same state, good enough, to verify, publication, data, method, documentation, provenance, execution (link data and code).]
      • Drummond C, Replicability is not Reproducibility: Nor is it Good Science, online
      • Peng RD, Reproducible Research in Computational Science, Science 2 Dec 2011: 1226-1227.
  11. The most important: Reward
      • How do we reward people for important software contributions?
      • Traditionally: publish a research paper that happens to mention software
         • Can we provide more direct, acceptable software citations?
      • A Research Software Impact Manifesto
         • damned-alternative-impact-manifesto-research-software
         • NB Authorship is hard
  12. Isn’t software just data?
  13. Boundary: what do we choose to keep?
      • Workflow?
      • Software that runs workflow?
      • Software referenced by workflow?
      • Software dependencies?
      What’s the minimum citable part?
  14. Granularity: Function; Library / Suite / Package; Algorithm; Program; …
  15. Versioning: why do we version?
      • To indicate a change
      • To allow sharing
      • To confer special status
      [Diagram: public versions v1, v2, v3 alongside personal versions v1, v2, v2a, v3, v3a]
  16. Backup, Sharing, Archiving
  17. Differing roles, different repositories
      • sharing, archiving
      • Timescales, Ingest, Policy, Metadata, Licensing, Assurance
  18. Software Metapapers
      • Create a complete scholarly record including “standard” publication, method, dataset and models, and software
         • e.g. modelling and simulation, statistical analysis
         • Enable replay, reproduction and reuse
      • Pragmatic approach is to create a metadata record for the software, and link it to a copy of the software in some storage infrastructure
         • This is a software metapaper
         • Peer-review the metadata, not the software
      • Journal of Open Research Software, and the work by B. Matthews et al.: The Significant Properties of Software: A Study
  19. An acceptable repository
      • Metapaper references an instance of software, stored in a “suitable” repository
         • Clear access / deposit / preservation policy
         • Adherence to standards
         • Ability to easily “transfer”
         • Sustainability of hosting organisation
         • Ability to monitor, check integrity (obsolescence?)
      • We may be storing
         • Binaries, source code (as text or archived), virtual machines(!)
  20. Potential for confusion
      • ‘The right license for all parts of the scholarly record’
         • Victoria Stodden, Enabling Reproducible Research: Open Licensing for Scientific Innovation
      • Commonly used OSI approved licenses include:
         • Apache License, 2.0 (Apache-2.0)
         • BSD 3-Clause “New” or “Revised” license (BSD-3-Clause)
         • BSD 2-Clause “Simplified” or “FreeBSD” license (BSD-2-Clause)
         • GNU General Public License (GPL)
         • GNU Library or “Lesser” General Public License (LGPL)
         • MIT license (MIT)
         • Mozilla Public License 2.0 (MPL-2.0)
         • Common Development and Distribution License (CDDL-1.0)
         • Eclipse Public License (EPL-1.0)
      • Does enabling the deposit of software just confuse those already depositing publications/data?
  21. 5 Stars of Software?
      • Do we need a 5 stars for software?
         • Existence – there is accurate metadata that defines the software
         • Availability – you can access and run the software
         • Openness – the software has an open permissible license
         • Assured – the software provides ways of assuring its correctness
         • Linked – the related data, dependencies and papers are indicated
      • c.f. 5 Stars of Linked Data (Berners-Lee), 5 Stars of Online Journals (Shotton)
  22. Take home points
      1) Researchers are developing more software than ever, and trying to do it better
      2) They want to be rewarded for creating a complete scholarly record – this includes software
      3) We still don’t know the best way to shift from one repository role to another when it comes to software!
      [Diagram labels: Backup, archiving -> sharing ->]
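
The metadata record described in slide 18 and the integrity check mentioned in slide 19 can be illustrated with a small sketch. This is only one possible shape, not a format proposed in the talk: the `SoftwareMetapaper` class, its field names, the `sha256_of` and `is_intact` helpers, and the example URL are all hypothetical, and SHA-256 is an assumed choice of checksum.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class SoftwareMetapaper:
    """Hypothetical metadata record for one deposited copy of a piece of software."""
    title: str
    version: str
    license_id: str       # licence identifier, e.g. "Apache-2.0"
    repository_url: str   # location of the stored copy (placeholder URL below)
    sha256: str           # checksum recorded when the copy was deposited
    related: List[str] = field(default_factory=list)  # linked papers, data, dependencies


def sha256_of(payload: bytes) -> str:
    """Return the hex SHA-256 digest of the deposited bytes."""
    return hashlib.sha256(payload).hexdigest()


def is_intact(record: SoftwareMetapaper, payload: bytes) -> bool:
    """Return True if the stored copy still matches the checksum in the record."""
    return sha256_of(payload) == record.sha256


if __name__ == "__main__":
    deposited = b"print('hello, repository')\n"  # stand-in for a source archive
    record = SoftwareMetapaper(
        title="Example analysis tool",            # illustrative values only
        version="1.0",
        license_id="Apache-2.0",
        repository_url="https://example.org/deposits/example-tool-1.0.tar.gz",
        sha256=sha256_of(deposited),
        related=["doi:10.1038/467775a"],
    )
    print(is_intact(record, deposited))                # unchanged copy
    print(is_intact(record, deposited + b"# edited"))  # modified copy
```

Keeping the checksum inside the record means the metadata can be peer-reviewed and cited on its own, while a repository can still monitor whether the copy it holds is the one the record describes.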