Doing Science Properly In The Digital Age - Rutgers Seminar


Seminar given at Rutgers University on 2nd October 2012.

Published in: Technology

  1. Science Properly in the Digital Age
     2 October 2012, Rutgers University
     Neil Chue Hong (@npch), Software Sustainability Institute
  2. Four Paradigms of Research
  3. Software is pervasive in research
  4. Just the Nature of the problem?
     • Maintenance is not fun
     • Hacking new stuff is fun
     • Courtesy of Jo Hannay et al., “How Do Scientists Develop and Use Scientific Software?”
     • Published online 13 October 2010 | Nature 467, 775-777 (2010) | doi:10.1038/467775a
  5. The Software Sustainability Institute
     A national facility for cultivating world-class research through software
     • Better software enables better research
     • Software reaches boundaries in its development cycle that prevent improvement, growth and adoption
     • Providing the expertise and services needed to negotiate to the next stage
     • Developing the policy and tools to support the community developing and using research software
     Supported by EPSRC Grant EP/H043160/1
  6. UK Research Computing Ecosystem
     Diagram labels: People; Computing; Software; Communities; Data Centres; …; Network/Collaboration; Instruments
  7. SSI Organisation
     • Community Engagement (Shoaib Sufi)
       ◦ Fellowship Programme
     • Consultancy (Steve Crouch)
       ◦ Open Call for Projects
       ◦ Software Evaluation
     • Policy (Simon Hettrick)
       ◦ Guides and Case Studies
     • Training (Mike Jackson)
       ◦ Software Carpentry
       ◦ Software Surgeries
     • Collaboration between the universities of Edinburgh, Manchester, Oxford and Southampton
  8. Case Study: Ligand Binding
     • Centre for Computational Chemistry, Bristol
       ◦ New methods for rapid MC sampling of biomolecular systems modelled using QM/MM
       ◦ Developed two codes: ProtoMS (F77) and Sire (C++)
       ◦ Water-Swap Reaction Coordinate method to calculate absolute protein-ligand binding free energies
     • SSI’s work is helping to scale development
       ◦ ProtoMS and Sire are both single-developer codes
       ◦ ASPIRE/ACQUIRE framework has multiple developers
         ▪ Split architecture between ASPIRE (adaptive multiresolution hybrid MD simulation) and ACQUIRE (WorkPacket scheduling system with optimisation for time to result vs “green-ness”)
  9. Case Study: Brain Imaging
     • Brain Research Imaging Centre (BRIC), Edinburgh
       ◦ Develop PrivacyGuard software, a DICOM image de-identification toolkit
       ◦ Created software to support a new multispectral colouring modulation and variance identification technique (“MCMxxxVI”) to identify white matter lesions that are indicative of declining cognitive ability
       ◦ BRIC are not principally software developers, but do provide software to other researchers
     • SSI’s work means the software has been reviewed and refactored
       ◦ Looked at exploitation
         ▪ Usability review, naming/trademark review
       ◦ Made it easier for BRIC staff to maintain and develop
         ▪ Move to standard repositories, testing and documentation processes
         ▪ Examination of licensing for MCMxxxVI
         ▪ Extraction and refactoring to create standalone tools
  10. Case Study: Climate Policy Modelling
      • CIAS team at Tyndall Centre for Climate Change Research, University of East Anglia
        ◦ Develop linked climate and economic models for detailed analysis
        ◦ Their software was not ready to be used by other groups
          ▪ One researcher/developer at UEA, several users
      • SSI’s work means the software is robust enough that it can be installed and used by others
        ◦ Enabled use of the software by the WWFN’s Climascope project and James Cook University
          ▪ Documented software to allow extensions by contributors
          ▪ Made it easier to maintain and back up
          ▪ Added job scheduling to improve modelling throughput
          ▪ New modelling framework enables new models, i.e. new science
  11. Case Study: Textual Studies
      • TextVRE team at CeRCH, King’s College London
        ◦ Developed an environment which is used to integrate various tools used in the e-Humanities textual studies lifecycle
        ◦ Builds on the German TextGrid project, and many other existing tools
      • SSI’s work means the software can be run “out of the box”, an important requirement for the researchers
        ◦ Developed a VM image containing the TextVRE installation
          ▪ Improve installation instructions
          ▪ Develop tests to check each installed component
          ▪ Improve modularisation to allow others to contribute and maintain
        ◦ Feeding back work to TextGrid
  12. The modern researcher…
      • … worries about:
        ◦ Data management and analysis
        ◦ Reproducible research
        ◦ Scalable simulations
        ◦ Integration of models and workflows
        ◦ Collaboration
      Picture credit: Otto Stern, Emilio Segre Visual Archives
  13. Observation 1: Software is pervasive across research
      Corollary: software is bleeding edge and long-tail
      Demanding users are coming from arts + humanities, economics, and social science as well as sciences
  14. Observation 2: A culture of re-use rather than re-invention is not widespread
      Corollary: we have wasted effort and increased siloing
  15. Observation 3: People are “embarrassed” about software
      Corollary: something is broken in the way we regard, recognise and reward software
  16. SSI Drivers and Themes
      • Two key drivers cause people to seek the SSI’s advice:
        ◦ They want to be more productive in their research
        ◦ They don’t want to be embarrassed by appearing worse than their peers
      • Broadly, our work falls into a few key themes:
        ◦ The role and reward of software in research
        ◦ Recognition of software career paths
        ◦ Developing the scientific computing / software development skill base
  17. The Foundations of Digital Research
      Diagram labels: Research; Re-usable; Re-producible; Software Careers; Software Recognition / Reward; Software Skills and Capability
  18. Gap 1: Software Skills Training
      Diagram axis labels: Research Focussed (methods); Programming Focussed (tools); Basic; Advanced
      Diagram items: Software Carpentry; Summer Schools; HPC Short Courses; MSc in HPC / scientific computing; Advanced HPC Training; Programming 101; Programming 201
      Who fills this gap?
  19. Software philosophy as part of the process
      • Foundations of scientific computing in undergraduate courses
        ◦ Like presentation skills
      • Methods of scientific computing in postgraduate courses
        ◦ Like statistics and ethics
      • Show the benefits from the knowledge and methods of digital research
        ◦ Not just programming 101
  20. Best Practices for Scientific Computing
      1. Write programs for people, not computers
      2. Automate repetitive tasks
      3. Use the computer to record history
      4. Make incremental changes
      5. Use version control
      6. Don’t repeat yourself (or others)
      7. Plan for mistakes
      8. Optimise software only after it works correctly
      9. Document the design and purpose of the code, rather than its mechanics
      10. Conduct code reviews
      Paper (including the evidence) being submitted to arXiv and PNAS
  21. Gap 2: Lack of recognition and reward
      • There is an anachronism in the way we conduct and recognise research?
        ◦ REF references software as an output but it is still not easy to get recognition; peer review fails
      • Software careers
        ◦ Researchers who use software
        ◦ Researcher-Developers
        ◦ Research Software Engineers
        ◦ Research Software Support
        ◦ Research Systems Providers
  22. No recognition without reward, no reward without reproducibility?
      • How do we reward people for important software contributions?
      • Traditionally: publish a research paper that happens to mention software
        ◦ Can we provide more direct, acceptable software citations?
      • A Research Software Impact Manifesto
        ◦ alternative-impact-manifesto-research-software
        ◦ NB: Authorship is hard
      • It works for data!
        ◦ Cf. Heather Piwowar’s work
        ◦ 000308
  23. Software Metapapers
      • Create a complete scholarly record including “standard” publication, method, dataset and models, and software
        ◦ e.g. modelling and simulation, statistical analysis
        ◦ Enable replay, reproduction and reuse
      • Pragmatic approach is to create a metadata record for the software, and link it to a copy of the software in some storage infrastructure
        ◦ This is a software metapaper
        ◦ Peer-review the metadata, not the software
      • See the Journal of Open Research Software, and the work by B. Matthews et al.: “The Significant Properties of Software: A Study”
  24. Gap 3: Lack of support infrastructure
      • For example: no digital repository satisfies all of these criteria:
        ◦ Open to anyone in the UK to archive software
        ◦ Software associated with an OSI license
        ◦ Provides a unique, permanent identifier
        ◦ Publishes a preservation/curation/sustainability plan
      • This is just deposit, not even preservation or sustainability
  25. 5 Stars of Software?
      • Do we need 5 stars for software?
        ◦ Existence: there is accurate metadata that defines the software
        ◦ Availability: you can access and run the software
        ◦ Openness: the software has an open, permissive license
        ◦ Assured: the software provides ways of assuring its correctness
        ◦ Linked: the related data, dependencies and papers are indicated
      • Cf. 5 Stars of Linked Data (Berners-Lee) and 5 Stars of Online Journals (Shotton)
  26. Gap 4: Software Maturity and Management
      • Not all software should make it to the next stage
      • Management changes through time, requiring planning
      Diagram labels: Software proliferation; Time; Innovation; Consolidation; Customisation
  27. A More Manageable Ecosystem
      • Discourage duplicative software development in research grants by rewarding reuse and long-term development
        ◦ Need to change perceptions so that software is seen as valuable
        ◦ But understand when it should not proceed to the next stage
      • Different stages should be managed and funded separately
        ◦ Maintenance vs. research vs. development
      • A skilled researcher base is the key in the digital age
        ◦ Create a larger proportion of enabled researchers and provide the ramps to go from desktop to high-end infrastructure
        ◦ Allow and encourage specialism and collaboration
  28. Take home points
      1) Researchers are developing more software than ever, and trying to do it better
      2) We are not adequately providing the training, recognition and reward, and career paths to enable a step change improvement in research software
      3) This is hindering digital research
      4) The only people who can change this situation are people like you!
  29. A national facility for cultivating world-class research through software
      • Current collaborations
      • Become our next collaborators!
      • Website: