Better software, better service, better research: The Software Sustainability Institute, ELIXIR and you

Ever spotted some great-looking software only to discover that you can’t get it, it doesn’t work, there is no documentation to help fix it, and the developers don’t have the time or incentive to help? Ever produced some software that you want to be widely used or that you want folks to contribute to? What’s the sustainability of that key platform/library/tool/database your lab uses day in and day out? Are you helping the providers? The same issues stand for data (or, as we now say, “FAIR” data: Findable, Accessible, Interoperable, Reusable) and its metadata. Is anyone looking out for Europe’s data services (the datasets and analysis systems you use and make), the standards they use, and the curators and developers who make them? Or is FAIR just a FAIRy story? I’ll describe how two organisations with quite different structures and approaches, the UK’s Software Sustainability Institute and the ELIXIR European Research Infrastructure for Life Science Data, are working towards the common goal of better software, better service, and better research.
https://www.rothamsted.ac.uk/events/14th-international-symposium-integrative-bioinformatics

Published in: Science

  1. 1. Better software, better service, better research The Software Sustainability Institute, ELIXIR and you Professor Carole Goble Head of Node ELIXIR UK Software Sustainability Institute UK The University of Manchester, UK carole.goble@manchester.ac.uk Keynote: 14th Intl Symposium on Integrative Bioinformatics, IB2018 Rothamsted Research, Harpenden, UK, 13-15 June 2018
  2. 2. My team has produced lots of software and services used by others for a long time… … including data and metadata management and sharing systems … viewer
  3. 3. Shared and sharable data and software key to reproducibility & productivity • Improve transparency, understanding, trust • Eliminate errors • Encourage collaboration • Ease on-boarding “Scholarship is the full software environment, code and data, that produced the result” - Claerbout
  4. 4. Hey, some great looking software! • you can’t get it • it doesn’t work for me • no documentation • developers don’t have resources to help or have gone. Hey, I have some great software! How do I • get it to be widely used? • have folks contribute? • make it sustainable? • get folk who use it to contribute to it?
  5. 5. Hey, how can I get hold of and use data? A great deal of talk of FAIR at Integrative Bioinformatics 2018…….. Hey, how do I make my (meta)data FAIR? Scientific Data 3, 160018 (2016) doi:10.1038/sdata.2016.18
  6. 6. Are they FAIR so I can upload to them and use them? Are they sustained?
  7. 7. FAIR ≠ FREE. No software or data is free. It’s all sponsored.
  8. 8. “Better Software, Better Research” Software Sustainability Institute UK national facility cultivating better, more sustainable, research software to enable world-class research Est 2010 By UK funders Better Software, Better Research IEEE Internet Computing (2014) doi.ieeecomputersociety.org/10.1109/MIC.2014.88 “FAIR Data for Life” ELIXIR European Research Infrastructure operating a sustainable European infrastructure for biological information, supporting life-science research and its translation to society, the bio-industries, environment and medicine. Est 2013 by inter-govt agreement The FAIR guiding principles for Scientific data management and stewardship Scientific Data 3, 160018 (2016) doi:10.1038/sdata.2016.18 http://elixir-europe.org
  9. 9. 4 organisations Edinburgh, Manchester, Southampton, Oxford 21 National Nodes* + Hub *Counting EMBL-EBI as a Nation Seeded an international movement >180 organisations “Act Local Think Global” “Act Global, Think Global….”
  10. 10. Better research Reliable Reusable Reproducible
  11. 11. The research community and research depends on software. Do you use research software? What would happen to your research without software? Survey of researchers from 15 UK Russell Group universities conducted by SSI between August - October 2014. 406 respondents covering representative range of funders, discipline and seniority.
  12. 12. Software Sustainability Institute www.software.ac.uk The research community produces software • 91%: scientific software is important for their own research • 84%: developing scientific software is important for their own research • 53%: claimed to spend more time developing scientific software than they did 10 years ago • 38%: spend at least one fifth of their time developing software. 2000 scientists. J.E. Hannay et al., “How Do Scientists Develop and Use Scientific Software?” Proc. ICSE Workshop Software Eng. for Computational Science and Eng., 2009, pp. 1–8.
  13. 13. The cost of UK research that relies on software • £840m: investment in the 2013-2014 financial year, an amount that has risen by 3% on average over the last four years • 30%: of total research investment has been spent on research which relies on software over the last four financial years. Analysis of data from 49,650 grant titles and abstracts published on Gateway to Research covering 2010-2014.
  14. 14. Investment across UK Research Councils into software use
  15. 15. Software in research papers
  16. 16. Culture change is hard. In 2011 Science changed its editorial policies: “We require that all computer code used for modeling and/or data analysis that is not commercially available be deposited in a publicly accessible repository upon publication.” “After publication, all reasonable requests for data, code, or materials must be fulfilled.” Stodden, Seiler, Ma. An empirical analysis of journal policy effectiveness for computational reproducibility https://doi.org/10.1073/pnas.1708290115
  17. 17. Culture change is hard Stodden, Seiler, Ma. An empirical analysis of journal policy effectiveness for computational reproducibility, PNAS March 13, 2018. 115 (11) 2584-2589; https://doi.org/10.1073/pnas.1708290115
  18. 18. People depend on my software?! But I’m a researcher… I didn’t intend to be a long-term service provider… I only get funded for novelty. Doesn’t work with my tools.
  19. 19. Software Ecosystem Patchworks and Spectrums. Not all software is equal and worth sustaining. It’s all worth being good. Nangia and Katz: https://arxiv.org/pdf/1706.06527.pdf Invisible Domain generic Visible Domain specific Tools Services Workflows Scripts Libraries Frameworks platforms Teams Individuals
  20. 20. Software Ecosystem Patchworks and Spectrums. Not all software is equal and worth sustaining. It’s all worth being good. Nangia and Katz: https://arxiv.org/pdf/1706.06527.pdf Intentional Side-effect Full fledged for reuse Throw-away Code Algorithm
  21. 21. The UK research community making software • 56% of UK researchers develop their own research software or scripts • 73% of UK researchers have had no formal software engineering training • 140,000 UK researchers rely on their own coding skills. Survey of researchers from 15 Russell Group universities conducted by SSI between August - October 2014. 406 respondents covering representative range of funders, discipline and seniority.
  22. 22. Software making practices “As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software” Zeeya Merali, Nature 467, 775-777 (2010) | doi:10.1038/467775a Computational science: ...Error…why scientific programming does not compute. 2000 scientists. J.E. Hannay et al., “How Do Scientists Develop and Use Scientific Software?” Proc. ICSE Workshop Software Eng. for Computational Science and Eng., 2009, pp. 1–8.
  23. 23. http://science.sciencemag.org/content/314/5807/1856.full “Chang’s data are good… but the faulty software threw everything off” “a homemade data-analysis program had flipped two columns”
  24. 24. Llorente et al. Science, 350, 6262 doi:10.1126/science.aad2879 The results presented in the Report “Ancient Ethiopian genome reveals extensive Eurasian admixture throughout the African continent” were affected by a bioinformatics pipeline error that wrongly discarded data
  25. 25. The Software Sustainability Institute • Software Consultancy: Helping the community to develop software that meets the needs of reliable, reproducible, and reusable research • Training: Delivering essential software skills to researchers via CDTs, institutions & doctoral schools • Community: Bringing together the right people to understand and address topical issues • Policy: Collecting evidence on the community’s software issues & policymaking with stakeholders • Outreach: Exploiting our platform to enable engagement, delivery & uptake
  26. 26. The Software Sustainability Institute Software Consultancy Training Community Policy 140+ UK Carpentry workshops 4500+ learners 10 delivery partners Outreach 50+ projects 130+ evaluations 4 surgeries 90+ guides 50,000 readers Network of 112 Fellows across 70 orgs 20+ workshops organised 740 researchers surveyed 50,000 grants analysed Web site & Blogs 150+ contributed articles 20,000 unique visitors per month 3,000 Twitter followers 300+ RSEs engaged Advice to UK, USA and EU govt stakeholders
  27. 27. Guides, Briefing Papers, and the Wisdom of your Peers
  28. 28. Fellowship Programme. The Fellowship programme funds researchers in exchange for their expertise and advice. • travel to conferences, set up and run workshops, organise software sustainability sessions at domain conferences, host, organise or teach training events Annual Collaborations Workshop SSI Fellows
  29. 29. Workshops Software Deposit and Preservation Workshop 11 July 2018, with Jisc https://software.ac.uk/workshops Developing Software Licensing Guidance for the BBSRC 24 April 2017, with ELIXIR-UK and the BBSRC Docker Containers for Reproducible Research, 27-28 June 2017 Specialist International 9th International Workshop on Sustainable Software for Science: Practice and Experiences 29 Oct 2018, Amsterdam, NL 3rd Research Software Engineers Conference 3-4 Sept 2018, Birmingham, UK NSF Workshop Data and Software Citation, 6-7 June 2016, Boston USA Annual Collaborations Workshop Software Credit Workshop - 19 Oct 2015, Natural History Museum, London
  30. 30. Policy making Campaigning for Software Recognition by researchers, publishers, journals, funders, institutions, societies … When and how should I cite? How do I deal with components and teams? Be a better reviewer Wynholds, et al (2012) Data, data use, and scientific inquiry: two case studies of data practices 10.1145/2232817.2232822
  31. 31. Policy making, Campaigning for careers Professionalisation of research software est 2012 at an SSI Collaborations Workshop
  32. 32. A worldwide movement 3rd Conference, 3-4 Sept, Birmingham, UK www.rse.ac.uk www.de-rse.org
  33. 33. Train your Team software, data and library carpentry Basic skills, Train the trainers
  34. 34. Get help Biomolecular systems and protein modelling codes BoneJ: suite of open-source plug-ins for bone shape analysis based on ImageJ Community assessment and building Improved testing framework Packaging and installation Improved coding standards Improved web site Community web portal ionomic data on over 300,000 plant and yeast samples Rehosted service Migration of portal from Purdue to Nottingham Technical analysis of the service + a migration process Changes to ensure the long-term sustainability User assessment Re-architect and scale One-man, small-scale software project into multi-developer programme Chris Wood David Salt Michael Doube
  35. 35. Five steps to better software better research Get Expert Help Train your Team Publish your Code Develop a Software Management Plan Write for strangers
  36. 36. Get a plan and publish… develop, share, preserve Developed and versioned using code repository Published via code repository or website Deposited in digital repository with paper / for preservation SOFTWARE HERIT
  37. 37. Five steps to better software better research Get Expert Help Train your Team Publish your Code Develop a Software Management Plan Write for strangers
  38. 38. Writing for strangers Goldilocks principle • Readers of papers • Reviewers • Future collaborators • Potential users • Potential contributors • Future members of your research group • Current students • Co-authors • You in 6 months’ time “stranger - anyone who doesn’t possess our current short-term memory and experiences” – David Donoho
  39. 39. 2984 Views 574 Downloads 7813 Views 384 Downloads http://doi.org/10.5281/zenodo.1172988
  40. 40. All software is “legacy code” Maintenance = Evolution prepare to repair if it’s used it will evolve Corrective Preventative Adaptive Perfective Keeping the Show on the Road Dealing with change
  41. 41. provenance portability good enough practices access documentation adopt a licence make it discoverable source code accessible citation metadata validation docs test data example data version control, automated build and test, code reviews by mates, modularise, use standards clear and transparent contribution, governance and communication processes packaging, containers dependencies Writing for Strangers ids steps
  42. 42. ELIXIR All Hands 2018, Berlin
  43. 43. agriculture medicine bioindustries environment Operate a sustainable European infrastructure for biological information Support research and translation Connect centres distributed infrastructure coordinated (inter)national data resources, tools, services
  44. 44. ELIXIR Services & Activities Training Communities Policy Data, Tools, Compute, Interoperability Engage European International National Industry domains technologies techniques
  45. 45. European Level EOSC Summit 11 June Open for comments until 5th August https://github.com/FAIR-Data-EG/action-plan http://bit.ly/interim_FAIR_report https://ec.europa.eu/info/events/2nd-eosc-summit-2018-jun-11_en
  46. 46. Researcher Level: Best practices and software carpentry
  47. 47. Distributed infrastructure with shared services ELIXIR operated by Nodes Funded by (inter)national schemes Connected through ELIXIR Access / impact internationally Nodes
  48. 48. Node Services Core and Recommended Resources (with Processes) AAI BioTools Identifiers.org
  49. 49. FAIRification Processes 1. Define driving user question(s) 2. Pre-FAIRification analysis 3. Define semantic model 4. Transform data records 5. Define metadata 6. Deploy FAIR data point 7. Query interface Standards general specific Data Stewards (cf RSE)
  50. 50. Data Validation Open validation services for archetype archival databases and knowledge bases: public APIs, min information checklists, file formats, phenotyping data. ELIXIR-BE, EBI, UK, FR [Frederik Coppens]
  51. 51. Bioschemas.org Universal Lightweight Web Mark-up to Find, Cite, Index, Summarise without API tears DataCatalog Dataset Event Lab Protocol Tool Training Material Protein ProteinAnnotation ProteinStructure Sample Beacon Machine-processable metadata for better software, better search
  52. 52. Bioschemas.org Towards Knowledge Graphs for Biology MarRef Marine Metagenomics Database BioSamples Deposition Database Aside: Google alpha test dataset-search feature (under NDA) invitation….
  53. 53. Describe workflows to be portable, scalable & interoperable with different workflow systems and containerised tools
  54. 54. 58 ELIXIR Tools, Workflows & Containers BioTools Registry Packaging Containers Integration Workflows Benchmarking Info Standards Software Best Practice Communities Interoperability Training Compute Data EDAM
  55. 55. Five steps to better data better research – metadata at source Get expert help Train your Team Publish your Data Develop a Data Management Plan Annotate for strangers
  56. 56. Five steps to better data better research – metadata at source Annotate for strangers • create analysis-friendly data • use a unique identifier for each record • record your processing steps • use standards • try to use platforms and tools that work together & help • save and backup raw data
  57. 57. Tragedy of the Commons: metadata & identifier quality “Creating good metadata takes considerable work …. When investigators act in their own self-interest, taking short cuts to generate metadata as quickly as possible, we should expect that the overall utility of the resource will decline. … a need for easy-to-use solutions that are generic to provide guidance over the entire life cycle of metadata - streamlining metadata creation, discovery, and access, as well as supporting metadata publication to third-party repositories” Mark Musen https://ncip.nci.nih.gov/blog/face-new-tragedy-commons-remedy-better-metadata/ https://metadatacenter.org
  58. 58. The Nodes: The last (or is it first?) mile Bench Benefit HEI’s Institutes Industry HEI’s Institutes Industry Policy Makers Public Nodes “Act Local Think Global” The ‘last mile’ challenge for European research e-infrastructures https://doi.org/10.3897/rio.2.e9933
  59. 59. The Nodes: The last (or is it first?) mile Bench Benefit “Act Local Think Global” Nodes The ‘last mile’ challenge for European research e-infrastructures https://doi.org/10.3897/rio.2.e9933
  60. 60. FAIRDOM Project Commons, Stewardship and the Last Mile https://fairdomhub.org SOPs https://nels.bioinfo.no https://bio.tools/nels NORWAY models Data DATA fair-dom.org Data Stewards
  61. 61. Researcher / Developer Institution / Lab National / International Knowledge Exchange Report: http://www.knowledge-exchange.info/event/ke-approach-open-scholarship The ‘last mile’ challenge for European research e-infrastructures https://doi.org/10.3897/rio.2.e9933 Act Think
  62. 62. Overcoming the Tragedy of the Commons at all scales … TOGETHER…. Help Skills Community Plans and Policies Work for strangers Value Systems Sweatshops Credit Infrastructure funding models FAIR ≠ FREE First Mile Ramps Professionalise Beat Cultural Inertia Pay RSEs Data stewards Skill at Source Services and Practices embedding
  63. 63. Acknowledgements http://www.fair-dom.org http://www.fairdomhub.org http://seek4science.org http://rightfield.org.uk http://www.bioschemas.org http://www.commonwl.org http://www.bioexcel.eu http://www.software.ac.uk http://www.elixir-europe.org
  64. 64. Funder Acknowledgements European Union Horizon 2020 program under grant agreement 676559 Implementation Studies CWL and Bioschemas European Union Horizon 2020 program under grant agreement 675728. European Union Horizon 2020 program under grant agreement 654248. European Union Horizon 2020 program under grant agreement 739563.
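
Slide 51 describes Bioschemas.org as lightweight web mark-up that gives machine-processable metadata, and slide 56 advises using a unique identifier for each record and using standards. The following is a minimal sketch of what such mark-up might look like in practice. It is not taken from the talk. It uses the plain schema.org `Dataset` type with a handful of common properties and does not attempt to reproduce the required or recommended properties of any Bioschemas profile. The function names, the example dataset, and the `example.org` URLs are all hypothetical.

```python
import json
import uuid


def dataset_markup(name, description, url, identifier=None, keywords=()):
    """Build a schema.org Dataset description as a JSON-LD dictionary.

    If no identifier is supplied, a random UUID URN is generated so that
    every record still carries a unique identifier (an assumption made
    for this sketch; a persistent identifier such as a DOI is preferable).
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "identifier": identifier or f"urn:uuid:{uuid.uuid4()}",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
    }


def to_script_tag(markup):
    """Wrap JSON-LD in the script element used to embed it in a web page."""
    body = json.dumps(markup, indent=2, sort_keys=True)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical dataset, used only to illustrate the shape of the mark-up.
example = dataset_markup(
    name="Example ionomics dataset",
    description="Hypothetical dataset used only to illustrate the markup.",
    url="https://example.org/datasets/ionomics-demo",
    identifier="https://example.org/datasets/ionomics-demo",
    keywords=["ionomics", "example"],
)
print(to_script_tag(example))
```

Embedding the printed element in a dataset's landing page lets harvesters read the description without a bespoke API, which is the "without API tears" point on slide 51.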
