Software Sustainability: preserving the future of research software


Talk given at the National Science Foundation on the UK e-Science programme, the UK Software Sustainability Institute, and some of the challenges faced in ensuring long term development and maintenance of scientific software

Published in: Technology, Business
  • Managed Programme gave money to address gaps. Many projects flourished (such as GridSAM, the Application Hosting Environment from RealityGrid and BPEL Designer), but some wilted and faded away.
  • 8 projects with multiple international contributors through SF/CPAN/PyPI
  • With the SSI we have reached a new stage where we are working to support all the current gardeners who are already out there. So, how are we going to do this?
  • Quality of Research funding
  • We are able to have such an impact because of the approaches we have developed in working with the community. Leads to: CSP – how we got better; ENGAGE – how we encourage investment.
  • Interviews, from ENGAGE and from eUptake/eIUS. Distilled into development projects. Guided by database of findings: barriers and enablers. Pushed out through NGS roadshows, websites, newsletters, workshops. Identifying the new requirements.
  • Monte Carlo Treatment Planning (MCTP): Groups of users at Velindre Hospital and collaborating centres will be able to use the NGS-based computationally intensive radiotherapy planning software through the RTGrid portal on a routine basis, both within and outside an NHS firewall. The documentation and software will be of a sufficiently high quality to allow the RTGrid software to be established at institutions without any help from specialists in the RTGrid project. Data protection and security issues will also be addressed.
    Crystal Energy Landscape Application: The application uses a good part of the OMII stack, in particular WS-I, GridSAM, OMII-BPEL and Grimoires. This servlet then invokes the BPEL engine that orchestrates the workflow required to perform the search, and at the end of the search the results are visualised on a web page. The scientists also use this web page to check progress of the calculation, as it is updated as the results come in.
      - replace DMAREL with DMACRYS, which is capable of dealing with much larger molecules and crystal structures
      - expand the BPEL workflow to perform post-processing of the results
      - port the deployment to run on both Legion and Condor pool for testing, and design it to then also run on the NGS so that polymorph calculations can be performed by a wider range of users
    Epigraphy and papyrology image processing: VRE-SDM. Applications developed within eSAD will be encapsulated such that they are easily transported to a distributed computing environment such as the NGS. The presentation to the user will be through a custom development of the NGS applications repository portlet, such that complications such as remote resource and application version selection are handled automatically. This JSR-168 compliant portlet will also fit seamlessly into the portal environment developed within the VRE-SDM project. By basing this development on the NGS application repository we will be able to take advantage of existing web-service endpoints that connect into the computational resources of the NGS using the OMII-UK developed GridSAM software as currently deployed at partner resources of the NGS.
    Strengthening and support for eMinerals RMCS system:
      - Enabling RMCS to work on the hardware provided by partner and affiliate sites in addition to that of the core sites;
      - Supporting one change to the software from the AgentX XML tool (now no longer under active development since the loss of core STFC staff earlier in 2008) to the use of XPath (we have carried out some preliminary work on this);
      - Enhancing support for MS Windows users, including reactivation of a Java GUI (support lost since the STFC financial crisis) and user-friendly packaging of the client tools;
      - Revision and field-testing of the documentation;
      - Support for working with campus grids using Condor; there are some oddities with the Globus–Condor interface that need examination;
      - Support for the NGS training teams;
      - Creation of some use cases with groups of new users, focussing on the DL_POLY and CASTEP modelling codes and the SHETRAN hydrology codes. Specific groups will be easy to select from within the materials modelling community if this proposal is approved; the SHETRAN community is based in Newcastle.
    Configuration parameters for the GENIE simulator: The aim of this project is to provide a fully functional prototype of a 'launchpad' application which will facilitate set-up and launch of GENIE model runs, and to facilitate its use in a GENIE training workshop for PhD students and more senior researchers and in Masters-level teaching units at the University of East Anglia and Bristol in the Spring Semester 2009. After evaluation in these environments, an improved version will be added to the trunk of the GENIE subversion repository and a tagged release will be made to allow the use of the launchpad by anybody using the latest stable release of the model.
    Integrating field work with the e-Lab Notebook with centralized services and archives: This scenario offers integration with grid computing and the associated storage, retrieval and integration of instrument-recorded data. Use of the blog framework makes it easier to store more fully annotated data. The results of other services, for example NGS calculations, can be returned to the blog in an annotated and context-rich format. The investigative computations, “a soft pipelines approach”, can be tried and tested incrementally and recorded for discussion, before formally committing to pipelines and other more rigid workflows. This benefits the wider research community by providing improved context for the data and, significantly, for the processes, as these are recorded automatically and are therefore more easily searchable.
    Strengthening and supporting the text and data analysis toolkit OSCAR: The ease with which developer-users could work with OSCAR, and with which developers could build end-user tools, would be massively increased by refactoring all of OSCAR to the same object-oriented style API, with good configuration support and developer and user documentation. Implementing unit testing across the library will make it easier to maintain in the future. The developer-user utility would also be enhanced by building a component that enables OSCAR to work in the UIMA architecture, and therefore with the various tools provided by NaCTeM. NaCTeM have indicated strong interest in seeing OSCAR integrated with UIMA.
  • Drawing on pool of specialists to drive the continued improvement and impact of research software developed by and for researchers
  • There is a spectrum of approaches. Examples:
  • Based on CSP evaluation and Engage triage
  • JournalTOCS largest collection of TOCs from major publication
  • Update slide for surveymapper?
  • Update slide for surveymapper?
  • How does software sustainability fit within the context of software engineering, community engagement, project management and funding? What are the external factors, like change in effort, timelines and deadlines, licensing, and step changes in product development?
  • No one sets out to make a bad piece of software
  • Frequency Hopping Spread Spectrum (Hedy Lamarr) originally using a piano roll, Nikola Tesla for controlling boats
  • Tools – Signal Data Explorer (SDE): We developed SDE, which is now being used in CARMEN (neuroscience tools and data sharing), in BROADEN and in Rolls-Royce. We exploited SDE through Cybula Ltd. Being used on trains. Started to sell out of the box system.
  • CASTEP: keeping up with the community
  • Allowing people to move makes it easier to bridge gaps as you have a chance of creating common communication structures
  • Become our next collaborator – email
  • Software Sustainability: preserving the future of research software

    1. Software Sustainability: preserving the future of research software
       22 November 2010
       NSF
       Neil Chue Hong
    2. Overview
       e-Science software in the UK
       A brief history
       OMII-UK
       Commissioned Software Programme
       ENGAGE Programme
       Software Sustainability Institute
       Approaches
       Software Preservation
       Challenges
    3. UK e-Science Programme: Preparing the Ground
       “e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it”
       John Taylor, D-G RCUK
       e-Science Centres
       e-Science Pilot Projects
    4. UK e-Science Budget (2001-2006)
       EPSRC Breakdown
       + Industrial Contributions £25M
       + £100M via JISC
       Total: £213M
       Staff costs - Grid Resources, Computers & Network funded separately
       Source: Science Budget 2003/4 – 2005/6, DTI(OST)
       Slide from Steve Newhouse
    5. OMII: Sowing the first seeds
       11 initial projects funded by Managed Programme
       Many projects flourished
       But some wilted and decayed
       OMII set up to harvest and maintain software output of UK e-Science Core Programme
    6. OMII-UK: Cultivating and Nurturing
       Emphasis on helping existing software grow
       Extra gardeners brought in (Edinburgh and Manchester) with their own plant stock
       Making the garden public through initiatives like Google Summer of Code and ENGAGE
       Inviting specialists through the PALs scheme
       Cultivate and sustain community software important to research
    7. Software Sustainability Institute: pruning, staking, grafting
       Working with research software users and developers
       Helping review and refactor
       Providing support and skills
       Identifying areas of convergence
       Producing strong, capable software able to live long and be successfully built on
    8. OMII-UK: Cultivating software for all kinds of users
    9. Software Services for eResearch
       Software Maturation Cycle
       [Diagram labels: Documentation and Training; Research Users; Requirements Gathering; Information Provision; Software Support; Software Improvement; Community Development; Governance; Software Deployment; Infrastr Providers; Deployment; Analysis; Promotion + Exploitation; Integration; Packaging/porting; Design / Code; Evaluation; Software Innovators; Testing / Dev Infrastructure; Software Contributions]
    10. International Collaboration and Impact
        Active users in 35+ countries
        UK: Abbott Labs, Abcam, AstroGrid, BioDA, BioSimGrid, BRIDGES, CancerGrid, Cancer Research UK, CARMEN, Chimactica, CISBAN, CISBIC, Cobra-CT, ConvertGrid, CPIB, CSBE, DAASI, DynamO, Eagle Genomics, EaSTCHEM, eDiaMonD, e-Family, e-Fungi, e-Protein, EDINA, EMBL-EBI, EMBOSS, First Group plc, Fujitsu Labs Europe, GEDDM, GEESE, GeneGrid, Genomic Technology and Informatics, GEODE, GOLD, Healthcare@Home, Integrative Biology, ISPIDER, John Innes Centre, LaQuAT, MAISGrid, MCISB, MESSAGE, MRC Harwell, MRC Human Genetics Unit, nanoCMOS, NCeSS, NeISS, NeRC CEH, NeRC EBC, NGS, NiBHI, NIeeS, ONDEX-SABR, Qurator, Roslin Institute, Rothamsted Research, Shared Genomics, SINAPSE, Unilever Centre for Chemistry, Utopia, VOTES, WSBC
        USA + Canada: BioMoby, BioMart, BioTeam, caBIG, BIRN, ePCRN, FLOSS, Globus Alliance, iCapture, Indiana University, iPlant, J Craig Venter Institute, GEON, LEAD, Lexicon Genetics, MCS, MEDICUS, NASA JPL, NCSA, Partners Healthcare, PlexLogic, RENCI, Secure Data Grid, UNC
        Europe: ACGT, ADMIRE, @NeurIST, AstroGrid-D, BEinGrid, BioSapien, BioSeeds, Casimir, CERN, Ciemat, C-INB, Cnio, CSC Finland, DataMiningGrid, DEISA, D-GRID, eChase, EMBRACE, ENFIN, eSysBio, ESO, GeneSilico, Genomining, GridMiner, GridSphere, HELIO, IBIS, IPB – Halle, INB, Inteligrid, IST, KeyGene, KnowArc, MOTEUR, N2Grid, NBIC, OntoGrid, Orfeus Data Center, PLANET, Provenance, RUPAGATION, Scana, Sigenae, SIMDAT, SPIDR/ESSE, SysMO, UnIDART, ViroLab, VLe-S, VPH, Woonstichting De Key, XtreemOS
        China: CAS, ChinaGrid, EADDG, cnGrid, SAIC, INWA
        Japan: AIST, BioGrid, GeoGrid, NAREGI, RIKEN
        South Korea: KISTI
        SE Asia: GoalNet, IRRI, KooPrime, TCELS, ThaiGrid
        Multinational: Aventis, BASF, Bayer, Bristol-Myers Squibb, Hewlett-Packard, IBM, Microsoft, Novartis, Oracle, Pfizer, Philips, Philip Morris, Roche, Sun
        Africa: National Bioinformatics Network, South Africa
        Australia: Curtin Business School, INWA, TDWG
        South America: ParqueSoft, PCB Chile
        Tutorials to over 2000 researchers: Antwerp, Bangkok, Basel, Boston, Cambridge, Catania, CERN, Chicago, Edinburgh, Hanoi, Hawaii, Helsinki, Leeds, London, Manchester, Newcastle, Nijmegen, Nottingham, Oxford, San Francisco, Seattle, Seoul, Sheffield, Southampton, Tenerife, Tokyo, Toronto, ISSGC 03 to 09
    11. Developing the role of standards in the community
        OMII-UK is instrumental in the development and use of standards
        Enabling interoperation over continental scale
        AHE across TeraGrid, DEISA, EGEE
        DataMINX across SRB, GridFTP
        Reference implementations
        SAGA enabling legacy applications
        WS-DAI for data access
        JSDL/BES/HPC BP for computational job submission
    12. Impact on UK Research
        The top 75% of “Quality of Research” funding is allocated to 49 UK research institutions out of a total of 159 HEIs
    13. Impact on UK Research
        OMII-UK has worked with all 7 of the top research-intensive institutions in each region:
        Oxford, Cambridge, UCL, Imperial
        Edinburgh
        Cardiff
        QUB
    14. Commissioned Software Programme
        2006: Q1 – Initial projects commissioned; open call to community
        £3.4m initial funding for Managed Programme
        Diagram labels: Commissioning, Supporting, Developing; Compute, Info / Registry, Visual / Collab, Infra / Security
        Projects shown: GridSAM, Condor WS, Geodise Lab, AHE, BPEL Designer, Grimoires, Open Grid Manager, MANGO, WSRF::Lite, FINS/FIRMS, WSeSSH
        Deprecated: none listed
    15. Commissioned Software Programme
        2006: Q3 – trials complete; new specific commissions
        Diagram labels: Commissioning, Supporting, Developing; Compute, Info / Registry, Visual / Collab, Infra / Security
        Projects shown: GridSAM, Geodise Lab, AHE, BPEL Designer, Grimoires, Open Grid Manager, KNOOGLE, MANGO, WSRF::Lite, FINS/FIRMS, OMII-AuthZ
        Deprecated: Condor WS, WSeSSH
    16. Commissioned Software Programme
        2007: Q1 – Application focussed projects complete
        Diagram labels: Commissioning, Supporting, Developing; Compute, Info / Registry, Visual / Collab, Infra / Security
        Projects shown: GridSAM, Geodise Lab, AHE, BPEL Designer, Grimoires, Open Grid Manager, KNOOGLE, MANGO, RAVE, WSRF::Lite, FINS/FIRMS, OMII-AuthZ
        Deprecated: Condor WS, WSeSSH
    17. Commissioned Software Programme
        2007: Q3 – Software integrated; new portal and simplified access calls
        £1.4m additional funding for Commissioned Software Programme
        Diagram labels: Commissioning, Supporting, Developing; Compute, Info / Registry, Visual / Collab, Infra / Security
        Projects shown: GridSAM, Geodise Lab, GridBSBroker, RAPID, AHE, BPEL Designer, OGRSH, SAGA, Grimoires, Open Grid Manager, KNOOGLE, RAVE, NGS JSDL App Rep, PAG, WSRF::Lite, OMII-AuthZ, SCAMP, NDG Security, WHIP
        Deprecated: MANGO, Condor WS, WSeSSH, FINS/FIRMS
    18. Commissioned Software Programme
        2008: Q1 – significant support for implementations of standards
        Diagram labels: Commissioning, Supporting, Developing; Compute, Info / Registry, Visual / Collab, Infra / Security
        Projects shown: GridSAM, GridBSBroker, RAPID, AHE, BPEL Designer, OGRSH, SAGA, Grimoires, Open Grid Manager, KNOOGLE, RAVE, NGS JSDL App Rep, VIC + RAT, PAG, WSRF::Lite, OMII-AuthZ, SCAMP, NDG Security, WHIP
        Deprecated: MANGO, Condor WS, WSeSSH, FINS/FIRMS, Geodise Lab
    19. Commissioned Software Programme
        2008: Q3 – start of investment in community development
        Diagram labels: Commissioning, Supporting, Developing; Compute, Info / Registry, Visual / Collab, Infra / Security
        Projects shown: GridSAM, GridBSBroker, RAPID, AHE, BPEL Designer, OGRSH, SAGA, Grimoires, Open Grid Manager, RAVE, NGS JSDL App Rep, VIC + RAT, PAG, WSRF::Lite, OMII-AuthZ, SCAMP, NDG Security, WHIP
        Deprecated: MANGO, Condor WS, WSeSSH, FINS/FIRMS, Geodise Lab, KNOOGLE
    20. Commissioned Software Programme
        2009: Q1 – many projects complete, in use by community
        Diagram labels: Commissioning, Supporting, Developing; Compute, Info / Registry, Visual / Collab, Infra / Security
        Projects shown: GridSAM, GridBSBroker, RAPID, AHE, BPEL Designer, SAGA, Grimoires, Open Grid Manager, RAVE, NGS JSDL App Rep, VIC + RAT, PAG, WSRF::Lite, OMII-AuthZ, WHIP, SCAMP, NDG Security
        Deprecated: MANGO, Condor WS, WSeSSH, FINS/FIRMS, Geodise Lab, KNOOGLE, OGRSH
    21. Commissioned Software Programme
        2009: Q3 – Data call commissioned; focus on community need
        8 projects with multiple international contributors through SF/CPAN/PyPI
        75+ evaluations of 40+ components
        Diagram labels: Commissioning, Supporting, Developing; Compute, Info / Registry, Visual / Collab, Infra / Security, Data
        Projects shown: GridSAM, GridBSBroker, RAPID, AHE, BPEL Designer, SAGA, Grimoires, RAVE, NGS JSDL App Rep, VIC + RAT, PAG, WSRF::Lite, WHIP, SCAMP, NDG Security, DiGS, CIAS, DataMINX, OSCAR
        Deprecated: Open Grid Manager, MANGO, Condor WS, WSeSSH, FINS/FIRMS, Geodise Lab, OMII-AuthZ, KNOOGLE, OGRSH
    22. Case Study: Taverna Workbench
        Initially funded through e-Science myGrid project (2001-2005)
        Directly funded through OMII-UK (2006-2010)
        Plus marketing, outreach, legal and networking
        Platform funding (2009-2014)
        caBIG subcontract
        Eli Lilly development
        40,000+ downloads of Taverna 1.x
        Take up in other domains, e.g. astronomy
    23. Case Study: NERC Data Grid Security
        Provides single sign-on to federated data infrastructure
        NDGS software now installed at major NERC data centres in the UK
        Now used across multiple projects
        Filter based approach and OpenID work used by US Earth System Grid for access to CMIP5 archive
        METAFOR QUESTIONNAIRE
        COWS/NCEO
        Contributions back to Python community
        ndg_saml, ndg_xacml, MyproxyClient
    24. Case Study: VIC + RAT
        Media backbone tools for audio and video maintained by UCL since early 90s
        Used as the basis for Access Grid, VRVS
        OMII-UK funding when other sources cut
        Allowed continued maintenance and bug fixes
        Enabled projects from Australia, Korea to contribute
        However difficulties in sustaining
        Rapid changes in hardware / software
        Too low profile
        Other projects not contributing back
    25. Engaging Research with e-Infrastructure
        53 direct interviews
        200+ interviews total
        Interviews
        30 month programme
        14 projects
        3-6 months duration
        £650,000 funding
        Wider deployment
        17 papers
        10 posters
        50 presentations
        £1.36m further funding
        Projects
        Dissemination
        Adoption
        New requirements
    26. First Phase ENGAGE Development Projects
        High Throughput Humanities for e-Research
        Exposing bioinformatic programs as Web Services
        Protein Molecule Simulation on the Grid
        Enable workflows in a Shared Genomics causality workbench
        Linking and Querying Ancient Texts
        SWARMCloud
        Rapid Chemistry Portals by Engaging Users
    27. Second Phase ENGAGE Development Projects
        Monte Carlo Treatment Planning
        Crystal Energy Landscape Application
        Epigraphy and papyrology image processing
        Strengthening and support for eMinerals RMCS system
        Configuration parameters for the GENIE simulator
        Lab Blog Book
        Strengthening and supporting the text and data analysis toolkit OSCAR
    28. ENGAGE Findings
        Significant findings include:
        the most challenging aspect of e-Science application development is the communication between development and research teams
        there are differing time constraints on researchers and developers
        having good facilitators improves the success of a project
        centralisation of IT services means that it is harder to do exploratory development
        adherence to standards can reduce the barriers for the deployment of technology
        removing complexity can allow researchers to become developers
        there are still issues when trying to migrate from local to national resources
        issues which appear trivial to computer scientists can cause researchers to consider the software unusable
    29. ENGAGE Outputs
        Significant outputs of the development projects include:
        publicly available workflows in daily use by students to do analyses of 15,000 protein sequences
        a protein molecule simulation portal available to any user with a valid UK e-Science certificate
        a live portal being used to teach over 140 students how to optimise molecule structures
        new data exploration techniques being enabled
        a number of follow-on projects funded to take the work pioneered in ENGAGE to a larger or different community
        many improvements to commonly used software being released back to the community
    30. Case Study: Crystal Energy Landscapes
        Understanding polymorphism in drugs
        E.g. Dosage profile
        Chemists
        Computational
        Experimental
        Developers
        Domain
        S/W Engineers
        Integrators
        Research Computing Services
        Facilitator
    31. The Software Sustainability Institute
        A national facility for research software
        Providing services for research software users and developers
        Developing research community interactions and capacity
        Promoting research software best practice and capability
        Sustaining software by helping to negotiate the stages of the software maturity cycle
    32. What the SSI does
        Work with research groups within the UK to improve key research software
          - online materials (tutorials, guides)
          - consultative advice (software evaluation, development process, community engagement, dissemination, workshops+surgeries)
          - collaborative partnerships (usability, quality, maintainability)
        Engagement with international community, doctoral training centres and funding programmes to change policy
        Providing effort, support and guidance to ensure that researchers can continue to use their chosen software as a cornerstone of their research
          - And beyond the lifetime of its original funding cycle
        We help do the things standard grants don’t
    33. What the SSI brings
        Provides specialist skills to drive the continued improvement and impact of research software
          - Drawn from a large and varied pool of expertise
          - PALs programme funds researcher champions
        Led by University of Edinburgh with Universities of Manchester and Southampton
          - Director: Neil Chue Hong
          - Funded by EPSRC for 5 years, 9.5 FTE
          - Builds on existing collaborations and experience from OMII-UK and EPCC
    34. Software Preservation Purposes
        Achieve legal compliance
        Create heritage value
        Enable continued access to data
        Encourage software reuse
        Manage systems and services
        Purpose
    35. How are you going to choose the right approach?
        Preservation (techno-centric)
        Emulation (data-centric)
        Migration (functionality-centric)
        Transition (process-centric)
        Hibernation (knowledge-centric)
        Approach
        SSI effort focused here
    36. Current SSI Guides
        Software development
          - Software development: general best practice
          - Developing maintainable software
          - Testing your software
        Repositories
          - Choosing a repository for your software project
          - Migrating project resources: what to remember
          - Creating and managing SourceForge projects
          - Retrieving project resources from NeSCForge
        Open source
          - Adopting an open-source licence
          - Supporting open-source software
        Community building
          - Recruiting champions for your project
          - Recruiting student developers
    37. SSI Evaluation Criteria
        Importance: the alignment of the research domain to the UK’s strategic research roadmap.
        Enthusiasm: the impact which the work will have on the community, engagement of software authors with process and the likely additional contribution that would be gained from the community.
        Value: the impact on the research outputs. Would the science enabled be significantly improved by the work? This is a measure of the User Demand for improvement.
        Availability: the likelihood that the work would enable the software to reach a new stage of availability, e.g. taking it from within one collaboration to make it fit for the whole research community or a new community.
        Tractability: the impact on the software. Will it be possible to improve easily the quality or performance of the software?
        Opportunity: will the work lead to new opportunities for sustainability, e.g. collaboration with other groups, commercialisation, alternative funding or new effort?
    38. SSI Pilot Projects
        Pilot collaborators:
          - Fusion Energy
          - Climate Policy
          - Geospatial Linked Data
          - Crystal Structure
          - Brain Imaging
          - Scholarly Journals
    39. Case Study: NeISS
        Evaluate impact of traffic control measures over next 5/10/15 years
        Access baseline demographic data about the city
        Execute simulation of traffic system and population
        Visualise simulation outputs
        Augment with new forms of data
        Run dynamic models to assess future patterns (congestion, health, social inequality)
    40. Case Study: NeISS
    41. Case Study: NeISS
    42. Exploiting software for sustainability
        Models
          - Grant Mosaic
          - Institutional support
          - Fully Costed Service
          - External Enterprise / Consultancy
          - Royalties and Fees
          - Donations
          - Advertising
          - T-shirt (spinoff merchandising)
        Vehicles
          - University based
          - Spin out company
          - Consultancy and Customisation
          - Industrial knowledge transfer
          - Contracts
          - Licensing
          - Certification
          - Support services / training
          - Software as a Service
          - Software Foundation
        Most common but what happens when PI retires?
    43. Sustainability in Context
        [Diagram labels: Support / Contributions; Software Sustainability; Community Engagement; Software Engineering; Product Management; Market Development; Funding / Effort]
    44. Software sustainability is part of the process
        Comparable to risk management
        No one right “solution” but many examples of best practice and process
        Plan from before the start if possible
        But must be reviewed regularly
        No longer considering timescales bounded by a project, but considering the product
    45. Software development comes in stages
        Bridging criteria: strength of team; strength of market; proximity of software to market
        Stages shown: Idea, Prototype, Research, Supported, Product
        An idea to solve a problem
        Understand the functionality
        Scaling to work for others
        Allow others to participate
    46. e-Research is multidisciplinary, timescales don’t synchronise
        Gap in Interest?
        Cutting Edge Research
        Timescales vary:
          - ARIES (1989 – 1994)
          - Giant Magnetoresistance (1988 – 1999)
          - Frequency Hopping (1903/1942 – 1976)
          - Bayesian statistics (1763 – 1996)
        Applied Research
    47. Case Study: Signal Processing
        Slide from Jim Austin
    48. Case Study: CASTEP
        Building intellectual access ramps to support incremental engagement – building capacity and capability
        Individual
        Group
        Consortium
        W/ industry
        Community
        Active
    49. Case Study: R-Project
        Basics: Website, mailing list, code repository, issue resolution
        Remove barriers to participation, increase efficiency
        1993: First public release; 2 devs
        1995: Code open sourced; 3 devs
        1996: r-testers list set up
        1997: lists split: r-announce, r-help, r-devel; public CVS; 11 devs
        2000: CRAN split and mirror
        2001: BioConductor
        2003: Namespaces
        2005: i18n, l10n
        2007: R-Forge
        Today: BioConductor (33 core devs), R-Forge (532 projects, 1562 devs), CRAN (1400+ packages)
    50. The Software Maturity Curve
        [Diagram labels: Portals; Quantum chemistry; Cloud Computing; RDBMS; Social Simulation; Workflows; Spatio-Temporal viz; Molecular Dynamics; Geospatial viz; Digitised Doc Analysis; Digital repositories. Axes: Software proliferation, Time. Phases: Innovation, Consolidation, Customisation]
    51. Enabling Innovation
        Supporting emergent disciplines
        Needs recognition of innovative software development as part of funding
        Breaking down barriers
        We cannot assume that the way people interact with resources will conform to expectations
        e.g. researchers will use/store files outside of universities
        Researchers will do whatever they can to get an edge – they will not always conform
    52. Supporting Consolidation
        “e-Science is an organic, emergent process requiring ongoing, coordinated investment from multiple funders and coordinated action by multiple research and infrastructure communities. It is both an enabler of research and an object of research” – RCUK Review of e-Science
        Bridging the expectation gaps between participants
        Maintenance vs. research
        Different timescales for “exciting” work
        Well supported open platforms are the key in the age of the research mashup
        Platforms to enable bottom-up innovation
        Platforms to enable citizen participation
        Competition/innovation built on top c.f. industry
    53. Sustaining Customisation
        “The time constants for real transformative impact and significant competitive advantage is decades” – RCUK Review of e-Science
        Sustain software infrastructure in the long term
        Differing models: through centres; within institutions; distributed
        Need to change perceptions so that software is seen as valuable! (and not just invaluable)
        Lower barriers to community growth and participation
        Increase value of providing services
        Virtually merge + map small amounts of effort / funding
    54. Invest in people
        People are the most important investment
        Adaptability, ability to recognise transferable skills, not strict career paths
        Software developers come from many backgrounds
        If e-Science is multi-disciplinary, multi-institution, multi-scale then make it easier to recognise people’s efforts as they move
        University structures do not make it easy
        These people are key to effective e-Science as they bridge the gap between other participants
    55. The credit question
        How do we get credit for reusing, extending and sustaining software?
        Research credit is based on publication output
        Data citations and credit for reuse are still not commonplace
        Software credit is the next stage
        Otherwise how can we persuade people to contribute back?
    56. A National Facility for Research Software
        Pilot collaborators:
          - Fusion Energy
          - Climate Policy
          - Geospatial Linked Data
          - Crystal Structure
          - Brain Imaging
          - Scholarly Journals
        Become our next collaborators!
        Email:
        Blog:
        Twitter:
        SlideShare:
        YouTube:
        Telephone: +44 (0) 131 650 5030