Implementing Policy
WSSSPE Workshop 2013

Daisie Huang
Biodiversity Research Centre
University of British Columbia
Implementing policy @ WSSSPE

Panel subpresentation on Implementing Policy in Sustainable Software

Published in: Technology, Business
  • Hello, I’m Daisie Huang, and
    I’m an evolutionary biologist at the University of British Columbia
    and I’m also a software engineer.
    I’ll be discussing some matters of implementing policy in sustainable scientific software.
  • To create sustainable software, we need to look at some key issues.
    First, we need to acknowledge that as a software package matures,
    we will face new types of problems, and
    we need to plan for these throughout a package’s life cycle.
    Therefore, making scientific software sustainable means that we define policies and guidelines that the scientific community can follow and implement.
    But the reality of science today is that we have limited resources and rewards to encourage people to follow these policies and guidelines.
    Implementing policy often takes specialized expertise in software engineering.
  • In light of these issues, I’ll be discussing several papers that were contributed to the workshop.
    Some of these papers focus on specific facets of software design that are not often addressed in scientific software development, such as
    API governance and
    software security, and
    some of the papers discuss strategies to implement all of the different facets of sustainable software design in the framework of scientific software.
  • First, Krintz et al. from the Department of Computer Science at the University of California, Santa Barbara discuss one of these important issues: developing systems for API governance.
  • The authors make the point that scientific research is moving away from local hardware environments towards cloud computing.
    Therefore, instead of focusing on access to hardware, we will need to focus on access to the digital assets: the code and the data.
    APIs—programming interfaces—are the main link governing interactions between these assets.
    Because APIs are the main interface between different digital assets, they have to be maintained in a sustainable way.
  • The authors focus on understanding the portability and consistency of APIs used to connect these data archives,
    because changes in APIs affect accessibility of data.
    They define two different types of compatibility,
    semantic compatibility and
    syntactic compatibility.
    They demonstrate an algorithmic method for categorizing a particular API port as “hard” or “easy,” at least for semantic compatibility.
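To make the syntactic side of this distinction concrete, here is a minimal sketch. It is not the authors' algorithm. The `ApiSpec` type, the `syntactic_breaks` and `classify_port` helpers, the "easy"/"hard" rule, and the example operations are all hypothetical, chosen only for illustration. The sketch compares operation names and parameter lists between two versions of an API. Semantic compatibility (whether an unchanged signature still behaves the same way) cannot be detected by a comparison like this.

```python
from typing import Dict, List, Tuple

# Hypothetical API description: operation name -> ordered parameter names.
ApiSpec = Dict[str, Tuple[str, ...]]


def syntactic_breaks(old: ApiSpec, new: ApiSpec) -> List[str]:
    """List operations of `old` that are missing from `new` or whose signature changed."""
    breaks = []
    for name, params in sorted(old.items()):
        if name not in new:
            breaks.append(f"{name}: removed")
        elif new[name] != params:
            breaks.append(f"{name}: signature changed")
    return breaks


def classify_port(old: ApiSpec, new: ApiSpec) -> str:
    """Toy rule (an assumption): a port is 'easy' only if nothing breaks syntactically."""
    return "hard" if syntactic_breaks(old, new) else "easy"


# Made-up example data, not taken from any real service.
v1: ApiSpec = {"get_tree": ("tree_id",), "list_taxa": ("tree_id", "rank")}
v2: ApiSpec = {
    "get_tree": ("tree_id", "format"),   # parameter added: existing callers break
    "list_taxa": ("tree_id", "rank"),    # unchanged
    "get_metadata": ("tree_id",),        # new operation: harmless to old callers
}

print(syntactic_breaks(v1, v2))
print(classify_port(v1, v2))
```

In this toy rule, adding a new operation does not count as a break, because clients written against the old version never call it.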
  • Next, we’ll look at issues of software security.
  • Heiland et al. from the Center for Trustworthy Scientific Cyberinfrastructure at Indiana University discuss issues related to implementing strong security measures for scientific software.
    They point out that cybersecurity is rarely addressed in scientific software design.
  • Security considerations for software vary depending on the maturity level of the software package.
    But when we initially develop scientific software, we generally don’t know what the final maturity level will be.
    Scientific software developers are probably not aware of best practices for cybersecurity.
    So the authors introduce the concept of Software Security Maturity Models, such as OpenSAMM and BSIMM.
    These are used in industry to identify and define security vulnerabilities at different stages of the software life cycle.
  • They suggest that a similar Software Security Maturity Model can formalize this process:
    It provides classification of software security practices.
    It provides a path for tightening security practices as a package’s maturity level increases.
    It emphasizes understandability over complexity.
  • Finally, we’ll look at some papers that discuss implementing sustainability in scientific software.
  • Blanton and Lenhardt from the Renaissance Computing Institute discuss these issues from a user perspective.
  • The authors focus on a point that has been brought up many times in this context:
    There is a tension between writing code that is good enough just to “get it done,” i.e. to publish a paper about scientific results obtained using software,
    and “getting it right,” that is, developing software that is comprehensible to future users and reviewers.
    Just because the elevator panel works like this doesn’t mean it’s sustainable for the long run.
    We don’t have a way to validate that the software used in a paper is actually done right.
    The best way to get software designed correctly is to make sure best practices are considered from the start.
  • The authors highlight two models for sustainable software, at different extremes:
    One is what they call “co-funding”:
    In these projects, usually large, multi-year collaborations,
    there is equal emphasis on both the science and the software development.
    Both are planned into the project from inception.
    In the life sciences, the iPlant Collaborative, the Galaxy Project, and QIIME are good examples of these sorts of large, well-designed projects.
  • At the other extreme, they discuss “software carpentry”:
    in this model, it’s assumed that the scientists themselves will write and maintain their code.
    Groups like Software Carpentry and rOpenSci assume that
    scientists won’t have access to dedicated software engineering,
    so they try to give them tools to use best practices in their own software development.
  • There might be a middle ground here: a way to get the engineering expertise that large co-funded projects have to individual scientist-developers.
    Hilmar Lapp of NESCent and I discuss one such possibility in our paper, Software Engineering as Instrumentation for the Long Tail of Scientific Software.
  • What do we mean when we refer to the “long tail” of scientific software?
    Think of the distribution of resources in scientific software. Most are focused on big projects with lots of community buy-in and funding. But a lot of scientific software exists away from this model.
    For example, scientific software can be used long after the original developer has moved on or the funding runs out.
    Look at MacClade: it was originally released in 1986 and last updated in 2005,
    but it was still cited over 400 times in 2013!
    The scientists who developed it have a newer package, Mesquite, that was meant to replace MacClade, but they haven’t had sufficient time or resources to maintain either package fully, let alone both of them.
  • Another dimension of the long tail can also be found in my particular research domain.
    In the field of phylogenetics, we have a lot of programs that implement different computational methods in slightly different ways.
    Here, Joe Felsenstein has listed some (but not anywhere near all) phylogenetics packages available online.
    Most of these programs are developed by academic scientists, who generally have
    limited training in software engineering,
    limited time or career incentive to improve software, and
    limited funding.
  • So, to summarize a bit:
    Making sustainable software means we have to pay attention to many facets of software design, like APIs, security, user experience, testing, etc.
    A single project that requires one full-time software engineer may actually require fractions of different kinds of engineers.
    But long-tail projects can’t even fund one FTE, let alone one that can address all these facets.
  • Then we have to consider that the users of scientific software are scientists,
    so the developers need to understand the users and the science.
    This is the idea of a “t-skilled” person:
    one who is both well-versed in a scientific domain
    and deeply experienced in one or more facets of software engineering.
    These people are pretty rare in the first place and difficult to retain in academia, because the academic career structure doesn’t incentivize this.
  • We should look at software engineering as an expensive resource, but one that needs to be accessible to scientists at all levels.
    Think of it as analogous to DNA sequencing:
    Sequencers used to be something that individual labs and institutions had to
    buy, maintain, and operate themselves,
    so only well-funded operations had one, and even those probably didn’t use it to full capacity.
    But now, core facilities provide the instrumentation and service to labs of any size.
    Anyone can pay a core facility to sequence their samples for them and provide quality control and bioinformatics advice as additional services.
  • We propose that software engineering can be “instrumented” in a similar way.
    Let’s create a nonprofit center for scientific software engineering.
    This center can hire these t-skilled personnel and provide access to them for projects at contracted cost.
    Because the center is focused on providing development services to scientific projects, it is not tied to the long-term success or failure of any individual project.
    It would emphasize the centrality of doing good science by making functional software tools as envisioned by scientists.
  • So, to conclude…
    Implementing policies to encourage sustainability in scientific software
    requires that many facets of good software design are addressed throughout the lifecycle of these projects.
    But most of them aren’t addressed in the status quo.
    We’ve highlighted some of these facets today and suggested some possible solutions.
    Large projects can afford to hire software engineers with the expertise to implement these facets correctly.
    Grassroots developer groups can provide guidance to scientists about best practices in software development.
    We think there is a place for a software engineering center that can provide
    both engineering expertise and guidance
    with a contract-driven instrumentation model
    to the scientific software in the long tail.
  • Transcript of "Implementing policy @ WSSSPE"

    1. Implementing Policy WSSSPE Workshop 2013 Daisie Huang Biodiversity Research Centre University of British Columbia
    2. Implementing Policy • Key issues: • As software matures, new problems emerge. • Sustainability issues should be addressed throughout the life cycle. • How to implement sustainability when resources are limited?
    3. Implementing Policy ➡ API Governance ➡ Software Security ➡ Sustainability
    4. Implementing Policy ➡ API Governance Developing Systems for API Governance C Krintz, H Jayathilaka, S Dimopoulos, A Pucher, and R Wolski, Department of Computer Science, UC Santa Barbara
    5. API Governance • Scientific research relies on access to digital assets as well as hardware. • APIs govern the interactions between these digital assets. (from phylotastic.org)
    6. API Governance • APIs need to be portable and consistent. • Semantic compatibility • Syntactic compatibility
    7. Implementing Policy ➡ API Governance ➡ Software Security ➡ Sustainability
    8. Implementing Policy ➡ Software Security Toward a Research Software Security Maturity Model R Heiland, B Thomas, V Welch, C Jackson, Center for Trustworthy Scientific Cyberinfrastructure, Indiana University
    9. Software Security
    10. Software Security • A Security Maturity Model can formalize this process: • Provides classification of software security practices. • Provides a path for tightening security practices as a package’s maturity level increases. • Emphasizes understandability over complexity.
    11. Implementing Policy ➡ API Governance ➡ Software Security ➡ Sustainability
    12. Implementing Policy ➡ Sustainability A User Perspective on Sustainable Scientific Software Brian Blanton and Chris Lenhardt, Renaissance Computing Institute
    13. Sustainability • Tension between “getting it done” enough to publish scientific results and “getting it right” for future users.
    14. Sustainability Co-funding Best suited for large, collaborative projects
    15. Sustainability “Software carpentry” Teach scientists to use software development best practices.
    16. Implementing Policy ➡ Sustainability Software Engineering as Instrumentation for the Long Tail of Scientific Software Daisie Huang and Hilmar Lapp, UBC and NESCent
    17. The Long Tail The lifespan of scientific software can be unexpectedly long.
    18. The Long Tail Lots of small programs implement different methods.
    19. Facets of software design • API development • Security • User interface design • Test engineering • Deployment
    20. Facets of software design Phylogenetics/Genomics/Ecology/Molecular Biology/Developmental Biology • API development • Security • User interface design • Test engineering • Deployment
    21. Instrumentation • Software engineering as a resource • Analogous to DNA sequencing facilities
    22. Instrumenting Software Engineering • A scientific software engineering center can provide these resources to many projects. • Governed by long-term vision that is not tied to success or failure of any individual project. • Emphasis on executing good science by making functional tools.
    23. Conclusions • Many facets of software design not addressed in most scientific software projects. • Possible solutions include: • Large projects can hire developers with software engineering expertise. • Providing scientists with software design guidance. • A software engineering center can provide both expertise and guidance to the long tail.