Good Practices for Developing
Scientific Software Frameworks
The WRENCH framework example (and some others)
Rafael Ferreira da Silva
https://rafaelsilva.com
https://wrench-project.org
In collaboration with:
What is the best way to write
secure and reliable applications?
No Code!
Write
nothing,
deploy
nowhere!
How to proceed?
Here are some guidelines…
Good Enough Practices in Scientific Computing
• Among several recommendations, this paper emphasizes the following
aspects for software development:
• Place a brief explanatory comment at the start of every program
• Decompose programs into functions
• Be ruthless about eliminating duplication
• Always search for well-maintained software libraries that do what you need
• Test libraries before relying on them
• Give functions and variables meaningful names
• Make dependencies and requirements explicit
• Do not comment and uncomment sections of code to control a program's behavior
• Provide a simple example or test data set
• Submit code to a reputable DOI-issuing repository
https://arxiv.org/abs/1609.00037
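Several of the recommendations above can be seen together in one small file. The following is only a minimal sketch in Python: the makespan calculation, the function names, and the `EXAMPLE_RUNS` data are hypothetical and chosen purely for illustration, not taken from the paper or from WRENCH.

```python
"""Compute the mean makespan over several simulated runs (illustrative only).

Each run is a list of (start, end) task times. All names and data in this
file are hypothetical and exist only to illustrate the practices.
"""
from statistics import mean

# A tiny built-in test data set, so anyone can try the code right away.
EXAMPLE_RUNS = [
    [(0.0, 4.0), (1.0, 6.0)],
    [(0.0, 3.0), (2.0, 8.0)],
]


def makespan(tasks):
    """Return the time between the earliest start and the latest end."""
    starts = [start for start, _ in tasks]
    ends = [end for _, end in tasks]
    return max(ends) - min(starts)


def mean_makespan(runs):
    """Return the average makespan over all runs."""
    # Reuses makespan() instead of duplicating its logic.
    return mean(makespan(tasks) for tasks in runs)


if __name__ == "__main__":
    print(mean_makespan(EXAMPLE_RUNS))
```

The file opens with a brief explanatory comment, splits the work into two functions with meaningful names, and ships a small example data set alongside the code.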
Best Practices for Scientific Computing
• This paper focuses on the development of scientific software for science
domains:
• Write programs for people, not computers
• Let the computer do the work
• Make incremental changes
• Do not repeat yourself (or others)
• Plan for mistakes
• Optimize software only after it works correctly
• Document design and purpose, not mechanics
• Collaborate
https://doi.org/10.1371/journal.pbio.1001745
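One common way to "plan for mistakes" is to validate inputs explicitly and to assert internal invariants, so that errors surface early instead of silently corrupting results. The sketch below assumes a hypothetical `normalized` helper; it is an illustration of the idea, not code from the paper.

```python
def normalized(values):
    """Scale values so they sum to 1.0, failing loudly on bad input."""
    # Explicit checks for errors a caller can make.
    if not values:
        raise ValueError("values must not be empty")
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    total = sum(values)
    if total == 0:
        raise ValueError("values must not all be zero")

    result = [v / total for v in values]

    # Internal sanity check: guards against programming errors,
    # not against user errors.
    assert abs(sum(result) - 1.0) < 1e-9
    return result
```

Calling `normalized([])` raises a `ValueError` immediately, which is usually easier to debug than a division by zero several calls later.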
Scientific Software Best Practices
• This page highlights best practices for scientific software development:
• Designing your software
• Software Requirements Specification
• Test-driven Development
• Version Control
• Branching Strategy
• Versioning of Releases
• Creating Citable Code
• Coding Guidelines and Code Review
• Continuous Integration
• Gamification
• Integrated Development Environment
https://scientific-software-best-practices.readthedocs.io/en/latest/
Best Practices for Scientific Software
• This blog post focuses on domain science software:
• Hosting
• Packaging / installation
• Documentation
• Assistance
• Testing
• Academic publishing
https://software.ac.uk/blog/2017-11-29-best-practices-scientific-software
However, there are no actual
recommendations for developing
scientific software frameworks!
Here is our approach for the
WRENCH Software Framework…
The WRENCH Simulation Framework
• Objective: Make it easy to develop simulators of complex
Cyberinfrastructure application executions
• Provides high-level, reusable simulation abstractions
• Produces accurate and scalable simulations
https://wrench-project.org
Hosting: Open Source Project
• Some numbers
• > 26K lines of code
• C++, JavaScript
• > 300 files
• ~3000 commits
• 16 contributors
• 12 releases
• 7 branches
• Version control
• Issue tracking
https://github.com/wrench-project/wrench
License
https://choosealicense.com
Documentation
• Installation Guide
• Getting started
• APIs reference
• Tutorials
• Examples
• Focused guides
• 101, 102, etc.
• Tools
• Doxygen
• Sphinx
https://wrench-project.org/wrench/1.8
Testing: Code Coverage
• Unit Tests
• Google Test
• Integration Tests
• CodeCov
• >88% code coverage
• Other free options
• Coveralls
• SonarCloud
https://codecov.io/gh/wrench-project/wrench
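WRENCH's own tests are written in C++ with Google Test. The same unit-testing idea can be sketched with Python's standard `unittest` module. The `speedup` function and the `run_tests` helper below are hypothetical and exist only for illustration.

```python
import unittest


def speedup(sequential_time, parallel_time):
    """Return how many times faster the parallel run is."""
    if sequential_time <= 0 or parallel_time <= 0:
        raise ValueError("times must be positive")
    return sequential_time / parallel_time


class SpeedupTest(unittest.TestCase):
    def test_typical_case(self):
        self.assertAlmostEqual(speedup(10.0, 2.5), 4.0)

    def test_rejects_non_positive_times(self):
        # The error path is tested too, so it counts toward coverage.
        with self.assertRaises(ValueError):
            speedup(10.0, 0.0)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SpeedupTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Both branches of `speedup` are exercised here, which is what a coverage service measures. For a Python project, a tool such as coverage.py (for example `coverage run -m unittest`) can produce the report that is then uploaded to the service.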
Code Quality Analysis
• Automated code review
• Identify issues through static
code analysis
• Duplication
• Vulnerability
• Etc.
• Tools
• CodeFactor
• Codacy
• SonarCloud
https://www.codefactor.io/repository/github/wrench-project/wrench/overview/master
Continuous Integration (CI) / Deployment (CD)
• Merging in small code
changes frequently
• Automated builds, testing,
and deployment
• Tools
• GitHub Actions
• TravisCI
• AppVeyor
• Jenkins
• Bamboo
https://github.com/wrench-project/wrench/actions
Continuous Delivery (CD)
• Kubernetes
https://kubernetes.io
• Amazon AWS / Google
Cloud / Microsoft Azure
• Ansible
https://www.ansible.com
• Packer
https://www.packer.io
https://dz2cdn1.dzone.com/storage/temp/11914589-what-is-continuous-delivery.jpg
Core Infrastructure Initiative (CII)
https://bestpractices.coreinfrastructure.org/en/projects/2357
Gamification
• Earning a little badge gives a (small) feeling of
accomplishment and can be a good motivator to
write better pull requests
Statistics /
Usage
Outreach
• Website
• Presentations
• Publications
Releases
• Semantic versioning
https://semver.org
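Semantic versioning numbers releases as MAJOR.MINOR.PATCH: MAJOR changes for incompatible API changes, MINOR for backward-compatible features, and PATCH for backward-compatible bug fixes. The sketch below is a simplified illustration that handles only the three-number core and ignores pre-release and build metadata. The `Version`, `parse`, and `bump` names are hypothetical, not part of any WRENCH tooling.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    def __str__(self):
        return f"{self.major}.{self.minor}.{self.patch}"


# Numeric identifiers without leading zeros, as semver requires.
_CORE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse(text):
    """Parse a 'MAJOR.MINOR.PATCH' string into a Version."""
    match = _CORE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def bump(version, change):
    """Return the next version for a 'major', 'minor', or 'patch' change."""
    if change == "major":
        return Version(version.major + 1, 0, 0)
    if change == "minor":
        return Version(version.major, version.minor + 1, 0)
    if change == "patch":
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown change type: {change!r}")
```

Because `Version` is a tuple of integers, versions compare numerically: `parse("1.10.0") > parse("1.9.3")` holds, whereas comparing the raw strings would give the opposite answer.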
Some very successful scientific
software framework examples…
Pegasus Workflow
Management System
http://pegasus.isi.edu
• Open source project since 2001
• Documentation with examples
and tutorials
• Tests and continuous
integration
• Thousands of citations
Scikit-learn
• Open source Python
library used by
millions of data
scientists
• Test code coverage:
98%
and many, many others…
Thank you!
Questions?
