SlideShare a Scribd company logo
1 of 21
Download to read offline
Best practices for DuraMat software dissemination
Anubhav Jain, Silvana Ovaitt, Cliff Hansen, Robert White
April 17, 2023
slides (already) posted to
https://hackingmaterials.lbl.gov
DuraMat funds many projects that produce software products –
but currently without many standards or guidance
Project Link
DuraMat data hub https://datahub.duramat.org
PV Analytics https://github.com/pvlib/pvanalytics
PV Ops https://github.com/sandialabs/pvOps
VocMax https://github.com/toddkarin/vocmax
PV Climate Zones https://github.com/toddkarin/pvcz
PVTools https://pvtools.lbl.gov/string-length-calculator
PV ARC thickness estimator https://github.com/DuraMAT/pvarc
PV-terms https://github.com/DuraMAT/pv-terms
Comparative LCOE calculator www.github.com/NREL/PVLCOE
PV-Pro SDM parameter estimation https://github.com/DuraMAT/pvpro
WhatsCracking https://datahub.duramat.org/dataset/whatscracking-application
• To help you share software products effectively, including:
– Sharing best practices in software dissemination
– Save time and effort in the dissemination process
– Establishing some consistency across projects
– Getting you (and DuraMat) more credit for software products
• This is intended to be a discussion, so
comments/questions/improvements are welcome
– Software best practices often move very quickly as new tools are
introduced
Purpose of this discussion
The level of dissemination should depend on the purpose of the software
Level 1 Level 2 Level 3
Code maturity
and novelty
Code is mostly data
analysis/plots, or using
other already published
packages. The code is
largely intended to
demonstrate usage or
clarify an analysis.
Novelty is low, implementing
published ideas.
Code is structured into
functions which are
intended to serve as a
general toolset for other
analyses.
Code may contain new
algorithms that may require
a disclosure.
Code is rationally and
thoughtfully organized into
packages, modules, classes
and functions. It may serve
as a framework for
downstream analyses.
Code may contain new
algorithms that may require
a disclosure.
Intended Use
and Lifetime
Typically, used to support
and document published
analyses for enhanced
reproducibility – e.g.,
something akin to
supporting information for a
journal publication.
Typically, may serve as
documentation for the
innovations of an entire
project, e.g., for multiple
publications. However, the
project may no longer be
actively maintained after
project end.
The project is intended to
be used and maintained
long-term by the project
team and a community of
users; project lives on even
if/when initial developers
exit the project
Level 1: e.g., one-off” scripts that support a plot, table, etc. in another document
q Follow Laboratory-specific guidelines for approval to release your code.
q Inline code documentation. Each public function and class definition
should have its own documentation (e.g., docstring). Use a consistent
format (an example is provided later in this document). A docstring
should include:
q Function purpose
q Input Parameters
q Return parameters
q References (if any)
q Add README
q How to install/run the code as well as associated tests
q How to cite it (i.e. OSTI record, publication, or other)
q Does the README clearly describe the code’s purpose and its
organization
q Add LICENSE that conforms to lab and funding guidelines and includes
copyright-specific wording (see example at end)
will discuss
bullet points
with arrows
in subsequent
slides
Inline code documentation example (Python)
Some notes:
• The formatting of the docstring can
depend on if you are autoconverting
the docstrings to HTML documentation
• Common formatting examples include
reST (restructured text), Google
formatting, epyDoc, etc.
• You can add type hinting to further help
in code readability as well as the ability
to use static type checking tools
Basic README example
https://github.com/DuraMAT/pv-terms/blob/master/README.md
File in Markdown (.md) format
Github page
Basic LICENSE file
https://github.com/NREL/PV_ICE/blob/main/LICENSE.md
• Talk to your lab’s IP / IT departments for guidance
• BSD/MIT licenses are examples of very “open” licenses that allow others to
do what they’d like with the software
– BSD typically gives some more protection against others using your
name to promote their product, e.g. “our commercial product uses LBL-
approved software technology for its analysis” or “uses the same
algorithms developed by the brilliant scientist <your_name_here>”
• Be careful about choosing licenses that require all downstream code to also
use the same license, e.g. GPL/Apache
– e.g., if you leave DuraMat and work for a company, you may no longer be
able to use your own code as companies typically avoid any GPL code
Choosing a license – some guidance
Level 2: e.g., Repository used for lifetime of a project (software itself is a work product)
q All Level 1 items
q In addition to lab-specific guidelines, ensure that DOE
requirements are being met. For example, this likely
includes:
q Software Record (gets recorded in OSTI.gov and
helps in reporting purposes / credit)
q Lab-specific approval to release code
q Set up a public facing Github repository. This could be
hosted by the project organization, by your institution, or
by your research lab. Examples include:
q github.com/DURAMAT
q github.com/NREL
q Additional README components
q Screenshot or visual aid of the project
q Current status of the project (testing use,
production use, actively maintained, etc.)
q Funding information and institutional branding
(logo, funding acknowledgement text)
q Add Contributor license agreement (CLA) for
contributors
q Include any examples of use, Jupyter notebooks or
scripts used for scientific publications that you want to
make available, and data that can also be made available
for testing/demonstration
q Use a standard layout for the repository (an example of a
standard Python layout is provided at the end of this
document)
q Add a consistent versioning scheme. Examples include
semantic versioning (v0.0.1) and date-based versioning
(v2023.01.25); tools like versioneer may help.
q Ensure your software is easy to install locally, including
any necessary dependencies. For example, Python
projects may include files such as setup.py or
requirements.txt.
q Report your software to your funding program so it can
be included in accomplishments
The following text is at the bottom of the LBL BSD-3 license:
You are under no obligation whatsoever to provide any bug fixes, patches, or upgrades to the features,
functionality or performance of the source code ("Enhancements") to anyone; however, if you choose to
make your Enhancements available either publicly, or directly to Lawrence Berkeley National Laboratory
or its contributors, without imposing a separate written license agreement for such Enhancements, then
you hereby grant the following license: a non-exclusive, royalty-free perpetual license to install, use,
modify, prepare derivative works, incorporate into other computer software, distribute, and sublicense
such enhancements or derivative works thereof, in binary and source code form.
Example of a contributor license agreement
https://spdx.org/licenses/BSD-3-Clause-LBNL.html
Example of a standard project layout for a Python project
You can look up standard project
layouts for the programming
language you are using
Some details of the layout may
depend on the tools you are using
for other tasks such as code
distribution or continuous
integration
Example of setup.py for easier installation of your Python-based software
Another example: https://github.com/NREL/PV_ICE/blob/main/setup.py
Such files can help
users install the
correct dependencies
with the correct
versions to ensure
your software runs
smoothly
Level 3: e.g., Ongoing project
q All Level 1 items
q All Level 2 items
q Implement a release system. One option is to use Github
tags and releases. You can obtain a DOI for each release
via Zenodo:
q Link the Github repo to Zenodo
q Perform the release and tag it
q Update the README to include the DOI identifier
Zenodo provides in the “how to cite” section
q Set up continuous integration (CI) tools (examples include
Github actions to execute CircleCI, Travis CI, etc. against
pull requests)
q A code coverage tool (e.g. coveralls) can help
establish that tests cover the entire codebase and
publish test status (pass/fail, test coverage)
q Check for consistent code formatting; a format checker
(e.g., pylint or Python black) can be used to check the
formatting of pull requests and/or automatically reformat
code
q Add Documentation pages (e.g., HTML documentation).
Documents can be deployed at several places (e.g.,
Github pages, readthedocs). Documentation pages
should provide:
q Getting started. Provide simple instructions to
install the code and run a sample problem. Links
here to Tutorials.
q Examples / Tutorials. Links here to illustrations of
using the code.
q API reference. Links here to the documentation of
each public Class, function and/or method. Note
that this can typically be auto-generated.
q Release notes. Links here to logs of changes with
each tagged release.
q Upload to PyPI, conda, or other easy install code service.
q Consider submitting to a code-centric journal
publication such as Journal of Open Source Software
Release versions via Github, citable via Zenodo
Github release
https://github.com/NREL/PV_ICE
Citeable DOI via Zenodo
Continuous integration and code coverage
Add documentation pages
.rst file in Github repo is
compiled to HTML docs
https://pv-ice.readthedocs.io/en/latest/?badge=latest
Consider submitting paper to a code-centric journal
Reviews via Github repo
Releasing large data sets with code
• Data sets should be formally released into a separate archival repository
(project-specific data hub (e.g., DuraMat Data Hub), Figshare, Dryad, etc.).
• If there are smaller files that are needed for the code, for example for unit
tests, document and include these datasets in the repository, provided they
have been cleared for release and are not infringing copyright from other
sources or NDAs.
• Do not use links to local files on your computer!
• https://michal.karzynski.pl/blog/2019/05/26/python-project-maturity-
checklist/
• https://dbader.org/blog/write-a-great-readme-for-your-github-project
• https://www.patricksoftwareblog.com/software-development-checklist-for-
python-applications/
Some other potentially useful links
• Questions / discussion

More Related Content

Similar to Best practices for DuraMat software dissemination

Ubucon 2013, licensing and packaging OSS
Ubucon 2013, licensing and packaging OSSUbucon 2013, licensing and packaging OSS
Ubucon 2013, licensing and packaging OSSNuno Brito
 
Develop FOSS project using Google Code Hosting
Develop FOSS project using Google Code HostingDevelop FOSS project using Google Code Hosting
Develop FOSS project using Google Code HostingNarendra Sisodiya
 
How to implement continuous delivery with enterprise java middleware?
How to implement continuous delivery with enterprise java middleware?How to implement continuous delivery with enterprise java middleware?
How to implement continuous delivery with enterprise java middleware?Thoughtworks
 
Let's build Developer Portal with Backstage
Let's build Developer Portal with BackstageLet's build Developer Portal with Backstage
Let's build Developer Portal with BackstageOpsta
 
The path to an hybrid open source paradigm
The path to an hybrid open source paradigmThe path to an hybrid open source paradigm
The path to an hybrid open source paradigmJonathan Challener
 
Devops interview questions 1 www.bigclasses.com
Devops interview questions  1  www.bigclasses.comDevops interview questions  1  www.bigclasses.com
Devops interview questions 1 www.bigclasses.combigclasses.com
 
Introduction to Google App Engine with Python
Introduction to Google App Engine with PythonIntroduction to Google App Engine with Python
Introduction to Google App Engine with PythonBrian Lyttle
 
Open-Do - Initial concepts and idea
Open-Do - Initial concepts and ideaOpen-Do - Initial concepts and idea
Open-Do - Initial concepts and ideaAdaCore
 
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...Janusz Nowak
 
TYPO3 Flow 2.0 in the field - webtech Conference 2013
TYPO3 Flow 2.0 in the field - webtech Conference 2013TYPO3 Flow 2.0 in the field - webtech Conference 2013
TYPO3 Flow 2.0 in the field - webtech Conference 2013die.agilen GmbH
 
Intro to DevOps 4 undergraduates
Intro to DevOps 4 undergraduates Intro to DevOps 4 undergraduates
Intro to DevOps 4 undergraduates Liran Levy
 
AWS Study Group - Chapter 04 - Hybrid Cloud Architectures [Solution Architect...
AWS Study Group - Chapter 04 - Hybrid Cloud Architectures [Solution Architect...AWS Study Group - Chapter 04 - Hybrid Cloud Architectures [Solution Architect...
AWS Study Group - Chapter 04 - Hybrid Cloud Architectures [Solution Architect...QCloudMentor
 
Partner Development Guide for Kafka Connect
Partner Development Guide for Kafka ConnectPartner Development Guide for Kafka Connect
Partner Development Guide for Kafka Connectconfluent
 
Introduction to cloud-native application development: with Heroku and Spring ...
Introduction to cloud-native application development: with Heroku and Spring ...Introduction to cloud-native application development: with Heroku and Spring ...
Introduction to cloud-native application development: with Heroku and Spring ...Roberto Casadei
 
Azure DevOps for Developers
Azure DevOps for DevelopersAzure DevOps for Developers
Azure DevOps for DevelopersSarah Dutkiewicz
 
August OpenNTF Webinar - Git and GitHub Explained
August OpenNTF Webinar - Git and GitHub ExplainedAugust OpenNTF Webinar - Git and GitHub Explained
August OpenNTF Webinar - Git and GitHub ExplainedHoward Greenberg
 
Open Source Compliance Automation Capability Map
Open Source Compliance Automation Capability MapOpen Source Compliance Automation Capability Map
Open Source Compliance Automation Capability MapShane Coughlan
 
Integration Group - Robot Framework
Integration Group - Robot Framework Integration Group - Robot Framework
Integration Group - Robot Framework OpenDaylight
 

Similar to Best practices for DuraMat software dissemination (20)

Ubucon 2013, licensing and packaging OSS
Ubucon 2013, licensing and packaging OSSUbucon 2013, licensing and packaging OSS
Ubucon 2013, licensing and packaging OSS
 
Develop FOSS project using Google Code Hosting
Develop FOSS project using Google Code HostingDevelop FOSS project using Google Code Hosting
Develop FOSS project using Google Code Hosting
 
How to implement continuous delivery with enterprise java middleware?
How to implement continuous delivery with enterprise java middleware?How to implement continuous delivery with enterprise java middleware?
How to implement continuous delivery with enterprise java middleware?
 
Let's build Developer Portal with Backstage
Let's build Developer Portal with BackstageLet's build Developer Portal with Backstage
Let's build Developer Portal with Backstage
 
The path to an hybrid open source paradigm
The path to an hybrid open source paradigmThe path to an hybrid open source paradigm
The path to an hybrid open source paradigm
 
Devops interview questions 1 www.bigclasses.com
Devops interview questions  1  www.bigclasses.comDevops interview questions  1  www.bigclasses.com
Devops interview questions 1 www.bigclasses.com
 
Introduction to Google App Engine with Python
Introduction to Google App Engine with PythonIntroduction to Google App Engine with Python
Introduction to Google App Engine with Python
 
Open-Do - Initial concepts and idea
Open-Do - Initial concepts and ideaOpen-Do - Initial concepts and idea
Open-Do - Initial concepts and idea
 
DevOps-Roadmap
DevOps-RoadmapDevOps-Roadmap
DevOps-Roadmap
 
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
Continues Integration and Continuous Delivery with Azure DevOps - Deploy Anyt...
 
TYPO3 Flow 2.0 in the field - webtech Conference 2013
TYPO3 Flow 2.0 in the field - webtech Conference 2013TYPO3 Flow 2.0 in the field - webtech Conference 2013
TYPO3 Flow 2.0 in the field - webtech Conference 2013
 
Intro to DevOps 4 undergraduates
Intro to DevOps 4 undergraduates Intro to DevOps 4 undergraduates
Intro to DevOps 4 undergraduates
 
AWS Study Group - Chapter 04 - Hybrid Cloud Architectures [Solution Architect...
AWS Study Group - Chapter 04 - Hybrid Cloud Architectures [Solution Architect...AWS Study Group - Chapter 04 - Hybrid Cloud Architectures [Solution Architect...
AWS Study Group - Chapter 04 - Hybrid Cloud Architectures [Solution Architect...
 
Partner Development Guide for Kafka Connect
Partner Development Guide for Kafka ConnectPartner Development Guide for Kafka Connect
Partner Development Guide for Kafka Connect
 
Introduction to cloud-native application development: with Heroku and Spring ...
Introduction to cloud-native application development: with Heroku and Spring ...Introduction to cloud-native application development: with Heroku and Spring ...
Introduction to cloud-native application development: with Heroku and Spring ...
 
Azure DevOps for Developers
Azure DevOps for DevelopersAzure DevOps for Developers
Azure DevOps for Developers
 
August OpenNTF Webinar - Git and GitHub Explained
August OpenNTF Webinar - Git and GitHub ExplainedAugust OpenNTF Webinar - Git and GitHub Explained
August OpenNTF Webinar - Git and GitHub Explained
 
Parameters for c# developers.pdf
Parameters for c# developers.pdfParameters for c# developers.pdf
Parameters for c# developers.pdf
 
Open Source Compliance Automation Capability Map
Open Source Compliance Automation Capability MapOpen Source Compliance Automation Capability Map
Open Source Compliance Automation Capability Map
 
Integration Group - Robot Framework
Integration Group - Robot Framework Integration Group - Robot Framework
Integration Group - Robot Framework
 

More from Anubhav Jain

Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...Anubhav Jain
 
Applications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and DesignApplications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and DesignAnubhav Jain
 
An AI-driven closed-loop facility for materials synthesis
An AI-driven closed-loop facility for materials synthesisAn AI-driven closed-loop facility for materials synthesis
An AI-driven closed-loop facility for materials synthesisAnubhav Jain
 
Available methods for predicting materials synthesizability using computation...
Available methods for predicting materials synthesizability using computation...Available methods for predicting materials synthesizability using computation...
Available methods for predicting materials synthesizability using computation...Anubhav Jain
 
Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...Anubhav Jain
 
Natural Language Processing for Data Extraction and Synthesizability Predicti...
Natural Language Processing for Data Extraction and Synthesizability Predicti...Natural Language Processing for Data Extraction and Synthesizability Predicti...
Natural Language Processing for Data Extraction and Synthesizability Predicti...Anubhav Jain
 
Machine Learning for Catalyst Design
Machine Learning for Catalyst DesignMachine Learning for Catalyst Design
Machine Learning for Catalyst DesignAnubhav Jain
 
Discovering new functional materials for clean energy and beyond using high-t...
Discovering new functional materials for clean energy and beyond using high-t...Discovering new functional materials for clean energy and beyond using high-t...
Discovering new functional materials for clean energy and beyond using high-t...Anubhav Jain
 
Natural language processing for extracting synthesis recipes and applications...
Natural language processing for extracting synthesis recipes and applications...Natural language processing for extracting synthesis recipes and applications...
Natural language processing for extracting synthesis recipes and applications...Anubhav Jain
 
Accelerating New Materials Design with Supercomputing and Machine Learning
Accelerating New Materials Design with Supercomputing and Machine LearningAccelerating New Materials Design with Supercomputing and Machine Learning
Accelerating New Materials Design with Supercomputing and Machine LearningAnubhav Jain
 
DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …Anubhav Jain
 
The Materials Project
The Materials ProjectThe Materials Project
The Materials ProjectAnubhav Jain
 
Evaluating Chemical Composition and Crystal Structure Representations using t...
Evaluating Chemical Composition and Crystal Structure Representations using t...Evaluating Chemical Composition and Crystal Structure Representations using t...
Evaluating Chemical Composition and Crystal Structure Representations using t...Anubhav Jain
 
Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...Anubhav Jain
 
Discovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials ProjectDiscovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials ProjectAnubhav Jain
 
The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...Anubhav Jain
 
The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...Anubhav Jain
 
Machine Learning Platform for Catalyst Design
Machine Learning Platform for Catalyst DesignMachine Learning Platform for Catalyst Design
Machine Learning Platform for Catalyst DesignAnubhav Jain
 
Applications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignApplications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignAnubhav Jain
 
Assessing Factors Underpinning PV Degradation through Data Analysis
Assessing Factors Underpinning PV Degradation through Data AnalysisAssessing Factors Underpinning PV Degradation through Data Analysis
Assessing Factors Underpinning PV Degradation through Data AnalysisAnubhav Jain
 

More from Anubhav Jain (20)

Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...Discovering advanced materials for energy applications: theory, high-throughp...
Discovering advanced materials for energy applications: theory, high-throughp...
 
Applications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and DesignApplications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and Design
 
An AI-driven closed-loop facility for materials synthesis
An AI-driven closed-loop facility for materials synthesisAn AI-driven closed-loop facility for materials synthesis
An AI-driven closed-loop facility for materials synthesis
 
Available methods for predicting materials synthesizability using computation...
Available methods for predicting materials synthesizability using computation...Available methods for predicting materials synthesizability using computation...
Available methods for predicting materials synthesizability using computation...
 
Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...Efficient methods for accurately calculating thermoelectric properties – elec...
Efficient methods for accurately calculating thermoelectric properties – elec...
 
Natural Language Processing for Data Extraction and Synthesizability Predicti...
Natural Language Processing for Data Extraction and Synthesizability Predicti...Natural Language Processing for Data Extraction and Synthesizability Predicti...
Natural Language Processing for Data Extraction and Synthesizability Predicti...
 
Machine Learning for Catalyst Design
Machine Learning for Catalyst DesignMachine Learning for Catalyst Design
Machine Learning for Catalyst Design
 
Discovering new functional materials for clean energy and beyond using high-t...
Discovering new functional materials for clean energy and beyond using high-t...Discovering new functional materials for clean energy and beyond using high-t...
Discovering new functional materials for clean energy and beyond using high-t...
 
Natural language processing for extracting synthesis recipes and applications...
Natural language processing for extracting synthesis recipes and applications...Natural language processing for extracting synthesis recipes and applications...
Natural language processing for extracting synthesis recipes and applications...
 
Accelerating New Materials Design with Supercomputing and Machine Learning
Accelerating New Materials Design with Supercomputing and Machine LearningAccelerating New Materials Design with Supercomputing and Machine Learning
Accelerating New Materials Design with Supercomputing and Machine Learning
 
DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …
 
The Materials Project
The Materials ProjectThe Materials Project
The Materials Project
 
Evaluating Chemical Composition and Crystal Structure Representations using t...
Evaluating Chemical Composition and Crystal Structure Representations using t...Evaluating Chemical Composition and Crystal Structure Representations using t...
Evaluating Chemical Composition and Crystal Structure Representations using t...
 
Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...Perspectives on chemical composition and crystal structure representations fr...
Perspectives on chemical composition and crystal structure representations fr...
 
Discovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials ProjectDiscovering and Exploring New Materials through the Materials Project
Discovering and Exploring New Materials through the Materials Project
 
The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...The Materials Project: Applications to energy storage and functional materia...
The Materials Project: Applications to energy storage and functional materia...
 
The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...
 
Machine Learning Platform for Catalyst Design
Machine Learning Platform for Catalyst DesignMachine Learning Platform for Catalyst Design
Machine Learning Platform for Catalyst Design
 
Applications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials DesignApplications of Natural Language Processing to Materials Design
Applications of Natural Language Processing to Materials Design
 
Assessing Factors Underpinning PV Degradation through Data Analysis
Assessing Factors Underpinning PV Degradation through Data AnalysisAssessing Factors Underpinning PV Degradation through Data Analysis
Assessing Factors Underpinning PV Degradation through Data Analysis
 

Recently uploaded

Genomic DNA And Complementary DNA Libraries construction.
Genomic DNA And Complementary DNA Libraries construction.Genomic DNA And Complementary DNA Libraries construction.
Genomic DNA And Complementary DNA Libraries construction.k64182334
 
Ahmedabad Call Girls Service 9537192988 can satisfy every one of your dreams
Ahmedabad Call Girls Service 9537192988 can satisfy every one of your dreamsAhmedabad Call Girls Service 9537192988 can satisfy every one of your dreams
Ahmedabad Call Girls Service 9537192988 can satisfy every one of your dreamsoolala9823
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzohaibmir069
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 sciencefloriejanemacaya1
 
Recombination DNA Technology (Microinjection)
Recombination DNA Technology (Microinjection)Recombination DNA Technology (Microinjection)
Recombination DNA Technology (Microinjection)Jshifa
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)DHURKADEVIBASKAR
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxAleenaTreesaSaji
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 

Recently uploaded (20)

Genomic DNA And Complementary DNA Libraries construction.
Genomic DNA And Complementary DNA Libraries construction.Genomic DNA And Complementary DNA Libraries construction.
Genomic DNA And Complementary DNA Libraries construction.
 
Ahmedabad Call Girls Service 9537192988 can satisfy every one of your dreams
Ahmedabad Call Girls Service 9537192988 can satisfy every one of your dreamsAhmedabad Call Girls Service 9537192988 can satisfy every one of your dreams
Ahmedabad Call Girls Service 9537192988 can satisfy every one of your dreams
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistan
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 science
 
Recombination DNA Technology (Microinjection)
Recombination DNA Technology (Microinjection)Recombination DNA Technology (Microinjection)
Recombination DNA Technology (Microinjection)
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptx
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 

Best practices for DuraMat software dissemination

  • 1. Best practices for DuraMat software dissemination Anubhav Jain, Silvana Ovaitt, Cliff Hansen, Robert White April 17, 2023 slides (already) posted to https://hackingmaterials.lbl.gov
  • 2. DuraMat funds many projects that produce software products – but currently without many standards or guidance Project Link DuraMat data hub https://datahub.duramat.org PV Analytics https://github.com/pvlib/pvanalytics PV Ops https://github.com/sandialabs/pvOps VocMax https://github.com/toddkarin/vocmax PV Climate Zones https://github.com/toddkarin/pvcz PVTools https://pvtools.lbl.gov/string-length-calculator PV ARC thickness estimator https://github.com/DuraMAT/pvarc PV-terms https://github.com/DuraMAT/pv-terms Comparative LCOE calculator www.github.com/NREL/PVLCOE PV-Pro SDM parameter estimation https://github.com/DuraMAT/pvpro WhatsCracking https://datahub.duramat.org/dataset/whatscracking-application
  • 3. • To help you share software products effectively, including: – Sharing best practices in software dissemination – Save time and effort in the dissemination process – Establishing some consistency across projects – Getting you (and DuraMat) more credit for software products • This is intended to be a discussion, so comments/questions/improvements are welcome – Software best practices often move very quickly as new tools are introduced Purpose of this discussion
  • 4. The level of dissemination should depend on the purpose of the software Level 1 Level 2 Level 3 Code maturity and novelty Code is mostly data analysis/plots, or using other already published packages. The code is largely intended to demonstrate usage or clarify an analysis. Novelty is low, implementing published ideas. Code is structured into functions which are intended to serve as a general toolset for other analyses. Code may contain new algorithms that may require a disclosure. Code is rationally and thoughtfully organized into packages, modules, classes and functions. It may serve as a framework for downstream analyses. Code may contain new algorithms that may require a disclosure. Intended Use and Lifetime Typically, used to support and document published analyses for enhanced reproducibility – e.g., something akin to supporting information for a journal publication. Typically, may serve as documentation for the innovations of an entire project, e.g., for multiple publications. However, the project may no longer be actively maintained after project end. The project is intended to be used and maintained long-term by the project team and a community of users; project lives on even if/when initial developers exit the project
  • 5. Level 1: e.g., one-off” scripts that support a plot, table, etc. in another document q Follow Laboratory-specific guidelines for approval to release your code. q Inline code documentation. Each public function and class definition should have its own documentation (e.g., docstring). Use a consistent format (an example is provided later in this document). A docstring should include: q Function purpose q Input Parameters q Return parameters q References (if any) q Add README q How to install/run the code as well as associated tests q How to cite it (i.e. OSTI record, publication, or other) q Does the README clearly describe the code’s purpose and its organization q Add LICENSE that conforms to lab and funding guidelines and includes copyright-specific wording (see example at end) will discuss bullet points with arrows in subsequent slides
  • 6. Inline code documentation example (Python) Some notes: • The formatting of the docstring can depend on if you are autoconverting the docstrings to HTML documentation • Common formatting examples include reST (restructured text), Google formatting, epyDoc, etc. • You can add type hinting to further help in code readability as well as the ability to use static type checking tools
  • 9. • Talk to your lab’s IP / IT departments for guidance • BSD/MIT licenses are examples of very “open” licenses that allow others to do what they’d like with the software – BSD typically gives some more protection against others using your name to promote their product, e.g. “our commercial product uses LBL- approved software technology for its analysis” or “uses the same algorithms developed by the brilliant scientist <your_name_here>” • Be careful about choosing licenses that require all downstream code to also use the same license, e.g. GPL/Apache – e.g., if you leave DuraMat and work for a company, you may no longer be able to use your own code as companies typically avoid any GPL code Choosing a license – some guidance
  • 10. Level 2: e.g., Repository used for lifetime of a project (software itself is a work product) q All Level 1 items q In addition to lab-specific guidelines, ensure that DOE requirements are being met. For example, this likely includes: q Software Record (gets recorded in OSTI.gov and helps in reporting purposes / credit) q Lab-specific approval to release code q Set up a public facing Github repository. This could be hosted by the project organization, by your institution, or by your research lab. Examples include: q github.com/DURAMAT q github.com/NREL q Additional README components q Screenshot or visual aid of the project q Current status of the project (testing use, production use, actively maintained, etc.) q Funding information and institutional branding (logo, funding acknowledgement text) q Add Contributor license agreement (CLA) for contributors q Include any examples of use, Jupyter notebooks or scripts used for scientific publications that you want to make available, and data that can also be made available for testing/demonstration q Use a standard layout for the repository (an example of a standard Python layout is provided at the end of this document) q Add a consistent versioning scheme. Examples include semantic versioning (v0.0.1) and date-based versioning (v2023.01.25); tools like versioneer may help. q Ensure your software is easy to install locally, including any necessary dependencies. For example, Python projects may include files such as setup.py or requirements.txt. q Report your software to your funding program so it can be included in accomplishments
  • 11. The following text is at the bottom of the LBL BSD-3 license: You are under no obligation whatsoever to provide any bug fixes, patches, or upgrades to the features, functionality or performance of the source code ("Enhancements") to anyone; however, if you choose to make your Enhancements available either publicly, or directly to Lawrence Berkeley National Laboratory or its contributors, without imposing a separate written license agreement for such Enhancements, then you hereby grant the following license: a non-exclusive, royalty-free perpetual license to install, use, modify, prepare derivative works, incorporate into other computer software, distribute, and sublicense such enhancements or derivative works thereof, in binary and source code form. Example of a contributor license agreement https://spdx.org/licenses/BSD-3-Clause-LBNL.html
  • 12. Example of a standard project layout for a Python project You can look up standard project layouts for the programming language you are using Some details of the layout may depend on the tools you are using for other tasks such as code distribution or continuous integration
  • 13. Example of setup.py for easier installation of your Python-based software Another example: https://github.com/NREL/PV_ICE/blob/main/setup.py Such files can help users install the correct dependencies with the correct versions to ensure your software runs smoothly
  • 14. Level 3: e.g., Ongoing project q All Level 1 items q All Level 2 items q Implement a release system. One option is to use Github tags and releases. You can obtain a DOI for each release via Zenodo: q Link the Github repo to Zenodo q Perform the release and tag it q Update the README to include the DOI identifier Zenodo provides in the “how to cite” section q Set up continuous integration (CI) tools (examples include Github actions to execute CircleCI, Travis CI, etc. against pull requests) q A code coverage tool (e.g. coveralls) can help establish that tests cover the entire codebase and publish test status (pass/fail, test coverage) q Check for consistent code formatting; a format checker (e.g., pylint or Python black) can be used to check the formatting of pull requests and/or automatically reformat code q Add Documentation pages (e.g., HTML documentation). Documents can be deployed at several places (e.g., Github pages, readthedocs). Documentation pages should provide: q Getting started. Provide simple instructions to install the code and run a sample problem. Links here to Tutorials. q Examples / Tutorials. Links here to illustrations of using the code. q API reference. Links here to the documentation of each public Class, function and/or method. Note that this can typically be auto-generated. q Release notes. Links here to logs of changes with each tagged release. q Upload to PyPI, conda, or other easy install code service. q Consider submitting to a code-centric journal publication such as Journal of Open Source Software
  • 15. Release versions via Github, citable via Zenodo Github release https://github.com/NREL/PV_ICE Citeable DOI via Zenodo
  • 16. Continuous integration and code coverage
  • 17. Add documentation pages .rst file in Github repo is compiled to HTML docs https://pv-ice.readthedocs.io/en/latest/?badge=latest
  • 18. Consider submitting paper to a code-centric journal Reviews via Github repo
  • 19. Releasing large data sets with code • Data sets should be formally released into a separate archival repository (project-specific data hub (e.g., DuraMat Data Hub), Figshare, Dryad, etc.). • If there are smaller files that are needed for the code, for example for unit tests, document and include these datasets in the repository, provided they have been cleared for release and are not infringing copyright from other sources or NDAs. • Do not use links to local files on your computer!
  • 20. • https://michal.karzynski.pl/blog/2019/05/26/python-project-maturity- checklist/ • https://dbader.org/blog/write-a-great-readme-for-your-github-project • https://www.patricksoftwareblog.com/software-development-checklist-for- python-applications/ Some other potentially useful links
  • 21. • Questions / discussion