Best practices for DuraMat software dissemination
1. Best practices for DuraMat software dissemination
Anubhav Jain, Baojie Li, Silvana Ovaitt,
Cliff Hansen, Robert White
Sept 26, 2023
slides (already) posted to https://hackingmaterials.lbl.gov
2. Purpose of this discussion
• To help you develop and share software products effectively:
  – Best practices in software dissemination
  – Saving time and effort in the development and dissemination process
  – Establishing some consistency across DuraMat projects
  – Getting you (and DuraMat) more credit for software products
• This is intended to be a discussion, so comments/questions/improvements are welcome
  – Software best practices often move very quickly as new tools are introduced
3. DuraMat funds many projects that produce software products, but currently without many standards or guidance
Project – Link
DuraMat data hub https://datahub.duramat.org
PV Analytics https://github.com/pvlib/pvanalytics
PV Ops https://github.com/sandialabs/pvOps
VocMax https://github.com/toddkarin/vocmax
PV Climate Zones https://github.com/toddkarin/pvcz
PVTools https://pvtools.lbl.gov/string-length-calculator
PV ARC thickness estimator https://github.com/DuraMAT/pvarc
PV-terms https://github.com/DuraMAT/pv-terms
Comparative LCOE calculator www.github.com/NREL/PVLCOE
PV-Pro SDM parameter estimation https://github.com/DuraMAT/pvpro
WhatsCracking https://datahub.duramat.org/dataset/whatscracking-application
+ new projects from recent calls
- Kempe/Ovaitt (Lifetime predictor)
- E. Young (Wind loading)
- Braid (cell crack models)
- Rahman (SIERRA/COMSOL convertor)
5. An online version of this presentation
• We have compiled an online resource for this presentation that you can skim through: https://github.com/DuraMAT/software_guide
• There are a few things new in this presentation that are not yet in the guide
• If you have suggestions, submit a PR to the guide!
6. The level of dissemination should depend on the purpose of the software

Level 1
- Code maturity and novelty: Code is mostly data analysis/plots, or uses other already published packages. The code is largely intended to demonstrate usage or clarify an analysis. Novelty is low, implementing published ideas.
- Intended use and lifetime: Typically used to support and document published analyses for enhanced reproducibility, e.g., something akin to supporting information for a journal publication.

Level 2
- Code maturity and novelty: Code is structured into functions intended to serve as a general toolset for other analyses. Code may contain new algorithms that may require a disclosure.
- Intended use and lifetime: Typically serves as documentation for the innovations of an entire project, e.g., for multiple publications. However, the project may no longer be actively maintained after project end.

Level 3
- Code maturity and novelty: Code is rationally and thoughtfully organized into packages, modules, classes, and functions. It may serve as a framework for downstream analyses. Code may contain new algorithms that may require a disclosure.
- Intended use and lifetime: The project is intended to be used and maintained long-term by the project team and a community of users; the project lives on even if/when the initial developers exit the project.
7. Level 1: e.g., "one-off" scripts that support a plot, table, etc. in another document
- Follow laboratory-specific guidelines for approval to release your code.
- Inline code documentation. Each public function and class definition should have its own documentation (e.g., docstring). Use a consistent format (an example is provided later in this document). A docstring should include:
  - Function purpose
  - Input parameters
  - Return parameters
  - References (if any)
- Add a README covering:
  - How to install/run the code as well as associated tests
  - How to cite it (i.e., OSTI record, publication, or other)
  - A clear description of the code's purpose and its organization
- Add a LICENSE that conforms to lab and funding guidelines and includes copyright-specific wording (see example at end)
(Selected items above are discussed in more detail on subsequent slides.)
8. Inline code documentation example (Python)
Some notes:
- The formatting of the docstring can depend on whether you are auto-converting the docstrings to HTML documentation
- Common formatting examples include reST (reStructuredText), Google style, epydoc, etc.
- You can add type hinting to further improve code readability and to enable static type-checking tools
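The code screenshots from this slide did not survive the transcript. As an illustrative sketch, a reST-style docstring with type hints could look like the following (the `normalize_power` function and its parameters are hypothetical, made up for this example):

```python
def normalize_power(power_w: list[float], capacity_w: float) -> list[float]:
    """Normalize measured PV power readings by nameplate capacity.

    Parameters
    ----------
    power_w : list of float
        Measured AC power values, in watts.
    capacity_w : float
        Nameplate system capacity, in watts. Must be positive.

    Returns
    -------
    list of float
        Power values expressed as a fraction of nameplate capacity.

    References
    ----------
    .. [1] Cite the relevant publication here, if any.
    """
    if capacity_w <= 0:
        raise ValueError("capacity_w must be positive")
    return [p / capacity_w for p in power_w]
```

The same information can equally be written in Google or NumPy docstring style; what matters is picking one format and using it consistently, especially if the docstrings will be auto-converted to HTML documentation.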
11. Choosing a license: some guidance
• Talk to your lab's IP/IT departments for guidance
• BSD/MIT licenses are examples of very "open" licenses that allow others to do what they'd like with the software
  – BSD typically gives somewhat more protection against others using your name to promote their product, e.g., it prevents a user claiming "our commercial product uses LBL-approved software technology for its analysis" or "uses the same algorithms developed by the brilliant scientist <your_name_here>"
• Be careful about copyleft licenses such as the GPL, which require all downstream code to be released under the same license
  – e.g., if you leave DuraMat and work for a company, you may no longer be able to use your own code, as companies typically avoid any GPL code
  – Some labs may actually discourage or ban certain licenses because they contain patent-granting language (e.g., Apache 2.0 and GPL 3.0 for LBL)
  – If you really insist on these licenses, we suggest talking to the DuraMat program (for impact on industry adoption) as well as your lab's IPO
12. Level 2: e.g., repository used for the lifetime of a project (the software itself is a work product)
- All Level 1 items
- In addition to lab-specific guidelines, ensure that DOE requirements are being met. For example, this likely includes:
  - Software Record (gets recorded in OSTI.gov and helps in reporting purposes/credit)
  - Lab-specific approval to release code
- Set up a public-facing GitHub repository. This could be hosted by the project organization, by your institution, or by your research lab. Examples include:
  - github.com/DuraMAT
  - github.com/NREL
- Additional README components:
  - Screenshot or visual aid of the project
  - Current status of the project (testing use, production use, actively maintained, etc.)
  - Funding information and institutional branding (logo, funding acknowledgement text)
- Add a Contributor License Agreement (CLA) for contributors
- Include any examples of use, Jupyter notebooks or scripts used for scientific publications that you want to make available, and data that can also be made available for testing/demonstration
- Use a standard layout for the repository (an example of a standard Python layout is provided at the end of this document)
- Adopt a consistent versioning scheme. Examples include semantic versioning (v0.0.1) and date-based versioning (v2023.01.25); tools like versioneer may help.
- Ensure your software is easy to install locally, including any necessary dependencies. For example, Python projects may include files such as setup.py or requirements.txt.
- Report your software to your funding program so it can be included in accomplishments
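As a sketch of the "easy to install" item, a minimal pyproject.toml might look like the following (the package name, version, and dependencies are placeholders, not an actual DuraMat project):

```toml
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "pv-example-toolkit"        # placeholder name
version = "2023.9.26"              # date-based versioning, as suggested above
description = "Example analysis toolset (placeholder metadata)"
readme = "README.md"
license = {file = "LICENSE"}
requires-python = ">=3.9"
dependencies = [
    "numpy",
    "pandas",
]
```

With a file like this in the repository root, `pip install .` works for anyone who clones the repo.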
13. Example of a contributor license agreement
The following text is at the bottom of the LBL BSD-3 license (https://spdx.org/licenses/BSD-3-Clause-LBNL.html):

"You are under no obligation whatsoever to provide any bug fixes, patches, or upgrades to the features, functionality or performance of the source code ("Enhancements") to anyone; however, if you choose to make your Enhancements available either publicly, or directly to Lawrence Berkeley National Laboratory or its contributors, without imposing a separate written license agreement for such Enhancements, then you hereby grant the following license: a non-exclusive, royalty-free perpetual license to install, use, modify, prepare derivative works, incorporate into other computer software, distribute, and sublicense such enhancements or derivative works thereof, in binary and source code form."
14. Example of a standard project layout for a Python project
You can look up standard project layouts for the programming language you are using. Some details of the layout may depend on the tools you are using for other tasks, such as code distribution or continuous integration.
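The layout diagram from this slide did not survive the transcript; a common Python layout (all names here are placeholders) looks roughly like:

```
pv_example_toolkit/              # repository root (hypothetical project)
├── pyproject.toml               # build metadata and dependencies
├── README.md
├── LICENSE
├── docs/                        # documentation sources
├── examples/                    # notebooks and demo scripts
├── pv_example_toolkit/          # the importable package
│   ├── __init__.py
│   └── analysis.py
└── tests/
    └── test_analysis.py
```

Tools like cookiecutter (next slides) can generate a layout like this for you.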
15. You can also use the cookiecutter package to help
• The cookiecutter package will set up different package structures depending on your usage (e.g., a Python Flask web site or an ML project)
• The cruft package can help you keep things up to date as the template changes
16. Two commands to get started w/cookiecutter
pip install cookiecutter
cookiecutter https://github.com/audreyfeldroy/cookiecutter-pypackage.git
17. Level 3: e.g., ongoing project
- All Level 1 items
- All Level 2 items
- Implement a release system. One option is to use GitHub tags and releases. You can obtain a DOI for each release via Zenodo:
  - Link the GitHub repo to Zenodo
  - Perform the release and tag it
  - Update the README to include the DOI identifier Zenodo provides in the "how to cite" section
- Set up continuous integration (CI) tools that run against pull requests (examples include GitHub Actions, CircleCI, Travis CI, etc.)
  - A code coverage tool (e.g., coveralls) can help establish that tests cover the entire codebase and publish test status (pass/fail, test coverage)
- Check for consistent code formatting; a format checker (e.g., pylint or black) can be used to check the formatting of pull requests and/or automatically reformat code
- Add documentation pages (e.g., HTML documentation). Documentation can be deployed in several places (e.g., GitHub Pages, Read the Docs). Documentation pages should provide:
  - Getting started: simple instructions to install the code and run a sample problem, with links to tutorials
  - Examples/tutorials: links to illustrations of using the code
  - API reference: links to the documentation of each public class, function, and/or method; note that this can typically be auto-generated
  - Release notes: links to logs of changes with each tagged release
- Upload to PyPI, conda, or another easy-install code service
- Consider submitting to a code-centric journal publication such as the Journal of Open Source Software
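As a sketch of the CI item above, a minimal GitHub Actions workflow could look like the following (file path and the `test` extra are assumptions, not taken from an actual DuraMat repo):

```yaml
# .github/workflows/ci.yml -- hypothetical minimal workflow
name: CI
on:
  pull_request:
  push:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      # assumes a "test" extra (e.g., pytest, pytest-cov) is defined in pyproject.toml
      - run: pip install .[test]
      - run: pytest --cov
```

Services like coveralls can then consume the coverage report produced by `pytest --cov` and publish pass/fail and coverage status on each pull request.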
18. Release versions via GitHub, citable via Zenodo
- GitHub release: https://github.com/NREL/PV_ICE
- Citable DOI via Zenodo
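One way to surface the "how to cite" information alongside the Zenodo DOI is a CITATION.cff file in the repository root, which GitHub renders as a "Cite this repository" button. All field values below are placeholders:

```yaml
# CITATION.cff -- placeholder values throughout
cff-version: 1.2.0
title: pv-example-toolkit
message: "If you use this software, please cite it using these metadata."
authors:
  - family-names: Doe
    given-names: Jane
version: 2023.9.26
doi: 10.5281/zenodo.0000000
date-released: 2023-09-26
```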
20. Keeping your code clean: pre-commit hooks
• Pre-commit hooks run a series of checks and automated fixes against your code before you commit that code to git
• For example, pre-commit hooks can:
  – Auto-fix indentation, trailing spaces, line endings, line length, etc. (e.g., via a tool like black). This essentially frees the project from spending energy on code-formatting issues
  – Warn about issues like unused imports, undefined variables, a bare "except" clause, too-high code complexity, etc. (via a tool like flake8)
• If set up early on, this keeps your code "on track" toward clean code
  – It can also be installed and run later, but then you may get a long list of previous code issues to fix
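A sketch of a pre-commit configuration implementing the checks above (the pinned `rev` versions are illustrative; pin whatever is current for your project):

```yaml
# .pre-commit-config.yaml -- install with `pre-commit install`
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace     # fix trailing spaces
      - id: end-of-file-fixer       # fix line endings at EOF
  - repo: https://github.com/psf/black
    rev: 23.9.1
    hooks:
      - id: black                   # auto-reformat code
  - repo: https://github.com/PyCQA/flake8
    rev: 6.1.0
    hooks:
      - id: flake8                  # warn on unused imports, bare except, etc.
```

After `pre-commit install`, these checks run automatically on every `git commit`.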
25. Consider submitting a paper to a code-centric journal
- Reviews happen via the GitHub repo
- The length of a JOSS paper is 250-1000 words, i.e., the entire paper is like a couple of abstracts
27. Auto-checking with Scientific Python's repo-review
• You can run a check on your repo using Scientific Python's repo-review tool: https://learn.scientific-python.org/development/guides/repo-review/
• The web version didn't work for me, but the command-line version did
• Example shown for pvlib
28. Peer-checking with pyOpenSci
• I haven't done this before personally, but it may be a good exercise for larger libraries like pvlib
• Create an issue here; it will guide you through the process: https://github.com/pyOpenSci/software-submission/issues/new/choose
• If you are nervous or skeptical, one of the options is a "presubmission inquiry"
29. Releasing large data sets with code
• Data sets should be formally released into a separate archival repository (a project-specific data hub (e.g., the DuraMat Data Hub), Figshare, Dryad, etc.)
• Include in the code repository only the smaller files that are needed by the code, for example for unit tests or examples, provided they have been cleared for release and do not infringe copyright from other sources or NDAs
• Remember not to use links to local files on your computer!
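One way to honor that last point for the small bundled files is to resolve their paths relative to the package itself rather than hard-coding a path from the developer's machine. A sketch (the `data` directory and file names are hypothetical):

```python
from pathlib import Path

# Resolve bundled sample files relative to this module, not to the
# developer's machine, so the code works for anyone who clones the repo.
DATA_DIR = Path(__file__).resolve().parent / "data"


def sample_data_path(name: str) -> Path:
    """Return the path to a small bundled data file by name."""
    path = DATA_DIR / name
    if not path.exists():
        raise FileNotFoundError(f"bundled data file not found: {path}")
    return path
```

Large data sets, by contrast, should be fetched from (or cited at) their archival repository rather than shipped in the code repo.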