SlideShare a Scribd company logo
1 of 49
Download to read offline
National Center for Supercomputing Applications
University of Illinois at Urbana–Champaign
Citation and Reproducibility in Software
Daniel S. Katz
Assistant Director for Scientific Software & Applications, NCSA
Research Associate Professor, CS
Research Associate Professor, ECE
Research Associate Professor, iSchool
dskatz@illinois.edu, d.katz@ieee.org, @danielskatz
Software as infrastructure
Science
Software	
Computing	
Infrastructure
• Software (including services)
essential for the bulk of science
- About half the papers in recent issues of Science were
software-intensive projects
- Research becoming dependent upon advances in software
• Wide range of software types: system, applications, modeling,
gateways, analysis, algorithms, middleware, libraries
• Software is not a one-time effort,
it must be sustained
• Development, production, and maintenance
are people intensive
• Software life-times are long vs hardware
• Software has under-appreciated value
Computational science research
Create
Hypothesis
Acquire
Resources (e.g.,
Funding,
Software, Data)
Perform
Research (Build
Software &
Data)
Publish
Results (e.g.,
Paper, Book,
Software, Data)
Gain
Recognition
Knowledge
Data science research
Create
Hypothesis
Acquire
Resources (e.g.,
Funding,
Software, Data)
Perform
Research (Build
Software &
Data)
Publish
Results (e.g.,
Paper, Book,
Software, Data)
Gain
Recognition
Acquire
Resources
(Data)
Purposes of software in research
Create
Hypothesis
Acquire
Resources (e.g.,
Funding,
Software, Data)
Perform
Research (Build
Software &
Data)
Publish
Results (e.g.,
Paper, Book,
Software, Data)
Gain
Recognition
Infrastructure
Research
National Center for Supercomputing Applications
University of Illinois at Urbana–Champaign
Software Citation
Software Reproducibility
Credit is a problem in Academia
http://www.phdcomics.com/comics/archive.php?comicid=562
Another Problem
Credit: Amy Brand, MIT
Software Citation Motivation
• Scientific research is becoming:
• More open – scientists want to collaborate; want/need to share
• More digital – outputs such as software and data; easier to share
• Significant time spent developing software & data
• Efforts not recognized or rewarded
• Citations for papers systematically collected, metrics
built
• But not for software & data
• Hypothesis:
Better measurement of contributions (citations, impact, metrics)
—> Rewards (incentives)
—> Career paths, willingness to join communities
—> More sustainable software
Incentives, citation/credit models, and
metrics (1)
1. Downloads
• From web/repository logs/stats
2. Installation/Build
• Add “phone home” mechanism at build level
• e.g., add ‘curl URL’ in makefile
3. Use/Run
• Track centrally; add “phone home” mechanism at run level
• Per app, e.g. add ‘curl URL’ in code, send local log to server
• For a set of apps, e.g., sempervirens project, Debian
popularity contest
• Track locally; ask user to report
• e.g., duecredit project, or via frameworks like Galaxy
Incentives, citation/credit models, and
metrics (2)
4. Impact
• Project CRediT (and Mozilla contributorship badges)
• Record contributorship roles
• Force 11 Software Citation Working Group (and merged WSSSPE software
credit WG)
• Define software citation principles
• Codemeta
• Minimal metadata schemas for science software
• Lots of research projects (including my own research on transitive credit -
http://doi.org/10.5334/jors.by)
• Altmetrics – not citations, but other structured measures of discussion (tweets,
blogs, etc.)
• ImpactStory
• Measure research impact: reads, citations, tweets, etc.
• Depsy (roughly ImpactStory specifically for software)
• Measure software impact: downloads; software reuse (if one package is
reused in another package), literature mentions (citations of a software
package), and social mentions (tweets, etc.)
How to better measure software
contributions
• Citation system was created for papers/books
• We need to either/both
1. Jam software into current citation system
2. Rework citation system
• Focus on 1 as possible; 2 is very hard.
• Challenge: not just how to identify software in a paper
• How to identify software used within research process
Software citation today
• Software and other digital resources currently appear in
publications in very inconsistent ways
• Howison: random sample of 90 articles in the biology
literature -> 7 different ways that software was
mentioned
• Studies on data and facility citation -> similar results
J. Howison and J. Bullard. Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature. Journal of the
Association for Information Science and Technology, 2015. In press. http://dx.doi.org/10.1002/asi.23538.
Software citation principles: People & Process
• FORCE11 Software Citation group started July 2015
• WSSSPE3 Credit & Citation working group joined September 2015
• ~55 members (researchers, developers, publishers, repositories, librarians)
• Working on GitHub https://github.com/force11/force11-scwg & FORCE11
https://www.force11.org/group/software-citation-working-group
• Reviewed existing community practices & developed use cases
• Drafted software citation principles document
• Started with data citation principles, updated based on software use cases and
related work, updated based working group discussions, community feedback
and review of draft, workshop at FORCE2016 in April
• Discussion via GitHub issues, changes tracked
• Contains 6 principles, motivation, summary of use cases, related work,
discussion & recommendations
• Submitted, reviewed and modified (many times), now published
• Smith AM, Katz DS, Niemeyer KE, FORCE11 Software Citation Working
Group.(2016) Software Citation Principles. PeerJ Computer Science 2:e86. DOI:
10.7717/peerj-cs.86 and https://www.force11.org/software-citation-principles
Software citation principles
• Contents (details on next slides):
• 6 principles: Importance, Credit and Attribution, Unique
Identification, Persistence, Accessibility, Specificity
• Motivation, summary of use cases, related work, and discussion
(including recommendations)
• Format: working document in GitHub, linked from
FORCE11 SCWG page, discussion has been via GitHub
issues, changes have been tracked
• https://github.com/force11/force11-scwg
• Reviews and responses also in PeerJ CS paper
Principle 1. Importance
• Software should be considered a legitimate and
citable product of research. Software citations should
be accorded the same importance in the scholarly
record as citations of other research products, such
as publications and data; they should be included in the
metadata of the citing work, for example in the reference
list of a journal article, and should not be omitted or
separated. Software should be cited on the same basis
as any other research product such as a paper or a
book, that is, authors should cite the appropriate set of
software products just as they cite the appropriate set of
papers.
Principle 2. Credit and Attribution
• Software citations should facilitate giving scholarly
credit and normative, legal attribution to all
contributors to the software, recognizing that a single
style or mechanism of attribution may not be applicable
to all software.
Principle 3. Unique Identification
• A software citation should include a method for
identification that is machine actionable, globally
unique, interoperable, and recognized by at least a
community of the corresponding domain experts, and
preferably by general public researchers.
Principle 4. Persistence
• Unique identifiers and metadata describing the
software and its disposition should persist – even
beyond the lifespan of the software they describe.
Principle 5. Accessibility
• Software citations should facilitate access to the
software itself and to its associated metadata,
documentation, data, and other materials necessary for
both humans and machines to make informed use of
the referenced software.
Principle 6. Specificity
• Software citations should facilitate identification of,
and access to, the specific version of software that
was used. Software identification should be as specific
as necessary, such as using version numbers, revision
numbers, or variants such as platforms.
Example 1: Make your software citable
• Publish it – if it’s on GitHub, follow steps in
https://guides.github.com/activities/citable-code/
• Otherwise, submit it to zenodo or figshare, with
appropriate metadata (including authors, title, …,
citations of … & software that you use)
• Get a DOI
• Create a CITATION file, update your README, tell
people how to cite
• Also, can write a software paper and ask people to cite
that (but this is secondary, just since our current system
doesn’t work well)
Example 2: Cite someone else’s software in
a paper
• Check for a CITATION file or README; if this says how to cite the
software itself, do that
• If not, do your best following the principles
• Try to include all contributors to the software (maybe by just naming the
project)
• Try to include a method for identification that is machine actionable,
globally unique, interoperable – perhaps a URL to a release, a company
product number
• If there’s a landing page that includes metadata, point to that, not
directly to the software (e.g. the GitHub repo URL)
• Include specific version/release information
• If there’s a software paper, can cite this too, but not in place of citing
the software
Working group status & next steps
• Final version of principles document published in PeerJ CS
• Considering endorsement period for both individuals and organizations (will
suggest to FORCE11, might defer to implementation phase)
• Want to endorse? Email/talk to me
• Will create infographic and 1–3 slides
• In progress; draft infographic on next slide
• Will create white paper that works through implementation of some use
cases
• Software Citation Working Group ends
• Software Citation Implementation group starts
• Works with institutions, publishers, funders, researchers, etc.,
• Writes full implementation examples paper?
• Want to join? Sign up on current FORCE11 group page
• https://www.force11.org/group/software-citation-working-group
Credit: Laura Rueda, DataCite
A software citation should
include a method for identification
that is machine actionable,
globally unique, interoperable,
and recognized by at least a
community of the corresponding
domain experts, and preferably
by general public researchers.
UNIQUE
IDENTIFICATION
Software citations should
facilitate access to the
software itself and to its
associated metadata,
documentation, data, and
other materials necessary
for both humans and
machines to make
informed use of the
referenced software.
ACCESSIBILITY
Software should be considered
a legitimate and citable product
of research. Software citations
should be accorded the same
importance in the scholarly
record as citations of other
research products. They should
be included in the metadata of
the citing work. Software should
be cited on the same basis as
any other research product
such as a paper or a book.
IMPORTANCE
Unique identifiers and metadata
describing the software and its
disposition should persist —even
beyond the lifespan of the
software they describe.
PERSISTENCE
Software citations should facilitate giving
scholarly credit and normative, legal attribution
to all contributors to the software, recognizing
that a single style or mechanism of attribution
may not be applicable to all software.
CREDIT AND ATTRIBUTION Software citations should facilitate identification of,
and access to, the specific version of software that
was used. Software identification should be as specific
as necessary, such as using version numbers, revision
numbers, or variants such as platforms.
SPECIFICITY
SOFTWARE
CITATION
PRINCIPLES
Journal of Open Source Software (JOSS)
• In the meantime, there’s JOSS
• A developer friendly journal for research software packages
• “If you've already licensed your code and have good documentation
then we expect that it should take less than an hour to prepare and
submit your paper to JOSS”
• Everything is open:
• Submitted/published paper: http://joss.theoj.org
• Code itself: where is up to the author(s)
• Reviews & process: https://github.com/openjournals/joss-reviews
• Code for the journal itself: https://github.com/openjournals/joss
• Zenodo archives JOSS papers and issues DOIs
• First paper submitted May 4, 2016
• As of 18 January: 62 accepted papers, 21 under review, 17 submitted
(pre-review)
National Center for Supercomputing Applications
University of Illinois at Urbana–Champaign
Software Citation
Software Reproducibility
Adapted from Carole Goble: http://www.slideshare.net/carolegoble/what-is-reproducibility-gobleclean
Reproducibility
• Problems:
• Scientific culture: reproducibility is important at a high level, but not in
specific cases
• Like Mark Twain’s definition of classic books: those that people
praise but don’t read
• No incentives or practices that translate the high level concept of
reproducibility into actions that support actual reproducibility
• Reproducibility can be hard due to a unique situation
• Data can be taken with a unique instrument, transient data
• Unique computer system was used
• Given limited resources, reproducibility is less important than new
research
• A computer run that took months is unlikely to be repeated, because
generating a new result is seen as a better use of the computing
resources than reproducing the old result
Software reproducibility
• Technical concerns
• Software collapse (coined by Konrad Hinsen1): the fact that software stops
working eventually if is not actively maintained
• Software stacks used in computational science have a nearly universal multi-
layer structure
• Project-specific software: whatever it takes to do a computation using
software building blocks from the lower three levels: scripts, workflows,
computational notebooks, small special-purpose libraries and utilities
• Discipline-specific research software: tools and libraries that implement
models and methods which are developed and used by research
communities
• Scientific infrastructure: libraries and utilities used for research in many
different disciplines, such as LAPACK, NumPy, or Gnuplot
• Non-scientific infrastructure: operating systems, compilers, and support code
for I/O, user interfaces, etc.
• Software in each layer builds on and depends on software in all layers below it
• Any changes in any lower layer can cause it to collapse
1http://blog.khinsen.net/posts/2017/01/13/sustainable-software-and-reproducible-research-dealing-with-software-collapse/
Software reproducibility
• Project-specific software (developed by scientists)
• Discipline-specific research software (developed by scientists & developers)
• Scientific infrastructure (developed by developers)
• Non-scientific infrastructure (developed by developers)
• Where to address reproducibility?
• Hinsen1: Just addressing project-specific software isn’t enough to solve
software collapse; enemy is changes in the foundations
• Options are similar for house owners facing the risk of earthquakes
1. Accept that your house or software is short-lived; in case of collapse, start from
scratch
2. Whenever shaking foundations cause damage, do repair work before more
serious collapse happens
3. Make your house or software robust against perturbations from below
4. Choose stable foundations
• Active projects choose 1 & 2
• We don’t know how to do 3 (CS research needed, maybe new thinking2)
• 4 is expensive & limits innovation in top layers (banks, military, NASA)
1http://blog.khinsen.net/posts/2017/01/13/sustainable-software-and-reproducible-research-dealing-with-software-collapse/
2Greg Wilson colloquium at NCSA: https://www.youtube.com/watch?v=vx0DUiv1Gvw
Software reproducibility
• Titus Brown1: “Archivability is a disaster in the software
world”
• Can’t we just use containers/VMs?
• Docker itself isn’t robust2
• VMs and docker images provide bitwise reproducibility, but
aren’t scientifically useful; big black boxes don't really let you
reuse or remix the contents
• Options:
• Run everything all the time
• Hinsen’s option 2
• Aka continuous analysis3, similar to continuous integration
• Acknowledge that exact repeatability has a half life of utility,
and that this is OK
• We don’t build houses to last forever…
1http://ivory.idyll.org/blog/2017-pof-software-archivability.html
3Beaulieu-Jones & Greene, https://doi.org/10.1101/056473
2https://thehftguy.com/2016/11/01/docker-in-production-an-history-of-failure/
Software reproducibility
• Costs and benefits
• If software-enabled results could be made reproducible at no cost, do so
• If cost is 10x cost of original work, don’t
• At 1x, probably still no
• What about at 0.1x (+10%)? Or +20%?
• How to balance reproducibility cost vs lost opportunity of new research?
• Could be general question about the culture of science or specific
question about any one experiment/result
• If not practical to make everything reproducible, could use
cost:benefit ratio to determine what to do
• Also, could factor in confirmation depth (David Soergel1) as a
measure of reproducible scientific research (i.e., benefit)
• Shallowest form of confirmation: peer review
• Deeper forms (different software -> different approach -> different inputs): more
confidence in results
1http://davidsoergel.com/posts/confirmation-depth-as-a-measure-of-reproducible-scientific-research
Conclusions
• Software
• Important today, essential tomorrow
• Credit
• A known problem for papers, worse for software
• Citation
• We know what to do (mostly), now need to do it
• Reproducibility
• We think we know what we want to do
• But don’t know how to do it
• What you can do
• Cite the software you use, make it easy for others to cite the software
you write (and see the “I solemnly pledge” manifesto – http://ceur-
ws.org/Vol-1686/WSSSPE4_paper_15.pdf)
• Work towards reproducibility as much as possible
National Center for Supercomputing Applications
University of Illinois at Urbana–Champaign
Citation and Reproducibility in Software
Daniel S. Katz
Assistant Director for Scientific Software & Applications, NCSA
Research Associate Professor, CS
Research Associate Professor, ECE
Research Associate Professor, iSchool
dskatz@illinois.edu, d.katz@ieee.org, @danielskatz
National Center for Supercomputing Applications
University of Illinois at Urbana–Champaign
Additional Software Citation Principle Slides:
discussion topics
Discussion: What to cite
• Importance principle: “…authors should cite the
appropriate set of software products just as they cite
the appropriate set of papers”
• What software to cite decided by author(s) of product, in
context of community norms and practices
• POWL: “Do not cite standard office software (e.g. Word,
Excel) or programming languages. Provide references
only for specialized software.”
• i.e., if using different software could produce different
data or results, then the software used should be cited
Purdue Online Writing Lab. Reference List: Electronic Sources (Web Publications). https://owl.english.purdue. edu/owl/resource/560/10/, 2015.
Discussion: What to cite (citation vs
provenance & reproducibility)
• Provenance/reproducibility requirements > citation
requirements
• Citation: software important to research outcome
• Provenance: all steps (including software) in research
• For data research product, provenance data includes all
cited software, not vice versa
• Software citation principles cover minimal needs for
software citation for software identification
• Provenance & reproducibility may need more metadata
Discussion: Software papers
• Goal: Software should be cited
• Practice: Papers about software (“software papers”) are
published and cited
• Importance principle (1) and other discussion: The
software itself should be cited on the same basis as
any other research product; authors should cite the
appropriate set of software products
• Ok to cite software paper too, if it contains results
(performance, validation, etc.) that are important to the
work
• If the software authors ask users to cite software paper,
can do so, in addition to citing to the software
Discussion: Derived software
• Imagine Code A is derived from Code B, and a paper
uses and cites Code A
• Should the paper also cite Code B?
• No, any research builds on other research
• Each research product just cites those products that it
directly builds on
• Together, this give credit and knowledge chains
• Science historians study these chains
• More automated analyses may also develop, such as
transitive credit
D. S. Katz and A. M. Smith. Implementing transitive credit with JSON-LD. Journal of Open Research Software, 3:e7, 2015. http://dx.doi.org/10.5334/jors.by.
Discussion: Software peer review
• Important issue for software in science
• Probably out-of-scope in citation discussion
• Goal of software citation is to identify software that has
been used in a scholarly product
• Whether or not that software has been peer-reviewed is
irrelevant
• Possible exception: if peer-review status of software is
part of software metadata
• Working group opinion: not part of the minimal metadata
needed to identify the software
Discussion: Citations in text
• Each publisher/publication has a style it prefers
• e.g., AMS, APA, Chicago, MLA
• Examples for software using these styles published by
Lipson
• Citations typically sent to publishers as text formatted in
that citation style, not as structured metadata
• Recommendation: text citation styles should support:
• a) a label indicating that this is software, e.g. [Computer
program]
• b) support for version information, e.g. Version 1.8.7
C. Lipson. Cite Right, Second Edition: A Quick Guide to Citation Styles–MLA, APA, Chicago, the Sciences, Professions, and More. Chicago Guides to Writing,
Editing, and Publishing. University of Chicago Press, 2011.
Discussion: Citation limits
• Software citation principles
• –> more software citations in scholarly products
• –> more overall citations
• Some journals have strict limits on
• Number of citations
• Number of pages (including references)
• Recommendations to publishers:
• Add specific instructions regarding software citations to
author guidelines to not disincentivize software citation
• Don’t include references in content counted against page
limits
Discussion: Unique identification
• Recommend DOIs for identification of published
software
• However, identifier can point to
1. a specific version of a piece of software
2. the piece of software (all versions of the software)
3. the latest version of a piece of software
• One piece of software may have identifiers of all 3 types
• And maybe 1+ software papers, each with identifiers
• Use cases:
• Cite a specific version
• Cite the software in general
• Link multiple releases together, to understanding all citations
Discussion: Unique identification (cont.)
• Principles intended to apply at all levels
• To all identifiers types, e.g., DOIs, RRIDs, ARKS, etc.
• Though again: recommend when possible use DOIs
that identify specific versions of source code
• RRIDs developed by the FORCE11 Resource
Identification Initiative
• Discussed for use to identify software packages (not specific
versions)
• FORCE11 Resource Identification Technical Specifications
Working Group says “Information resources like software are
better suited to the Software Citation WG”
• Currently no consensus on RRIDs for software
Discussion: Types of software
• Principles and discussion generally focus on software as
source code
• But some software is only available as an executable, a
container, or a service
• Principles intended to apply to all these forms of
software
• Implementation of principles will differ by software type
• When software exists as both source code and
another type, cite the source code
Discussion: Access to software
• Accessibility principle: “software citations should permit
and facilitate access to the software itself”
• Metadata should provide access information
• Free software: metadata includes UID that resolves to
URL to specific version of software
• Commercial software: metadata provides information on
how to access the specific software
• E.g., company’s product number, URL to buy the software
• If software isn’t available now, it still should be cited
along with information about how it was accessed
• Metadata should persist, even when software doesn’t
Discussion: Identifier resolves to …
• Identifier that points directly to software (e.g., GitHub
repo) satisfies Unique Identification (3), Accessibility (5),
and Specificity (6), but not Persistence (4)
• Recommend that identifier should resolve to
persistent landing page that contains metadata and
link to the software itself, rather than directly to
source code
• Ensures longevity of software metadata, even beyond
software lifespan
• Point to figshare, Zenodo, etc., not GitHub
Last Problem
Credit: Amy Brand, MIT

More Related Content

What's hot

Collaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna WorkflowsCollaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna WorkflowsAndrea Wiggins
 
Tracking Citations to Research Software via PIDs
Tracking Citations to Research Software via PIDsTracking Citations to Research Software via PIDs
Tracking Citations to Research Software via PIDsETH-Bibliothek
 
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Jenn Riley
 
Humanidades digitales por Ryan Shaw (University of North Carolina at Chapel H...
Humanidades digitales por Ryan Shaw (University of North Carolina at Chapel H...Humanidades digitales por Ryan Shaw (University of North Carolina at Chapel H...
Humanidades digitales por Ryan Shaw (University of North Carolina at Chapel H...innovatics
 
Software Ecosystems = Big Data
Software Ecosystems = Big DataSoftware Ecosystems = Big Data
Software Ecosystems = Big DataTom Mens
 
Implementing Serial Solutions ERMS, OpenURL and federatd search solutions t UCD
Implementing Serial Solutions ERMS, OpenURL and federatd search solutions t UCDImplementing Serial Solutions ERMS, OpenURL and federatd search solutions t UCD
Implementing Serial Solutions ERMS, OpenURL and federatd search solutions t UCDRos Pan
 
Findability through Traceability - A Realistic Application of Candidate Tr...
Findability through Traceability  - A Realistic Application of Candidate Tr...Findability through Traceability  - A Realistic Application of Candidate Tr...
Findability through Traceability - A Realistic Application of Candidate Tr...Markus Borg
 

What's hot (8)

Collaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna WorkflowsCollaborative Data Analysis with Taverna Workflows
Collaborative Data Analysis with Taverna Workflows
 
Tracking Citations to Research Software via PIDs
Tracking Citations to Research Software via PIDsTracking Citations to Research Software via PIDs
Tracking Citations to Research Software via PIDs
 
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
 
Humanidades digitales por Ryan Shaw (University of North Carolina at Chapel H...
Humanidades digitales por Ryan Shaw (University of North Carolina at Chapel H...Humanidades digitales por Ryan Shaw (University of North Carolina at Chapel H...
Humanidades digitales por Ryan Shaw (University of North Carolina at Chapel H...
 
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
 
Software Ecosystems = Big Data
Software Ecosystems = Big DataSoftware Ecosystems = Big Data
Software Ecosystems = Big Data
 
Implementing Serial Solutions ERMS, OpenURL and federatd search solutions t UCD
Implementing Serial Solutions ERMS, OpenURL and federatd search solutions t UCDImplementing Serial Solutions ERMS, OpenURL and federatd search solutions t UCD
Implementing Serial Solutions ERMS, OpenURL and federatd search solutions t UCD
 
Findability through Traceability - A Realistic Application of Candidate Tr...
Findability through Traceability  - A Realistic Application of Candidate Tr...Findability through Traceability  - A Realistic Application of Candidate Tr...
Findability through Traceability - A Realistic Application of Candidate Tr...
 

Viewers also liked

Preprints & Scholarly Infrastructure
Preprints & Scholarly InfrastructurePreprints & Scholarly Infrastructure
Preprints & Scholarly InfrastructureCrossref
 
Crossmark Update Webinar
Crossmark Update WebinarCrossmark Update Webinar
Crossmark Update WebinarCrossref
 
CrossMark How To
CrossMark How ToCrossMark How To
CrossMark How ToCrossref
 
What does Open Science, Open Scholarship look like?
What does Open Science, Open Scholarship look like?What does Open Science, Open Scholarship look like?
What does Open Science, Open Scholarship look like?Robin Rice
 
University of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyondUniversity of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyondRobin Rice
 
Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Robin Rice
 
Overcoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjectsOvercoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjectsRobin Rice
 
Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Robin Rice
 
Data Library Services at the University of Edinburgh
Data Library Services at the University of EdinburghData Library Services at the University of Edinburgh
Data Library Services at the University of EdinburghRobin Rice
 
‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...Robin Rice
 
Managing active research in the University of Edinburgh
Managing active research in the University of EdinburghManaging active research in the University of Edinburgh
Managing active research in the University of EdinburghRobin Rice
 
Guiding users through data deposit
Guiding users through data depositGuiding users through data deposit
Guiding users through data depositRobin Rice
 

Viewers also liked (12)

Preprints & Scholarly Infrastructure
Preprints & Scholarly InfrastructurePreprints & Scholarly Infrastructure
Preprints & Scholarly Infrastructure
 
Crossmark Update Webinar
Crossmark Update WebinarCrossmark Update Webinar
Crossmark Update Webinar
 
CrossMark How To
CrossMark How ToCrossMark How To
CrossMark How To
 
What does Open Science, Open Scholarship look like?
What does Open Science, Open Scholarship look like?What does Open Science, Open Scholarship look like?
What does Open Science, Open Scholarship look like?
 
University of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyondUniversity of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyond
 
Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...
 
Overcoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjectsOvercoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjects
 
Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...Open data and research data management at the University of Edinburgh: polici...
Open data and research data management at the University of Edinburgh: polici...
 
Data Library Services at the University of Edinburgh
Data Library Services at the University of EdinburghData Library Services at the University of Edinburgh
Data Library Services at the University of Edinburgh
 
‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...‘Good, better, best’? Examining the range and rationales of institutional dat...
‘Good, better, best’? Examining the range and rationales of institutional dat...
 
Managing active research in the University of Edinburgh
Managing active research in the University of EdinburghManaging active research in the University of Edinburgh
Managing active research in the University of Edinburgh
 
Guiding users through data deposit
Guiding users through data depositGuiding users through data deposit
Guiding users through data deposit
 

Similar to Citation and reproducibility in software

Crediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCrediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCarole Goble
 
SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014
SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014
SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014dreusser
 
Open Source and Science at the National Science Foundation (NSF)
Open Source and Science at the National Science Foundation (NSF)Open Source and Science at the National Science Foundation (NSF)
Open Source and Science at the National Science Foundation (NSF)Daniel S. Katz
 
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...Daniel S. Katz
 
Software Citation and a Proposal (NSF workshop at Havard Medical School)
Software Citation and a Proposal (NSF workshop at Havard Medical School)Software Citation and a Proposal (NSF workshop at Havard Medical School)
Software Citation and a Proposal (NSF workshop at Havard Medical School)James Howison
 
Software Repositories for Research -- An Environmental Scan
Software Repositories for Research -- An Environmental ScanSoftware Repositories for Research -- An Environmental Scan
Software Repositories for Research -- An Environmental ScanMicah Altman
 
Software: impact, metrics, and citation
Software: impact, metrics, and citationSoftware: impact, metrics, and citation
Software: impact, metrics, and citationDaniel S. Katz
 
Summary of WSSSPE and its working groups
Summary of WSSSPE and its working groupsSummary of WSSSPE and its working groups
Summary of WSSSPE and its working groupsDaniel S. Katz
 
Software Repositories for Research-- An Environmental Scan
Software Repositories for Research-- An Environmental ScanSoftware Repositories for Research-- An Environmental Scan
Software Repositories for Research-- An Environmental ScanMicah Altman
 
Software as a Well-Formed Research Object
Software as a Well-Formed Research ObjectSoftware as a Well-Formed Research Object
Software as a Well-Formed Research ObjectYasmin AlNoamany, PhD
 
Research Software Sustainability: WSSSPE & URSSI
Research Software Sustainability: WSSSPE & URSSIResearch Software Sustainability: WSSSPE & URSSI
Research Software Sustainability: WSSSPE & URSSIDaniel S. Katz
 
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...Daniel S. Katz
 
Talk on Research Data Management
Talk on Research Data ManagementTalk on Research Data Management
Talk on Research Data ManagementAnita de Waard
 
Publishing the Full Research Data Lifecycle
Publishing the Full Research Data LifecyclePublishing the Full Research Data Lifecycle
Publishing the Full Research Data LifecycleAnita de Waard
 
Research software identification - Catherine Jones
Research software identification - Catherine JonesResearch software identification - Catherine Jones
Research software identification - Catherine JonesJisc RDM
 
Reproducible Research in the Humanities
Reproducible Research in the HumanitiesReproducible Research in the Humanities
Reproducible Research in the HumanitiesIain Emsley
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumAnita de Waard
 

Similar to Citation and reproducibility in software (20)

Crediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCrediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teams
 
SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014
SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014
SciForge Workshop@Potsdam Institute for Climate Impact Reserach; Nov 2014
 
Open Source and Science at the National Science Foundation (NSF)
Open Source and Science at the National Science Foundation (NSF)Open Source and Science at the National Science Foundation (NSF)
Open Source and Science at the National Science Foundation (NSF)
 
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
 
20171003 lancaster data conversations Chue-Hong
20171003 lancaster data conversations Chue-Hong20171003 lancaster data conversations Chue-Hong
20171003 lancaster data conversations Chue-Hong
 
data analytics.pptx
data analytics.pptxdata analytics.pptx
data analytics.pptx
 
Software Citation and a Proposal (NSF workshop at Havard Medical School)
Software Citation and a Proposal (NSF workshop at Havard Medical School)Software Citation and a Proposal (NSF workshop at Havard Medical School)
Software Citation and a Proposal (NSF workshop at Havard Medical School)
 
Software Repositories for Research -- An Environmental Scan
Software Repositories for Research -- An Environmental ScanSoftware Repositories for Research -- An Environmental Scan
Software Repositories for Research -- An Environmental Scan
 
Software: impact, metrics, and citation
Software: impact, metrics, and citationSoftware: impact, metrics, and citation
Software: impact, metrics, and citation
 
Summary of WSSSPE and its working groups
Summary of WSSSPE and its working groupsSummary of WSSSPE and its working groups
Summary of WSSSPE and its working groups
 
L035478083
L035478083L035478083
L035478083
 
Software Repositories for Research-- An Environmental Scan
Software Repositories for Research-- An Environmental ScanSoftware Repositories for Research-- An Environmental Scan
Software Repositories for Research-- An Environmental Scan
 
Software as a Well-Formed Research Object
Software as a Well-Formed Research ObjectSoftware as a Well-Formed Research Object
Software as a Well-Formed Research Object
 
Research Software Sustainability: WSSSPE & URSSI
Research Software Sustainability: WSSSPE & URSSIResearch Software Sustainability: WSSSPE & URSSI
Research Software Sustainability: WSSSPE & URSSI
 
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
 
Talk on Research Data Management
Talk on Research Data ManagementTalk on Research Data Management
Talk on Research Data Management
 
Publishing the Full Research Data Lifecycle
Publishing the Full Research Data LifecyclePublishing the Full Research Data Lifecycle
Publishing the Full Research Data Lifecycle
 
Research software identification - Catherine Jones
Research software identification - Catherine JonesResearch software identification - Catherine Jones
Research software identification - Catherine Jones
 
Reproducible Research in the Humanities
Reproducible Research in the HumanitiesReproducible Research in the Humanities
Reproducible Research in the Humanities
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
 

More from Daniel S. Katz

Research software susainability
Research software susainabilityResearch software susainability
Research software susainabilityDaniel S. Katz
 
Software Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSASoftware Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSADaniel S. Katz
 
Parsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in PythonParsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in PythonDaniel S. Katz
 
What is eScience, and where does it go from here?
What is eScience, and where does it go from here?What is eScience, and where does it go from here?
What is eScience, and where does it go from here?Daniel S. Katz
 
Citation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research ObjectsCitation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research ObjectsDaniel S. Katz
 
Fundamentals of software sustainability
Fundamentals of software sustainabilityFundamentals of software sustainability
Fundamentals of software sustainabilityDaniel S. Katz
 
Expressing and sharing workflows
Expressing and sharing workflowsExpressing and sharing workflows
Expressing and sharing workflowsDaniel S. Katz
 
What do we need beyond a DOI?
What do we need beyond a DOI?What do we need beyond a DOI?
What do we need beyond a DOI?Daniel S. Katz
 
Looking at Software Sustainability and Productivity Challenges from NSF
Looking at Software Sustainability and Productivity Challenges from NSFLooking at Software Sustainability and Productivity Challenges from NSF
Looking at Software Sustainability and Productivity Challenges from NSFDaniel S. Katz
 
Scientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative resultsScientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative resultsDaniel S. Katz
 
Panel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still WorkPanel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still WorkDaniel S. Katz
 
US University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and MetricsUS University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and MetricsDaniel S. Katz
 
Swift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowSwift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowDaniel S. Katz
 
A Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to SustainA Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to SustainDaniel S. Katz
 
Multi-component Modeling with Swift at Extreme Scale
Multi-component Modeling with Swift at Extreme ScaleMulti-component Modeling with Swift at Extreme Scale
Multi-component Modeling with Swift at Extreme ScaleDaniel S. Katz
 
Application Fault Tolerance (AFT)
Application Fault Tolerance (AFT)Application Fault Tolerance (AFT)
Application Fault Tolerance (AFT)Daniel S. Katz
 
Funding Software in Academia
Funding Software in AcademiaFunding Software in Academia
Funding Software in AcademiaDaniel S. Katz
 
Metrics & Citation for Software (and Data)
Metrics & Citation for Software (and Data)Metrics & Citation for Software (and Data)
Metrics & Citation for Software (and Data)Daniel S. Katz
 

More from Daniel S. Katz (19)

Research software susainability
Research software susainabilityResearch software susainability
Research software susainability
 
Software Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSASoftware Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSA
 
Parsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in PythonParsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in Python
 
What is eScience, and where does it go from here?
What is eScience, and where does it go from here?What is eScience, and where does it go from here?
What is eScience, and where does it go from here?
 
Citation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research ObjectsCitation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research Objects
 
Fundamentals of software sustainability
Fundamentals of software sustainabilityFundamentals of software sustainability
Fundamentals of software sustainability
 
URSSI
URSSIURSSI
URSSI
 
Expressing and sharing workflows
Expressing and sharing workflowsExpressing and sharing workflows
Expressing and sharing workflows
 
What do we need beyond a DOI?
What do we need beyond a DOI?What do we need beyond a DOI?
What do we need beyond a DOI?
 
Looking at Software Sustainability and Productivity Challenges from NSF
Looking at Software Sustainability and Productivity Challenges from NSFLooking at Software Sustainability and Productivity Challenges from NSF
Looking at Software Sustainability and Productivity Challenges from NSF
 
Scientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative resultsScientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative results
 
Panel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still WorkPanel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still Work
 
US University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and MetricsUS University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and Metrics
 
Swift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowSwift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance Workflow
 
A Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to SustainA Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to Sustain
 
Multi-component Modeling with Swift at Extreme Scale
Multi-component Modeling with Swift at Extreme ScaleMulti-component Modeling with Swift at Extreme Scale
Multi-component Modeling with Swift at Extreme Scale
 
Application Fault Tolerance (AFT)
Application Fault Tolerance (AFT)Application Fault Tolerance (AFT)
Application Fault Tolerance (AFT)
 
Funding Software in Academia
Funding Software in AcademiaFunding Software in Academia
Funding Software in Academia
 
Metrics & Citation for Software (and Data)
Metrics & Citation for Software (and Data)Metrics & Citation for Software (and Data)
Metrics & Citation for Software (and Data)
 

Recently uploaded

DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 

Recently uploaded (20)

DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 

Citation and reproducibility in software

  • 1. National Center for Supercomputing Applications University of Illinois at Urbana–Champaign Citation and Reproducibility in Software Daniel S. Katz Assistant Director for Scientific Software & Applications, NCSA Research Associate Professor, CS Research Associate Professor, ECE Research Associate Professor, iSchool dskatz@illinois.edu, d.katz@ieee.org, @danielskatz
  • 2. Software as infrastructure Science Software Computing Infrastructure • Software (including services) essential for the bulk of science - About half the papers in recent issues of Science were software-intensive projects - Research becoming dependent upon advances in software • Wide range of software types: system, applications, modeling, gateways, analysis, algorithms, middleware, libraries • Software is not a one-time effort, it must be sustained • Development, production, and maintenance are people intensive • Software life-times are long vs hardware • Software has under-appreciated value
  • 3. Computational science research Create Hypothesis Acquire Resources (e.g., Funding, Software, Data) Perform Research (Build Software & Data) Publish Results (e.g., Paper, Book, Software, Data) Gain Recognition Knowledge
  • 4. Data science research Create Hypothesis Acquire Resources (e.g., Funding, Software, Data) Perform Research (Build Software & Data) Publish Results (e.g., Paper, Book, Software, Data) Gain Recognition Acquire Resources (Data)
  • 5. Purposes of software in research Create Hypothesis Acquire Resources (e.g., Funding, Software, Data) Perform Research (Build Software & Data) Publish Results (e.g., Paper, Book, Software, Data) Gain Recognition Infrastructure Research
  • 6. National Center for Supercomputing Applications University of Illinois at Urbana–Champaign Software Citation Software Reproducibility
  • 7. Credit is a problem in Academia http://www.phdcomics.com/comics/archive.php?comicid=562
  • 9. Software Citation Motivation • Scientific research is becoming: • More open – scientists want to collaborate; want/need to share • More digital – outputs such as software and data; easier to share • Significant time spent developing software & data • Efforts not recognized or rewarded • Citations for papers systematically collected, metrics built • But not for software & data • Hypothesis: Better measurement of contributions (citations, impact, metrics) —> Rewards (incentives) —> Career paths, willingness to join communities —> More sustainable software
  • 10. Incentives, citation/credit models, and metrics (1) 1. Downloads • From web/repository logs/stats 2. Installation/Build • Add “phone home” mechanism at build level • e.g., add ‘curl URL’ in makefile 3. Use/Run • Track centrally; add “phone home” mechanism at run level • Per app, e.g. add ‘curl URL’ in code, send local log to server • For a set of apps, e.g., sempervirens project, Debian popularity contest • Track locally; ask user to report • e.g., duecredit project, or via frameworks like Galaxy
  • 11. Incentives, citation/credit models, and metrics (2) 4. Impact • Project CRediT (and Mozilla contributorship badges) • Record contributorship roles • Force 11 Software Citation Working Group (and merged WSSSPE software credit WG) • Define software citation principles • Codemeta • Minimal metadata schemas for science software • Lots of research projects (including my own research on transitive credit - http://doi.org/10.5334/jors.by) • Altmetrics – not citations, but other structured measures of discussion (tweets, blogs, etc.) • ImpactStory • Measure research impact: reads, citations, tweets, etc. • Depsy (roughly ImpactStory specifically for software) • Measure software impact: downloads; software reuse (if one package is reused in another package), literature mentions (citations of a software package), and social mentions (tweets, etc.)
  • 12. How to better measure software contributions • Citation system was created for papers/books • We need to either/both 1. Jam software into current citation system 2. Rework citation system • Focus on 1 as possible; 2 is very hard. • Challenge: not just how to identify software in a paper • How to identify software used within research process
  • 13. Software citation today • Software and other digital resources currently appear in publications in very inconsistent ways • Howison: random sample of 90 articles in the biology literature -> 7 different ways that software was mentioned • Studies on data and facility citation -> similar results J. Howison and J. Bullard. Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature. Journal of the Association for Information Science and Technology, 2015. In press. http://dx.doi.org/10.1002/asi.23538.
  • 14. Software citation principles: People & Process • FORCE11 Software Citation group started July 2015 • WSSSPE3 Credit & Citation working group joined September 2015 • ~55 members (researchers, developers, publishers, repositories, librarians) • Working on GitHub https://github.com/force11/force11-scwg & FORCE11 https://www.force11.org/group/software-citation-working-group • Reviewed existing community practices & developed use cases • Drafted software citation principles document • Started with data citation principles, updated based on software use cases and related work, updated based working group discussions, community feedback and review of draft, workshop at FORCE2016 in April • Discussion via GitHub issues, changes tracked • Contains 6 principles, motivation, summary of use cases, related work, discussion & recommendations • Submitted, reviewed and modified (many times), now published • Smith AM, Katz DS, Niemeyer KE, FORCE11 Software Citation Working Group.(2016) Software Citation Principles. PeerJ Computer Science 2:e86. DOI: 10.7717/peerj-cs.86 and https://www.force11.org/software-citation-principles
  • 15. Software citation principles • Contents (details on next slides): • 6 principles: Importance, Credit and Attribution, Unique Identification, Persistence, Accessibility, Specificity • Motivation, summary of use cases, related work, and discussion (including recommendations) • Format: working document in GitHub, linked from FORCE11 SCWG page, discussion has been via GitHub issues, changes have been tracked • https://github.com/force11/force11-scwg • Reviews and responses also in PeerJ CS paper
  • 16. Principle 1. Importance • Software should be considered a legitimate and citable product of research. Software citations should be accorded the same importance in the scholarly record as citations of other research products, such as publications and data; they should be included in the metadata of the citing work, for example in the reference list of a journal article, and should not be omitted or separated. Software should be cited on the same basis as any other research product such as a paper or a book, that is, authors should cite the appropriate set of software products just as they cite the appropriate set of papers.
  • 17. Principle 2. Credit and Attribution • Software citations should facilitate giving scholarly credit and normative, legal attribution to all contributors to the software, recognizing that a single style or mechanism of attribution may not be applicable to all software.
  • 18. Principle 3. Unique Identification • A software citation should include a method for identification that is machine actionable, globally unique, interoperable, and recognized by at least a community of the corresponding domain experts, and preferably by general public researchers.
  • 19. Principle 4. Persistence • Unique identifiers and metadata describing the software and its disposition should persist – even beyond the lifespan of the software they describe.
  • 20. Principle 5. Accessibility • Software citations should facilitate access to the software itself and to its associated metadata, documentation, data, and other materials necessary for both humans and machines to make informed use of the referenced software.
  • 21. Principle 6. Specificity • Software citations should facilitate identification of, and access to, the specific version of software that was used. Software identification should be as specific as necessary, such as using version numbers, revision numbers, or variants such as platforms.
  • 22. Example 1: Make your software citable • Publish it – if it’s on GitHub, follow steps in https://guides.github.com/activities/citable-code/ • Otherwise, submit it to zenodo or figshare, with appropriate metadata (including authors, title, …, citations of … & software that you use) • Get a DOI • Create a CITATION file, update your README, tell people how to cite • Also, can write a software paper and ask people to cite that (but this is secondary, just since our current system doesn’t work well)
  • 23. Example 2: Cite someone else’s software in a paper • Check for a CITATION file or README; if this says how to cite the software itself, do that • If not, do your best following the principles • Try to include all contributors to the software (maybe by just naming the project) • Try to include a method for identification that is machine actionable, globally unique, interoperable – perhaps a URL to a release, a company product number • If there’s a landing page that includes metadata, point to that, not directly to the software (e.g. the GitHub repo URL) • Include specific version/release information • If there’s a software paper, can cite this too, but not in place of citing the software
  • 24. Working group status & next steps • Final version of principles document published in PeerJ CS • Considering endorsement period for both individuals and organizations (will suggest to FORCE11, might defer to implementation phase) • Want to endorse? Email/talk to me • Will create infographic and 1–3 slides • In progress; draft infographic on next slide • Will create white paper that works through implementation of some use cases • Software Citation Working Group ends • Software Citation Implementation group starts • Works with institutions, publishers, funders, researchers, etc., • Writes full implementation examples paper? • Want to join? Sign up on current FORCE11 group page • https://www.force11.org/group/software-citation-working-group
  • 25. Credit: Laura Rueda, DataCite A software citation should include a method for identification that is machine actionable, globally unique, interoperable, and recognized by at least a community of the corresponding domain experts, and preferably by general public researchers. UNIQUE IDENTIFICATION Software citations should facilitate access to the software itself and to its associated metadata, documentation, data, and other materials necessary for both humans and machines to make informed use of the referenced software. ACCESSIBILITY Software should be considered a legitimate and citable product of research. Software citations should be accorded the same importance in the scholarly record as citations of other research products. They should be included in the metadata of the citing work. Software should be cited on the same basis as any other research product such as a paper or a book. IMPORTANCE Unique identifiers and metadata describing the software and its disposition should persist —even beyond the lifespan of the software they describe. PERSISTENCE Software citations should facilitate giving scholarly credit and normative, legal attribution to all contributors to the software, recognizing that a single style or mechanism of attribution may not be applicable to all software. CREDIT AND ATTRIBUTION Software citations should facilitate identification of, and access to, the specific version of software that was used. Software identification should be as specific as necessary, such as using version numbers, revision numbers, or variants such as platforms. SPECIFICITY SOFTWARE CITATION PRINCIPLES
  • 26. Journal of Open Source Software (JOSS) • In the meantime, there’s JOSS • A developer friendly journal for research software packages • “If you've already licensed your code and have good documentation then we expect that it should take less than an hour to prepare and submit your paper to JOSS” • Everything is open: • Submitted/published paper: http://joss.theoj.org • Code itself: where is up to the author(s) • Reviews & process: https://github.com/openjournals/joss-reviews • Code for the journal itself: https://github.com/openjournals/joss • Zenodo archives JOSS papers and issues DOIs • First paper submitted May 4, 2016 • As of 18 January: 62 accepted papers, 21 under review, 17 submitted (pre-review)
  • 27. National Center for Supercomputing Applications University of Illinois at Urbana–Champaign Software Citation Software Reproducibility
  • 28. Adapted from Carole Goble: http://www.slideshare.net/carolegoble/what-is-reproducibility-gobleclean
  • 29. Reproducibility • Problems: • Scientific culture: reproducibility is important at a high level, but not in specific cases • Like Mark Twain’s definition of classic books: those that people praise but don’t read • No incentives or practices that translate the high level concept of reproducibility into actions that support actual reproducibility • Reproducibility can be hard due to a unique situation • Data can be taken with a unique instrument, transient data • Unique computer system was used • Given limited resources, reproducibility is less important than new research • A computer run that took months is unlikely to be repeated, because generating a new result is seen as a better use of the computing resources than reproducing the old result
  • 30. Software reproducibility • Technical concerns • Software collapse (coined by Konrad Hinsen1): the fact that software stops working eventually if is not actively maintained • Software stacks used in computational science have a nearly universal multi- layer structure • Project-specific software: whatever it takes to do a computation using software building blocks from the lower three levels: scripts, workflows, computational notebooks, small special-purpose libraries and utilities • Discipline-specific research software: tools and libraries that implement models and methods which are developed and used by research communities • Scientific infrastructure: libraries and utilities used for research in many different disciplines, such as LAPACK, NumPy, or Gnuplot • Non-scientific infrastructure: operating systems, compilers, and support code for I/O, user interfaces, etc. • Software in each layer builds on and depends on software in all layers below it • Any changes in any lower layer can cause it to collapse 1http://blog.khinsen.net/posts/2017/01/13/sustainable-software-and-reproducible-research-dealing-with-software-collapse/
  • 31. Software reproducibility • Project-specific software (developed by scientists) • Discipline-specific research software (developed by scientists & developers) • Scientific infrastructure (developed by developers) • Non-scientific infrastructure (developed by developers) • Where to address reproducibility? • Hinsen1: Just addressing project-specific software isn’t enough to solve software collapse; enemy is changes in the foundations • Options are similar for house owners facing the risk of earthquakes 1. Accept that your house or software is short-lived; in case of collapse, start from scratch 2. Whenever shaking foundations cause damage, do repair work before more serious collapse happens 3. Make your house or software robust against perturbations from below 4. Choose stable foundations • Active projects choose 1 & 2 • We don’t know how to do 3 (CS research needed, maybe new thinking2) • 4 is expensive & limits innovation in top layers (banks, military, NASA) 1http://blog.khinsen.net/posts/2017/01/13/sustainable-software-and-reproducible-research-dealing-with-software-collapse/ 2Greg Wilson colloquium at NCSA: https://www.youtube.com/watch?v=vx0DUiv1Gvw
  • 32. Software reproducibility • Titus Brown1: “Archivability is a disaster in the software world” • Can’t we just use containers/VMs? • Docker itself isn’t robust2 • VMs and docker images provide bitwise reproducibility, but aren’t scientifically useful; big black boxes don't really let you reuse or remix the contents • Options: • Run everything all the time • Hinsen’s option 2 • Aka continuous analysis3, similar to continuous integration • Acknowledge that exact repeatability has a half life of utility, and that this is OK • We don’t build houses to last forever… 1http://ivory.idyll.org/blog/2017-pof-software-archivability.html 3Beaulieu-Jones & Greene, https://doi.org/10.1101/056473 2https://thehftguy.com/2016/11/01/docker-in-production-an-history-of-failure/
  • 33. Software reproducibility • Costs and benefits • If software-enabled results could be made reproducible at no cost, do so • If cost is 10x cost of original work, don’t • At 1x, probably still no • What about at 0.1x (+10%)? Or +20%? • How to balance reproducibility cost vs lost opportunity of new research? • Could be general question about the culture of science or specific question about any one experiment/result • If not practical to make everything reproducible, could use cost:benefit ratio to determine what to do • Also, could factor in confirmation depth (David Soergel1) as a measure of reproducible scientific research (i.e., benefit) • Shallowest form of confirmation: peer review • Deeper forms (different software -> different approach -> different inputs): more confidence in results 1http://davidsoergel.com/posts/confirmation-depth-as-a-measure-of-reproducible-scientific-research
  • 34. Conclusions • Software • Important today, essential tomorrow • Credit • A known problem for papers, worse for software • Citation • We know what to do (mostly), now need to do it • Reproducibility • We think we know what we want to do • But don’t know how to do it • What you can do • Cite the software you use, make it easy for others to cite the software you write (and see the “I solemnly pledge” manifesto – http://ceur- ws.org/Vol-1686/WSSSPE4_paper_15.pdf) • Work towards reproducibility as much as possible
  • 35. National Center for Supercomputing Applications University of Illinois at Urbana–Champaign Citation and Reproducibility in Software Daniel S. Katz Assistant Director for Scientific Software & Applications, NCSA Research Associate Professor, CS Research Associate Professor, ECE Research Associate Professor, iSchool dskatz@illinois.edu, d.katz@ieee.org, @danielskatz
  • 36. National Center for Supercomputing Applications University of Illinois at Urbana–Champaign Additional Software Citation Principle Slides: discussion topics
  • 37. Discussion: What to cite • Importance principle: “…authors should cite the appropriate set of software products just as they cite the appropriate set of papers” • What software to cite decided by author(s) of product, in context of community norms and practices • POWL: “Do not cite standard office software (e.g. Word, Excel) or programming languages. Provide references only for specialized software.” • i.e., if using different software could produce different data or results, then the software used should be cited Purdue Online Writing Lab. Reference List: Electronic Sources (Web Publications). https://owl.english.purdue. edu/owl/resource/560/10/, 2015.
  • 38. Discussion: What to cite (citation vs provenance & reproducibility) • Provenance/reproducibility requirements > citation requirements • Citation: software important to research outcome • Provenance: all steps (including software) in research • For data research product, provenance data includes all cited software, not vice versa • Software citation principles cover minimal needs for software citation for software identification • Provenance & reproducibility may need more metadata
  • 39. Discussion: Software papers • Goal: Software should be cited • Practice: Papers about software (“software papers”) are published and cited • Importance principle (1) and other discussion: The software itself should be cited on the same basis as any other research product; authors should cite the appropriate set of software products • Ok to cite software paper too, if it contains results (performance, validation, etc.) that are important to the work • If the software authors ask users to cite software paper, can do so, in addition to citing to the software
  • 40. Discussion: Derived software • Imagine Code A is derived from Code B, and a paper uses and cites Code A • Should the paper also cite Code B? • No, any research builds on other research • Each research product just cites those products that it directly builds on • Together, this give credit and knowledge chains • Science historians study these chains • More automated analyses may also develop, such as transitive credit D. S. Katz and A. M. Smith. Implementing transitive credit with JSON-LD. Journal of Open Research Software, 3:e7, 2015. http://dx.doi.org/10.5334/jors.by.
  • 41. Discussion: Software peer review • Important issue for software in science • Probably out-of-scope in citation discussion • Goal of software citation is to identify software that has been used in a scholarly product • Whether or not that software has been peer-reviewed is irrelevant • Possible exception: if peer-review status of software is part of software metadata • Working group opinion: not part of the minimal metadata needed to identify the software
  • 42. Discussion: Citations in text • Each publisher/publication has a style it prefers • e.g., AMS, APA, Chicago, MLA • Examples for software using these styles published by Lipson • Citations typically sent to publishers as text formatted in that citation style, not as structured metadata • Recommendation: text citation styles should support: • a) a label indicating that this is software, e.g. [Computer program] • b) support for version information, e.g. Version 1.8.7 C. Lipson. Cite Right, Second Edition: A Quick Guide to Citation Styles–MLA, APA, Chicago, the Sciences, Professions, and More. Chicago Guides to Writing, Editing, and Publishing. University of Chicago Press, 2011.
  • 43. Discussion: Citation limits • Software citation principles • –> more software citations in scholarly products • –> more overall citations • Some journals have strict limits on • Number of citations • Number of pages (including references) • Recommendations to publishers: • Add specific instructions regarding software citations to author guidelines to not disincentivize software citation • Don’t include references in content counted against page limits
  • 44. Discussion: Unique identification • Recommend DOIs for identification of published software • However, identifier can point to 1. a specific version of a piece of software 2. the piece of software (all versions of the software) 3. the latest version of a piece of software • One piece of software may have identifiers of all 3 types • And maybe 1+ software papers, each with identifiers • Use cases: • Cite a specific version • Cite the software in general • Link multiple releases together, to understanding all citations
  • 45. Discussion: Unique identification (cont.) • Principles intended to apply at all levels • To all identifiers types, e.g., DOIs, RRIDs, ARKS, etc. • Though again: recommend when possible use DOIs that identify specific versions of source code • RRIDs developed by the FORCE11 Resource Identification Initiative • Discussed for use to identify software packages (not specific versions) • FORCE11 Resource Identification Technical Specifications Working Group says “Information resources like software are better suited to the Software Citation WG” • Currently no consensus on RRIDs for software
  • 46. Discussion: Types of software • Principles and discussion generally focus on software as source code • But some software is only available as an executable, a container, or a service • Principles intended to apply to all these forms of software • Implementation of principles will differ by software type • When software exists as both source code and another type, cite the source code
  • 47. Discussion: Access to software • Accessibility principle: “software citations should permit and facilitate access to the software itself” • Metadata should provide access information • Free software: metadata includes UID that resolves to URL to specific version of software • Commercial software: metadata provides information on how to access the specific software • E.g., company’s product number, URL to buy the software • If software isn’t available now, it still should be cited along with information about how it was accessed • Metadata should persist, even when software doesn’t
  • 48. Discussion: Identifier resolves to … • Identifier that points directly to software (e.g., GitHub repo) satisfies Unique Identification (3), Accessibility (5), and Specificity (6), but not Persistence (4) • Recommend that identifier should resolve to persistent landing page that contains metadata and link to the software itself, rather than directly to source code • Ensures longevity of software metadata, even beyond software lifespan • Point to figshare, Zenodo, etc., not GitHub