
Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature.

Software is increasingly crucial to scholarship, yet the visibility and usefulness of software in the scientific record is in question. Just as with data, the visibility of software in publications is related to incentives to share software in reusable ways, and so to promote efficient science. In this paper we examine software in publications through content analysis of a random sample of 90 biology articles. We develop a coding scheme to identify software “mentions” and classify them according to their characteristics and their ability to realize the functions of citations. Overall we find diverse and problematic practices: only 31–43% of mentions involve formal citations; informal mentions are very common, even in high impact factor journals and across different kinds of software. Software is frequently inaccessible (15–29% of packages in any form; 90–98% of specific versions; only 24–40% provide source code). Citations to publications are particularly poor at providing version information, while informal mentions are particularly poor at providing crediting information. We provide recommendations to improve the practice of software citation, highlighting recent nascent efforts. Software plays an increasingly important role in scientific practice; it deserves a clear and useful place in scholarly communication.
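
The content analysis relies on a coding scheme whose reliability, according to the slides, was tested with Cohen's kappa across 3 coders. As a rough illustration of that statistic only, the sketch below computes kappa for two coders and, as an assumption, applies it pairwise to three coders; the source does not say how the three coders' agreement was combined. The function names, coder names, and labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: computed from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def pairwise_kappas(codings):
    """Kappa for every pair of coders (an assumed way to handle 3 coders)."""
    return {
        (x, y): cohens_kappa(codings[x], codings[y])
        for x, y in combinations(sorted(codings), 2)
    }


# Hypothetical codings of four mentions by three coders.
example = {
    "coder1": ["cite", "cite", "name", "name"],
    "coder2": ["cite", "name", "name", "name"],
    "coder3": ["cite", "cite", "name", "name"],
}
for pair, kappa in pairwise_kappas(example).items():
    print(pair, round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred to simple percent agreement when some mention types are far more common than others.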

Published in: Software

  1. Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature. James Howison and Julia Bullard, Information School, University of Texas at Austin. This material is based upon work supported by the National Science Foundation under Grant No. SMA-1064209. @jameshowison DOI: 10.6084/m9.figshare.1146366
  2. Research Questions
     • How is software mentioned in papers?
     • What kinds of mentions are used?
     • How accessible and reusable is the software mentioned?
     • How do these mentions perform the functions of citation?
     github.com/jameshowison/softcite
  3. Sample and Method
     • 90 randomly selected articles from biology literature
     • Journals stratified across Journal Impact Factor to balance coverage with influence
     • Manual content analysis
       – developed reliable coding scheme across 3 coders, tested with Cohen’s Kappa
  4. How many mentions?
     • 59 articles mentioned software, 31 did not.
     • There were 286 distinct mentions of software.
     • Those mentions were to 146 distinct pieces of software.
  5. Types of mentions (Mention Type: Example)
     • Cite to Publication: “… was calculated using biosys (Swofford & Selander 1981).”
     • Cite to Project Name or Website: “… using the program Autodecay version 4.0.29 PPC (Eriksson 1998).” Reference List has: “ERIKSSON, T. 1998. Autodecay, vers. 4.0.29 Stockholm: Department of Botany.”
     • Like Instrument: “… calculated by t-test using the Prism 3.0 software (GraphPad Software, San Diego, CA, USA).”
     • URL in text: “… freely available from http://www.cibiv.at/software/pda/ .”
     • In-text name mention only: “… were analyzed using MapQTL (4.0) software.”
     • Not even name mentioned: “… was carried out using software implemented in the Java programming language.”
  6. https://github.com/jameshowison/softcite/blob/master/data/software-citation-coding.ttl
  7. Types of Mentions
  8. Simpler Mention Kinds
  9. By Strata?
  10. What characteristics of software?
  11. Simpler software types
  12. Different software mentioned differently?
  13. How useful are these mentions?
  14. Not much change across strata
  15. Do different mentions work?
  16. Extras
     • Only 24% of journals had policies that mentioned software, declining by strata.
       – Rarely mention versions.
       – Not clear that these are followed.
     • Only 13–30% of packages make a specific request for citation
       – 32% of mentions didn’t follow the requested citation.
  17. Next steps
     • Use as “Gold Standard” dataset to train machine learning
       – Me with Yan Gao and Byron Wallace
       – You?
     • Broader studies in other fields
       – Assessing impact of policy changes
     • Use as validator on article submission
       – “It looks like you are trying to cite XYZ, please provide version and use this form”
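
The submission-time validator proposed in the final slide might look roughly like the sketch below. It is not the authors' tool: the list of known package names, the dotted-number version pattern, and the 30-character look-ahead window are all assumptions made for illustration, and a real validator would presumably use the machine-learning approach the slide describes rather than a fixed name list.

```python
import re

# Hypothetical registry of package names, taken from the example mentions in the slides.
KNOWN_SOFTWARE = ["biosys", "Autodecay", "Prism", "MapQTL"]

# Assumed version shape: dotted numbers such as 3.0 or 4.0.29.
# A bare year like 1981 deliberately does not match.
VERSION_PATTERN = re.compile(r"\d+(?:\.\d+)+")


def check_mentions(text, known=KNOWN_SOFTWARE, window=30):
    """Return a warning for each known package mentioned without a nearby version."""
    warnings = []
    for name in known:
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Look only at the text immediately following the package name.
            following = text[match.end():match.end() + window]
            if not VERSION_PATTERN.search(following):
                warnings.append(
                    f"It looks like you are trying to cite {name}, "
                    "please provide version and use this form"
                )
    return warnings


# Two of the example mentions from the slides.
print(check_mentions("were analyzed using MapQTL (4.0) software."))
print(check_mentions("was calculated using biosys (Swofford & Selander 1981)."))
```

The first example carries a version and yields no warning; the second cites a publication without a version, which is the weakness of cites to publications that the abstract highlights.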
