Data publication and Citation for CLIR postdoc seminar
1.
CLIR/DLF Postdoc Seminar
4 August 2014
Data Publication & Citation Workshop
Carly Strasser
California Digital Library
@carlystrasser
2.
Roadmap
1. Intro & background
2. Data publication
3. Data citation
4. Altmetrics
3.
From Flickr via librarianinsta.tumblr.com
I am not a librarian.
4.
Enable data sharing
Encourage new incentives
Think about code sharing
Work with libraries, publishers and researchers
Explore new tools to help change system
Build tools
5.
John Kratz
PhD in Biology from Columbia University
CLIR/DLF Postdoctoral Fellow,
started 12 months ago
Data publication and its importance for data
sharing, reuse, and preservation
6.
From Wikimedia Commons
Back in the day…
From ahswhg.wikispaces.com
7.
Back in the day…
Da Vinci
Curie
Newton
classicalschool.blogspot.com
Darwin
13.
“Reproducibility Crisis”
“Digital Dark Age”
“Erosion of Trust”
14.
Can we fix the way we communicate our research?
All of the research
Early & often
Transparently & openly
From Flickr by gsagostinho
15.
notebook
science/research
source
content
access
data
government
repository
knowledge
From Flickr by cdsessums
16.
Open Science
Making data, research, and dissemination available to all
17.
notebook
science
source
content
access
data
government
repository
knowledge
From Flickr by cdsessums
18.
Open Data
Certain data should be freely available to everyone to use & republish as they wish, without restrictions from copyright, patents or other mechanisms of control
From Flickr by Ninja M.
19.
From Flickr by Iqubal Osman
Culture Shift Required
20.
“I own my data and you can’t have it.”
“Let me do my work.”
“I’m already too busy.”
“This takes away from research time.”
22.
Data can’t be owned.
You can be the guardian, steward, or caretaker.
23.
Roadmap
1. Intro & background
2. Data publication
3. Data citation
4. Altmetrics
24.
What does “data publication” mean?
Data are:
1. Available
2. Citable
3. Trustworthy
25.
Available | Citable | Trustworthy
• Publish means to “make public”.
• You should not have to email the author.
• The data doesn’t have to be open access.
“Email me!”
CC-0 on web
Best practice:
data in a trusted community repository with a
machine-readable license/waiver
26.
Repositories
for data
General content
Non-institutional
Publishers/for-profits
Other
Institutional
Discipline-specific
Repository choices…
27.
Institutional
Discipline-specific
• All data associated with a paper
• Tells a story
• Clearinghouse for researcher’s works
• Some of data for a given paper
• Discoverable
• Integrated systems
• Collection policies
?
Both
Which should a researcher use?
Which is more important?
Depends
Repository choices…
28.
Available | Citable | Trustworthy
Five-element citation: author, year, title, publisher, identifier
Boettiger C, Dushoff J, Weitz JS (2009). Data from: Fluctuation domains in adaptive evolution. Theoretical Population Biology. Published in Dryad. doi:10.5061/dryad.j8n0p7vc
More later
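The five-element citation on this slide can be sketched in a few lines of Python. This is only an illustration: the `DataCitation` class and `format_citation` function are hypothetical names, and the punctuation is one plausible style rather than a prescribed standard. The example reuses the Dryad record quoted on the slide and leaves out the journal name, which is not one of the five elements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCitation:
    """The five elements named on the slide (class name is hypothetical)."""
    authors: str
    year: int
    title: str
    publisher: str
    identifier: str


def format_citation(c: DataCitation) -> str:
    # Assumed layout: Authors (Year). Title. Publisher. Identifier
    return f"{c.authors} ({c.year}). {c.title}. {c.publisher}. {c.identifier}"


example = DataCitation(
    authors="Boettiger C, Dushoff J, Weitz JS",
    year=2009,
    title="Data from: Fluctuation domains in adaptive evolution",
    publisher="Dryad",
    identifier="doi:10.5061/dryad.j8n0p7vc",
)

print(format_citation(example))
```

Keeping the identifier as its own field means a reader, or a script, can pull it out and resolve it without parsing the rest of the citation.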
29.
Available | Citable | Trustworthy
From Flickr by Percival Lowell
For articles: peer review
For data: ?
peer review? validation?
30.
Technical VS. Scientific
Available | Citable | Trustworthy
31.
Available | Citable | Trustworthy
Peer review of data
Technical review: completeness, formats
vs
Scientific review: importance, methods evaluation
32.
Available | Citable | Trustworthy
Peer review of data
• Experts
• Users
• Community
• Use = validation
Who?
33.
1. Data as supplemental material
Data published alongside a traditional journal article.
Available + citable. Review varies.
Potential issues with long-term availability.
What does a data
publication look like?
From Flickr by subsetsum
34.
2. Data paper:
Data + descriptive “data paper”
Standalone journals: Nature Scientific Data, Geoscience Data Journal,
Ecological Archives
OR
Journals that publish data papers: GigaScience, F1000 Research,
Internet Archaeology
What does a data
publication look like?
From Flickr by subsetsum
35.
3. Standalone data
Data published without a related journal article.
Rich metadata (structured or unstructured)
• Institutional repository
• Open Context
• NASA PDS Peer Review Data
• figshare (but no validation)
What does a data
publication look like?
From Flickr by subsetsum
36.
…“publication” insinuates that we are
beholden to the current broken system of
journal publication. The word itself has
too much baggage.
…bureaucrats, funders, and institutions
have a familiarity with the word and it will
ensure the success of the data
publication goals, regardless of whether
we break the mold in the process.
http://datapub.cdlib.org/2012/03/06/data-publication-an-introduction/
38.
From Flickr by Sandia Labs
C. Strasser
C. Strasser
World Bank Photo Collection From Flickr
What do researchers
think of data publication?
39.
Survey of
researchers
N=274
John Kratz, forthcoming
40.
Roadmap
1. Intro & background
2. Data publication
3. Data citation
4. Altmetrics
41.
Identifiers & Data Citation
Allows readers to find data products
Get credit for data and publications
Promotes reproducibility
Example:
Sidlauskas, B. 2007. Data from: Testing for unequal rates of
morphological diversification in the absence of a detailed
phylogeny: a case study from characiform fishes. Dryad Digital
Repository. doi:10.5061/dryad.20
42.
An article about data, but no data
from Joan Starr
43.
And then the hunt for the data…
from Joan Starr
44.
FTP site
And then the hunt for the data…
from Joan Starr
45.
The citation difference: data linked…
from Joan Starr
47.
Identifiers
• String of characters
• Unique
• Linked to a digital object
DOI: Digital object identifier
From Flickr by Plbmak
48.
Comparing two…
DOIs
• Strict metadata requirements
• From the scholarly communication community
• Established “brand name”
• $$$
ARKs
• Flexible metadata guidelines
• From the archives and museums community
• Option-rich, open source
• $
49.
Identifiers: how it works
DOI: 10.1890/1540-9295-10.2.59
ARK: ark:/12025/654xz321/s3/f8.05v
Identifier → Resolver (dx.doi.org) → Website with “object”
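The identifier-to-resolver step can be sketched as a small function that turns an identifier string into a resolver URL. This is a minimal sketch under stated assumptions: `resolver_url` is a hypothetical helper, `https://doi.org/` is used as the DOI resolver (the slide names dx.doi.org, the older hostname for the same service), and `https://n2t.net/` is assumed as the ARK resolver, which the slide does not name. It only builds the URL and makes no network request.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"
ARK_RESOLVER = "https://n2t.net/"  # assumption: not named on the slide


def resolver_url(identifier: str) -> str:
    """Build the URL a resolver would redirect from (hypothetical helper)."""
    ident = identifier.strip()
    lowered = ident.lower()
    if lowered.startswith("doi:"):
        # Drop the "doi:" label; the resolver expects the bare DOI name.
        return DOI_RESOLVER + quote(ident[4:].strip(), safe="/")
    if lowered.startswith("ark:"):
        # ARKs keep their "ark:" label as part of the path.
        return ARK_RESOLVER + quote(ident, safe="/:")
    raise ValueError(f"unrecognized identifier scheme: {identifier!r}")


print(resolver_url("doi:10.1890/1540-9295-10.2.59"))
print(resolver_url("ark:/12025/654xz321/s3/f8.05v"))
```

The identifier itself stays stable; only the resolver's lookup table changes when the website holding the “object” moves.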
50.
From Flickr by Sandia Labs
C. Strasser
C. Strasser
World Bank Photo Collection From Flickr
51.
Identifiers for people
Identifier for people → Resolver → Person’s products