national law / jurisdiction-based
“sweat of the brow”
“level of skill”
how international data sharing efforts
attribution vs. citation
which one applies? which is best fit?
what’s the difference?
“credit where credit is due”
“triggered by making of a copy”
does it apply to facts?
how to attribute? (papers, ontologies, data)
“in a manner specified by ...”
credit where credit is due
entrenched scientific norm
we shouldn’t use the law to make it
hard to do the wrong thing ...
need for a legally accurate and
reducing or eliminating the need to make the
distinction of what’s protected
requires modular, standards based approach
... must promote legal predictability and certainty.
... must be easy to use and understand.
... must impose the lowest possible transaction costs on
set of principles (not license)
open, accessible, interoperable
create legal zones of certainty
calls for data providers to waive all rights
necessary for data extraction and re-use
requires provider place no additional
obligations (like share-alike) to limit
request behavior (like attribution) through
Creating norms for polar data
1. How to preserve the source information? How should the user or
copier preserve the provenance of the data set? What can be required
by PIC that is locally relevant and acceptable? DOIs? Something like a
notice inside the data? Ping to a URL at PIC? RDFa inside a section of
every database that is provided by PIC?
2. How to cite the data set? Many examples out there including
3. How to preserve quality standards? Perhaps we leave it up to the
4. How to note and release user contributions, mashups, repurposing?
Do we need release guidelines for contributions, annotations, etc. to
data sets? How to reward and track individual contributions to a
collective - trackback, user accounts, etc.? A simple “share alike”
Some draft norms of appropriate scientific
behavior when using PIC data
• Acknowledge the source of the data in accordance with the wishes of the provider,
and explicitly cite the data when they are used in formal scientific publication (http://
• Maintain a link to the original information in any derived products, ideally through a
persistent identifier, such as a Digital Object Identifier.
• Understand that the data are made available “as is” and that the accuracy of the data
or documentation is not guaranteed. The provider assumes no responsibility for
misuse or misinterpretation.
• Notify the data provider, in the manner they describe, of how you plan to use the
data. For projects integrally dependent on the data, consider requesting
collaboration and/or co-authorship from the provider.
• Share any derived products in the PIC.
• Agree to the IPY Data Policy.
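
The acknowledgement and linking norms above could be supported by a small metadata record that travels with each data set. What follows is only a minimal sketch under stated assumptions: the class `DataSetRecord`, its field names, the helper functions, and the DOIs (`10.0000/...` placeholders) are all hypothetical and not part of any PIC or IPY specification. It shows one way a derived product might keep a persistent link back to its sources and produce a citation string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSetRecord:
    """Hypothetical minimal metadata for a data set (field names are assumptions)."""
    title: str
    provider: str
    year: int
    doi: str              # persistent identifier; placeholder values used below
    sources: tuple = ()   # records this data set was derived from

    def link(self) -> str:
        # Persistent link built from the DOI via the public doi.org resolver.
        return f"https://doi.org/{self.doi}"

    def citation(self) -> str:
        # One possible citation layout; the provider may specify another.
        return f"{self.provider} ({self.year}). {self.title}. {self.link()}"


def derive(title, provider, year, doi, *sources):
    """Create a record for a derived product that keeps its source records."""
    return DataSetRecord(title, provider, year, doi, tuple(sources))


def provenance_links(record):
    """Return links to every original data set behind a record, depth-first."""
    links = []
    for src in record.sources:
        links.append(src.link())
        links.extend(provenance_links(src))
    return links


# Illustrative data only: names and DOIs are made up.
original = DataSetRecord("Example sea ice extent", "Example Provider",
                         2008, "10.0000/example")
mashup = derive("Example mashup", "Example User", 2009,
                "10.0000/derived", original)

print(original.citation())
print(provenance_links(mashup))
```

Keeping the source records inside the derived record means the link to the original survives repeated repurposing, which is the behavior the norm asks for without requiring a license term to enforce it.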
at best, we’re partially right.
at worst, we’re really wrong.
data without structure and annotation is a
data should flow in an open, public, and
support recombination and reconfiguration
into computer models, queryable by search
treated as public good
resist the temptation to treat
embrace the potential to treat instead
as a network resource