Knowledge Sharing in the Sciences - 8JPL - Presentation Transcript
knowledge sharing
in the sciences
kaitlin thaney
program manager, science commons
barcelona, spain - 1 july 2009
This presentation is licensed under the Creative Commons Attribution 3.0 license.
information sharing is at the root of
scholarship and science
the system of print publishing is a
system of sharing knowledge
then came the move to digital ...
the web revolutionized
search, commerce, collaboration
sharing became cheaper,
easier technically
costs of copying, moving, storing ...
down to nearly zero
ability to link between nodes of
information (dating back to 1980s)
yet ...
most of the useful knowledge
is inaccessible.
most of the useful knowledge is
in the wrong technology.
we don’t have enough people working
on the problem(s).
(0) the “research web”
(1) step 1: opening access
(2) step 2: access to research tools
(3) step 3: access to data
(4) step 4: open cyberinfrastructure
(5) what’s next?
make sharing easy, legal and scalable
integrated approach
building part of the infrastructure for
knowledge sharing
the “research web”
making the web work better for science
integrating disparate knowledge sources
make better use of existing information
in digital form
knowledge?
journal articles
data
ontologies
annotations
plasmids and cell lines
have capability to drastically increase
sharing at lower cost ...
... though, still roadblocks ...
silos of knowledge, walls of cost,
secrecy, lagging incentive system for
collaboration and sharing
step one
... it all starts with access to the
scientific content and data ...
scientific revolutions occur when a
sufficient body of data accumulates to
overthrow the dominant theories
we use to frame reality
a so-called paradigm shift
- from thomas kuhn
scholarship entrenched in idea of
transmitting knowledge via paper
mentality reflected even in the way we
describe “papers”
static, one-dimensional documents
in the digital world, “papers” can
become living, breathing works
no longer static PDF documents
linking to data sets, other relevant
papers, information, plasmids, genes
oldest scientific journal published in the
english-speaking world (1665)
need to change the way we think of
scholarly publishing,
of knowledge sharing
paradigm shift
begin thinking of “papers” as
containers of knowledge
“papers”
IGFBP-5 plays a role in the
regulation of cellular senescence
via a p53-dependent pathway
and in aging-associated vascular
diseases
“networked knowledge”
IGFBP-5 plays a role in the
regulation of cellular senescence
via a p53-dependent pathway
and in aging-associated vascular
diseases
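One way to picture the move from a static "paper" to "networked knowledge" is to break the sentence above into subject / relation / object statements that software can follow. The sketch below is only an illustration: the relation names (`plays_role_in`, `via`) and the helper functions are made up for this example and do not come from the talk or from any particular ontology.

```python
from collections import defaultdict

# The sentence from the slide, split into hypothetical (subject, relation, object)
# statements. The relation names are illustrative assumptions.
triples = [
    ("IGFBP-5", "plays_role_in", "regulation of cellular senescence"),
    ("regulation of cellular senescence", "via", "p53-dependent pathway"),
    ("IGFBP-5", "plays_role_in", "aging-associated vascular diseases"),
]


def build_index(statements):
    """Map each subject to the (relation, object) pairs it links to."""
    index = defaultdict(list)
    for subject, relation, obj in statements:
        index[subject].append((relation, obj))
    return index


def reachable(index, start):
    """Return every concept that can be reached by following links from start."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _relation, obj in index.get(node, []):
            if obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


index = build_index(triples)
for concept in sorted(reachable(index, "IGFBP-5")):
    print(concept)
```

Once statements like these are linked across many papers and databases, a query that starts at one gene can reach pathways and diseases that no single document mentions together.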
content needs to be legally and
technically accessible
we’ll start with legal ...
thinking of “papers” more as containers of
knowledge
copyright locks that container
traditional transfer of copyright agreement
Open Access (OA)
“ By open access to the literature, we mean its free
availability on the public internet, permitting users to
read, download, copy, distribute, print, search, or link to
the full texts of the articles, crawl them for indexing,
pass them as data to software, or use them for any
other lawful purpose, without financial, legal or
technical barriers other than those inseparable from
gaining access to the internet itself.”
Image from the Public Library of Science, licensed to the public, under
CC-BY-3.0
“The only constraint on reproduction and distribution,
and the only role for copyright in this domain, should
be to give authors control over the integrity of their
work and the right to be properly acknowledged and
cited.”
http://creativecommons.org/licenses/
legal
implementation
step two
access to research tools from
funded research
examples:
lab mice, cell lines,
DNA, stem cells
... the physical
materials
office supplies for
science
ideally ...
contact author, obtain material,
recreate experiment
build on the existing work, publish
and repeat ...
the reality ...
materials difficult to find, requests hard to
fulfill, labs lack resources
reagents and assays often re-invented
or reverse engineered
locked in contracts, bureaucracy,
deliberate withholding, “club mentality”
no office superstores for
science
no internet marketplaces
for science
another way to think of it ...
solves the access problem via
contract
UBMTA, SLA, SCMTA
(standardized material transfer agreements, or MTAs)
standard icons, CC
methodology, metadata
step three
data and the public domain
legal issues:
“it’s complicated”
copyright and databases
what’s protected? is it legal?
facts are free
to what extent is there creative expression?
database protections based on jurisdiction
sui generis,
“sweat of the brow”
Crown copyright
the list goes on ....
social issues:
protection instinct / culture of control
PD relinquishes much of this control, even
control in the service of freedom
“my data”, interpretation issues
fear, uncertainty, doubt (FUD)
issue of license proliferation
whatever you do to the least of the
databases, you do to the integrated system
(the most restrictive wins)
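The "most restrictive wins" point can be sketched in a few lines: when databases are integrated, the combined resource carries the union of every obligation attached to any one of them. This is a simplified model under stated assumptions. The source names and obligation labels are hypothetical, and real licenses can conflict outright, which this sketch does not model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Source:
    """A database plus the obligations its terms place on downstream users."""
    name: str
    obligations: frozenset = field(default_factory=frozenset)


def integrated_obligations(sources):
    """Obligations on the merged system: the union across all inputs."""
    combined = set()
    for source in sources:
        combined |= source.obligations
    return frozenset(combined)


# Hypothetical inputs: one public domain database and two with added terms.
sources = [
    Source("db_a"),  # public domain, no obligations
    Source("db_b", frozenset({"attribution"})),
    Source("db_c", frozenset({"attribution", "share-alike", "non-commercial"})),
]

print(sorted(integrated_obligations(sources)))
# -> ['attribution', 'non-commercial', 'share-alike']
```

Two of the three inputs are nearly unencumbered, yet the merged system inherits every term of the most restrictive one. Only when every input is in the public domain does the result stay free of obligations.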
need for a legally accurate and
simple solution
reducing or eliminating the need to make the
distinction of what’s protected
requires modular, standards based approach
to licensing
our solution ...
reconstruction of the public domain
create legal zones of certainty for data
attribution through accompanying norms
3.1 The protocol must promote legal predictability
and certainty.
3.2 The protocol must be easy to use and understand.
3.3 The protocol must impose the lowest possible
transaction costs on users.
For the full text:
http://sciencecommons.org/projects/publishing/open-access-data-protocol/
CC Zero waiver + SC norms
waive rights → public domain
attribution / citation through
community norms, not a contract
a protocol, not a license
calls for data providers to waive all rights
necessary for data extraction and re-use
requires provider place no additional
obligations (like share-alike) to limit
downstream use
request behavior (like attribution) through
norms and terms of use
public domain ≠ license; it cannot be made
“more free” - only less free
PD = the original commons
at least make metadata open,
if one can’t make data itself open
early adopters,
committing to make their data open
using CC0
(1) Tranche - free, open source
(2) Personal Genome Project
(3) Digg, Flickr, WhiteHouse.gov
(4) EMBL SIDER, TDI Kernel
technical considerations:
persistent URLs
open, stable namespaces
standards, standards, standards
facilitate integration, interoperability
and more ...
step four
invest in open cyberinfrastructure
data without structure and annotation is a
lost opportunity.
data should flow in an open, public, and
extensible infrastructure
support recombination and reconfiguration
into computer models, queryable by search
engine
treated as public good
change requires a new legal infrastructure
to encourage collaboration
traits of legal protocols:
legally accurate
simple for scientists
low transaction costs
facilitate interoperability
business and user friendly
what can you do?
lead by example ...
design for maximum reuse
ensure the freedom to integrate
leverage existing open infrastructure
allows for snap together integration of
the tools, data, research literature
what’s needed?
common standards, right software
accessible data and content
open infrastructure
build for network effects
thank you
kaitlin@creativecommons.org
sciencecommons.org
neurocommons.org