brian.hole@ubiquitypress.com
www.ubiquitypress.com / @ubiquitypress
Brian Hole
DPC Workshop, York, 5 July 2013
From Open Access to Open Data
The Social Contract of Science
• Validation
• Dissemination
• Further development
Scientific Malpractice
• Publishers
• Researchers
• Libraries, repositories…
• All outputs
Why data journals?
Amsterdam manifesto:
4. A data citation in a publication should resemble a bibliographic citation and be located in the publication’s reference list.
• Data can (and should) be cited using DataCite DOIs in articles, but this is not enough.
• Researchers understand the value of papers
• University departments and the REF understand papers
• Researchers know where to put paper refs, no need for extra guidelines
• Publishers routinely strip out anything else
• Familiar impact metrics can be collected
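As an illustration of a data citation that resembles a bibliographic citation, the sketch below renders a dataset reference in a Creator (Year). Title. Publisher. Identifier pattern, which is an assumption about layout rather than a rule stated on the slide. The class name, field names, sample dataset and DOI are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetRecord:
    """Minimal metadata needed to cite a dataset (field names are illustrative)."""
    creators: List[str]
    year: int
    title: str
    publisher: str  # typically the repository holding the data
    doi: str


def format_data_citation(record: DatasetRecord) -> str:
    """Render a dataset reference in the same shape as a bibliographic one."""
    authors = ", ".join(record.creators)
    return (
        f"{authors} ({record.year}). {record.title}. "
        f"{record.publisher}. https://doi.org/{record.doi}"
    )


# Hypothetical dataset; the DOI is a placeholder, not a real identifier.
example = DatasetRecord(
    creators=["Doe, J.", "Roe, R."],
    year=2013,
    title="Example survey dataset",
    publisher="Example Data Repository",
    doi="10.5555/example-dataset",
)
citation = format_data_citation(example)
print(citation)
```

A string of this shape can sit in a reference list next to ordinary paper references, which is the placement the manifesto point asks for.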
What is a data paper?
A data paper is not…
• … a research paper. A data paper only describes a dataset. But it will reference research papers that are based on the data.
• … simply replication of the information in a data repository.
A data paper…
• … describes the methodology with which a dataset was created.
• … describes the dataset itself.
• … details the reuse potential of the data.
• … is often authored by a data scientist.
• … is citable, enabling reuse to be tracked.
Peer review
1. The paper contents
a. The methods section of the paper must provide sufficient detail that a reader can understand how the resource was created.
b. The resource must be correctly described.
c. The reuse section must provide concrete and useful suggestions for reuse of the resource.
2. The deposited resource
a. The repository must be suitable for the resource and have a sustainability model.
b. Open license permits unrestricted access (e.g. CC0).
c. A version in an open, non-proprietary format.
d. Labeled in such a way that a 3rd party can make sense of it.
e. Must be actionable.
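The peer review criteria could be tracked as a simple checklist. The sketch below is one possible encoding, not a tool described in the talk: the short keys, the paraphrased criterion texts and the sample review are assumptions made for illustration.

```python
from typing import Dict, List

# Criteria paraphrased from the peer review slide; the short keys are
# invented for this sketch.
CRITERIA: Dict[str, str] = {
    "1a": "Methods give enough detail to understand how the resource was created",
    "1b": "Resource is correctly described",
    "1c": "Reuse section gives concrete, useful suggestions",
    "2a": "Repository is suitable and has a sustainability model",
    "2b": "Open license permits unrestricted access (e.g. CC0)",
    "2c": "A version exists in an open, non-proprietary format",
    "2d": "Labeled so that a third party can make sense of it",
    "2e": "Resource is actionable",
}


def unmet_criteria(review: Dict[str, bool]) -> List[str]:
    """Return the keys of criteria that are missing or marked False."""
    return [key for key in CRITERIA if not review.get(key, False)]


# Hypothetical review in which the reviewer found two problems.
review = {key: True for key in CRITERIA}
review["1c"] = False
review["2c"] = False

failed = unmet_criteria(review)
for key in failed:
    print(f"{key}: {CRITERIA[key]}")
```

Treating an unanswered criterion as unmet keeps an incomplete review from passing by accident.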
Important principles
• Data journals need to be built within the community, and to adapt to its requirements
• Community ownership and trust is important
• Full transparency in processes and finances
• Sustainability
• Low barriers essential
• Zero to low fees
• Quick online authoring
• Repository integration
PRIME: Use Case #1
• A UCL Researcher deposits data in an external subject repository.
• The subject repository sends the metadata and DOI of the data to the UCL institutional repository so that it has a record of the output.
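The hand-off in this use case can be sketched as a small metadata message passed from one repository to the other. The slide does not describe the actual PRIME protocol, so everything here is an assumption: the JSON field names, the in-memory `InstitutionalRepository` class and the sample values are hypothetical, and no network call is made.

```python
import json
from typing import Dict


class InstitutionalRepository:
    """In-memory stand-in for an institutional repository's record store."""

    def __init__(self) -> None:
        self.records: Dict[str, dict] = {}

    def receive(self, payload: str) -> None:
        """Store a metadata record keyed by DOI; the data itself stays external."""
        record = json.loads(payload)
        self.records[record["doi"]] = record


def build_notification(doi: str, title: str, creator: str, repository: str) -> str:
    """Serialise the metadata the subject repository would pass on."""
    return json.dumps(
        {"doi": doi, "title": title, "creator": creator, "held_at": repository}
    )


# Hypothetical deposit; the DOI is a placeholder, not a real identifier.
institutional = InstitutionalRepository()
message = build_notification(
    doi="10.5555/example-dataset",
    title="Example excavation dataset",
    creator="Doe, J.",
    repository="Example Subject Repository",
)
institutional.receive(message)
print(institutional.records["10.5555/example-dataset"]["held_at"])
```

Only the metadata and DOI cross over, so the institution gains a record of the output while the dataset remains in the subject repository.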
Text and data mining
[the benefits of text mining include]: “increased researcher efficiency; unlocking hidden information and developing new knowledge; exploring new horizons; improved research and evidence base; and improving the search process and quality. Broader economic and societal benefits include cost savings and productivity gains, innovative new service development, new business models and new medical treatments.”
JISC
“The downstream value of high quality, high throughput chemical information extracted from the literature can be measured against conventional abstraction services… with a combined annual turnover of perhaps $500-1,000 million dollars. We believe our tools are capable of building the next and better generation of services.”
Peter Murray-Rust
“Licences for Europe”
Working Group 4: Text and Data Mining
• Focus was to create new licenses to enable TDM
• I.e. researcher would need one license from each publisher. Much TDM work involves hundreds of publishers, can take weeks just for one.
• Focus pre-determined from start: to come up with proposals on licenses only. Discussion of exceptions allowed but not to be part of recommendations.
• Unbalanced setup: large corporate publishers, technology sector poorly represented.
• UP walked out with civil society groups. Not prepared to endorse licenses as acceptable.
• Tell your publisher or association that this is important to you.
• Workshop at the BL to inform policy makers in late Sept 2013.