Genome sharing projects around the world – Open access is not enough
Fiona Nielsen
Goettingen, June 8 2016
Slides will be made available online. Tweets welcome: #ELPUB2016
Open access is not enough
Example: 600,000 open access articles
Genetic researchers search for data to validate their hypotheses and discover new relations between genetics and disease
We studied this problem in genomics
We interviewed and surveyed genetic researchers
T. A. van Schaik et al. (2014). The need to redefine genomic data sharing: a focus on data accessibility. Applied & Translational Genomics. doi:10.1016/j.atg.2014.09.013
We studied the problem through qualitative interviews followed by a survey of researchers in human genetics
Open Access data is accessed more frequently
But Open Access is not enough
Researchers spend months finding and accessing genomic data, and often choose not to access data at all
• Genetic researchers know only a handful of data sources (average 4, max 10)
• At our last Repositive data census we counted a total of 163 data sources
The visibility gap
• Read more in our recent PLoS Biology paper:
http://journals.plos.org/plosbiology/article?id=10.1371%2Fjournal.pbio.1002418
10-20x more data is available – and Open Access!
Can download the data straight away or after logging in.
Need to apply for access to the data.
Has both Open and Restricted access data within one repository.
Lots of open access data in genomics
• Make data more visible, discoverable
• Increase data reuse
• Better use of funding
• More impact for biomedical research and drug discovery → faster impact for patients
How can we close the gap?
Without adding to the confusion
Repositive has launched a portal (in beta)
Discover new data sources
• Indexing metadata, i.e. data descriptions
• Easy search
• Simple access
• Free platform
• First ~42,000 genomic data sets indexed
http://repositive.io
Repositive increases data discoverability
Make your data visible
• Users can contribute descriptions to improve visibility for their research
• Researchers want visibility because they want credit
http://repositive.io
• Papers with Open Access data receive more citations
• Piwowar HA and Vision TJ (2013) Data reuse and the open data citation advantage. PeerJ, 1: e175.
http://dx.doi.org/10.7717/peerj.175
Does discoverability impact data reuse?
• Does discoverable open data increase data reuse?
• We are running an experiment with GigaScience to test whether increasing the discoverability of their Open Access genomics data increases data access
Thank you!

Editor's Notes

  • #3 Further confounded by the data being highly fragmented: siloed in repositories and institutions around the world.
  • #12 http://personalgenomes.org http://OpenSNP.org http://www.gigasciencejournal.com/ http://gigadb.org https://manuelcorpas.com
  • #14 There are many public repositories, but it can be hugely confusing to know where to look for the right kind of data.
  • #18 Our mission is to speed up research and diagnostics for genetic diseases by enabling efficient and ethical access to genomic research data