Introductory talk for an ANDS workshop on Institutional Repositories and data. The talk situates the topic within the field of scholarly communication, then contrasts the relative technical simplicity of running publication repositories with the complexities that accompany a shift to data. The most-retweeted slide views the response of repository managers to data through the lens of Elisabeth Kübler-Ross's stages of grief.
2. Context
• Scholarly Communication Functions
  • Registration
  • Certification
  • Awareness
  • Archiving
  • [Rewarding]
Roosendaal, H., and Geurts, P. 1997. “Forces and functions in scientific communication: an analysis of their interplay”. Cooperative Research Information Systems in Physics, August 31–September 4, 1997, Oldenburg, Germany, via Van de Sompel, et al., “Rethinking Scholarly Communication: Building the System that Scholars Deserve”. D-Lib Magazine, September 2004.
http://www.dlib.org/dlib/september04/vandesompel/09vandesompel.htm
16/11/2015 CC-BY @atreloar 2
3. Purpose
• “a university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members”
Lynch, C., “Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age”, portal: Libraries and the Academy, Volume 3, Number 2, April 2003, pp. 327-336
4. Affordances
• Deposit
  • can support Registration
• Manage
  • may contribute to Archiving
• Find
  • may assist with Awareness
• Access
  • may assist with Certification and Rewarding
7. But data…
• Initially supplementary
• Later 1st class
• JASO
• “Institutional repositories can support new practices of scholarship that emphasize data as an integral part of the record and discourse of scholarship.” Lynch, p. 332
8. Data
• Deposit
  • uploads time out
• Manage
  • objects that are large in size or in number
  • complex discipline-specific metadata
• Discover
  • harder because it depends on metadata (there is no full text to index)
• Access
  • downloads time out or don’t make sense
  • hard to support in-situ access for re-use
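One common way to keep a large deposit from timing out is to send it as a series of small pieces rather than one long request. The sketch below illustrates only the splitting step, under stated assumptions: the function name `iter_chunks`, the 8 MiB chunk size, and the use of SHA-256 checksums are all hypothetical choices for illustration and are not drawn from the talk or from any particular repository platform.

```python
import hashlib
import io
from typing import BinaryIO, Iterator, Tuple

# Hypothetical chunk size; a real service would tune this to its own timeouts.
CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB


def iter_chunks(
    stream: BinaryIO, chunk_size: int = CHUNK_SIZE
) -> Iterator[Tuple[int, bytes, str]]:
    """Yield (offset, chunk, sha256 hex digest) for each piece of the stream.

    Each piece can be sent in its own short request, and the offset lets an
    interrupted deposit resume from the last piece that was acknowledged.
    """
    offset = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield offset, chunk, hashlib.sha256(chunk).hexdigest()
        offset += len(chunk)


if __name__ == "__main__":
    # Tiny in-memory example with an artificially small chunk size.
    data = io.BytesIO(b"a" * 10)
    for offset, chunk, digest in iter_chunks(data, chunk_size=4):
        print(offset, len(chunk), digest[:8])
```

How the pieces are transmitted and reassembled depends entirely on the repository software in use, so that part is deliberately left out.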
9. Stages
• Denial
  • “Our users only want to deposit publications”
• Anger
  • “Data aren’t a real scholarly output!”
• Bargaining
  • “I suppose we can take *some* data”
• Depression
  • “What are we going to do with this 36 GB dataset?”
• Acceptance
  • “Let’s work out the best approach”
10. Solution patterns
• Restrict
  • size and number of objects
• Refer
  • store elsewhere and point
• Adapt
  • re-engineer underlying software to provide more robust storage layer
  • “Fedora 4.0 supports large files … uploading content to the repository and pulling it down via the REST API has been successfully tested with files up to 1-TB”
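The three patterns above can be read as a simple decision rule applied at deposit time. The following is a minimal sketch of that reading, not a description of any repository's actual policy: the `Deposit` class, the `choose_pattern` function, and the 2 GiB and 1000-file limits are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class Deposit:
    size_bytes: int
    file_count: int


def choose_pattern(
    deposit: Deposit,
    max_bytes: int = 2 * GIB,        # hypothetical size limit (Restrict)
    max_files: int = 1000,           # hypothetical object-count limit (Restrict)
    external_store: bool = True,     # is there somewhere else to point to? (Refer)
    robust_storage: bool = False,    # has the storage layer been re-engineered? (Adapt)
) -> str:
    """Return 'accept', 'refer', or 'reject' for a proposed deposit."""
    if robust_storage:
        # Adapt: the repository itself can now hold large deposits.
        return "accept"
    if deposit.size_bytes <= max_bytes and deposit.file_count <= max_files:
        # Within the Restrict limits, so the repository takes it directly.
        return "accept"
    if external_store:
        # Refer: store the data elsewhere and keep a pointer in the repository.
        return "refer"
    return "reject"


if __name__ == "__main__":
    # A 36 GiB single-file dataset, comparable to the one quoted under "Stages".
    big = Deposit(size_bytes=36 * GIB, file_count=1)
    print(choose_pattern(big))
    print(choose_pattern(big, robust_storage=True))
```

In practice the limits would be set by local infrastructure and policy, and a referred deposit would still need a metadata record in the repository pointing to where the data lives.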
11. Now
• IRs still mostly publications
  • but becoming more integrated with broader research information management ecosystem
• Data often stored elsewhere
  • on-campus stores
  • cloud offerings
• But
  • who is managing the data?
  • is this what we want?
  • does anybody (other than us) care?
12. Future
• “NHMRC encourages researchers to disseminate and share their research data through publicly accessible databases or repositories”
• PLOS recommends “appropriate public [discipline-specific or generalist] repositories” for associated data
• “ACS is facilitating automated deposition of Supporting Information (SI) files to a secure hosting environment [figshare], free from any restrictions on size and format”
• Mendeley Data: “The data submitted by researchers on this platform will be included in the long-term archive of DANS”
• Elsevier links to PANGAEA
13. Data role for IRs
• “management and dissemination of digital materials created by the institution” (Lynch)
• Parallel offering to offset ecosystem fluidity
• Owned by an organisation that is likely to last and that should care about its outputs
• But is this enough?