To architect or engineer? Lessons from DataPool on building RDM repositories

Steve Hitchcock, JISC DataPool Project
9th DCC Research Data Management Forum (RDMF9)
Cambridge, 14-15 November 2012

Why architecting?

http://datapool.soton.ac.uk

DataPool architecture (SharePoint)

Peter Hancock, iSolutions, University of Southampton

DataPool
Building Capacity, Developing Skills, Supporting Researchers
Progress: October 2011 to March 2013

Strands: Policy and guidance, Training, Data repository

• Informed by IDMB
• Surveys of data practices among academics
• Support for developing/working with Data Management Plans
• Doctoral Training Centres
• Graduate & staff training services
• Case studies: Imaging, 3D; Geodata; ++
• University Strategic Research Groups
• SharePoint + EPrints 3.3
• EPrints data apps
• 3-layer metadata
• Capture/share with external sources, e.g. SWORD-ARM
• Large-scale data storage
• Assign DataCite DOIs

Byatt, D. (D.R.Byatt@soton.ac.uk)
Hitchcock, S. (sh94r@ecs.soton.ac.uk)
White, W. (whw@soton.ac.uk)
JISCMRD Progress Workshop, 24-25 October 2012, Nottingham
http://datapool.soton.ac.uk/

Data repository platforms
From a data repository perspective

Architected
• DataFlow
• MS SharePoint
• EPrints
Engineered

Other platforms available
• DSpace, CKAN, data.bris, etc.

Implementations of DataFlow Model

DataFlow: two-stage architecture
• DataStage, via SWORD, to a repository/archive (curated deposit)
• Addresses Dropbox effect for data producers
• Two data motivations for creators: want to (practice), need to (policy)

Repository/archive implementations
• DataBank
• EPrints
• DSpace (QMUL)
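
The handoff from DataStage to a repository/archive over SWORD can be sketched as below. This is a minimal sketch, assuming a SWORD v2 binary deposit sent as an HTTP POST to a collection; the helper name, the SimpleZip packaging choice, the hex form of the checksum, and the file name are illustrative assumptions, not taken from DataFlow itself.

```python
import hashlib


def build_sword_deposit_headers(package: bytes, filename: str,
                                in_progress: bool = False) -> dict:
    """Return HTTP headers for a SWORD v2 binary deposit.

    Hypothetical helper: the slides do not show DataStage's client code.
    """
    return {
        "Content-Type": "application/zip",
        "Content-Disposition": f"attachment; filename={filename}",
        # SimpleZip is one of the packaging formats named in the SWORD v2 profile.
        "Packaging": "http://purl.org/net/sword/package/SimpleZip",
        # Checksum so the server can verify the package arrived intact
        # (hex digest assumed here).
        "Content-MD5": hashlib.md5(package).hexdigest(),
        # "true" signals that the deposit is not yet complete.
        "In-Progress": "true" if in_progress else "false",
    }


headers = build_sword_deposit_headers(b"example package bytes", "dataset.zip")
for name, value in headers.items():
    print(f"{name}: {value}")
```

The two-stage idea shows up in the In-Progress flag: a producer can keep working on a package and mark it complete only when it is ready for curated deposit.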

DataStage: Upload file

DataStage was developed at the University of Oxford
DataStage screenshots courtesy JISC Kaptur project http://www.vads.ac.uk/kaptur/
Thanks to Carlos Silva

DataStage: Submit as data package

3-layer metadata model

Takeda et al., 6th IDCC, Dec. 2010
available from http://eprints.soton.ac.uk/169533/
JISC Institutional Data Management Blueprint (IDMB)
Project, University of Southampton

SharePoint user interface 1: project

SharePoint user interface 2: data

+ fields for format, keywords

Prof. Simon Cox (engng) on SharePoint
“The concept that formed part of SP thinking (at
Southampton) from the very inception … that ability to use
SP as a way to manage or at least collaborate as part of a 5-10
year programme of work.

“The other side is what we’re doing with intellectual property
and what we’re offering for students. I chair a group design
project, and every single student has said ‘I just do it all on
Dropbox’. The same is happening with our research. So I
think we have at least to provide a level of service and a level
of integration between our research experience and our
teaching experience. Would these people go to Southampton
rather than University of Nowhereshire on the Web or the
University of Google or the University of Dropbox? These are
deep questions for us.”

ePrintsSoton: Item type: Dataset

Currently EPrints v3.2, customised to ePrintsSoton
Dataset Item Type from 2007

ePrintsSoton: start to deposit Dataset

EPrints data apps

Apps available from EPrints Bazaar http://bazaar.eprints.org/
Apps work with EPrints v3.3 or later

EPrints (test repo) DataShare enabled

App by Tim Brody, EPrints + DataPool

EPrints (test repo) Data Core enabled

Data Core “adds a few fields and doesn’t remove any fields
from the eprint object. It creates an alternate workflow for
datasets which is much smaller than a normal eprints workflow.”
App by Patrick McSweeney

EPrints (test repo) Data Core enabled 2

App by Patrick McSweeney

Essex Research Data metadata profile aims
“Using metadata schema relevant to UK HE and
research data (DataCite, INSPIRE and DDI
2.1), we have developed a basic metadata profile
suited to describing research data generated at
institutions with disciplinary diversity. The
inclusion of fields like Funder and Grant number
will ensure future harvesting and linking
opportunities (like RCUK Research Outcome
Systems). The metadata also suits the EPSRC data
registry requirements.”
http://researchdataessex.posterous.com/repository-beta-metadata-profile-released

EPrints: Essex Research Data repository

Screenshots courtesy JISC Research Data @Essex project
Thanks to Louise Corti, Tom Ensom, Alexis Wolton

EPrints v3.3.10, customised to Essex Research Data
http://researchdata.essex.ac.uk/

Essex Research Data: observations

• Assumes data deposit, so no selection of EPrints Item Type
• No selection of e.g. Creative Commons licence, just copyright
• Requirement for Time Period suggests particular type of data expected
• Fields for Geographic info (not required) suggests particular type of data expected
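
The required versus optional distinction noted above can be sketched as a small record check. This is a minimal sketch, not the Essex schema: the class, the field names, and the choice to treat the title as required are hypothetical. Only the required Time Period, the optional Geographic info, and the presence of Funder, Grant number and copyright fields come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Illustrative record; field names are hypothetical, not the Essex profile."""
    title: str                             # assumed required
    time_period: str                       # required, per the observations
    funder: str = ""
    grant_number: str = ""
    copyright_holder: str = ""             # copyright only, no licence selection
    geographic_info: Optional[str] = None  # present in the form but not required


def missing_required(record: DatasetRecord) -> list:
    """Return the names of required fields that are empty or whitespace."""
    required = ["title", "time_period"]
    return [name for name in required if not getattr(record, name).strip()]


record = DatasetRecord(title="Survey of data practices", time_period="")
print(missing_required(record))
```

A profile that requires Time Period rejects records like this one, which is why the requirement hints at the kind of data the repository expects.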

Architects and surroundings
“On one plot aggressively crystalline blocks by Rogers Stirk Harbour
are going up, their diamond shapes having nothing in particular to do
with anything around them. On another Foster and Partners have
designed a series of curving, stepped, blobby things, of the kind
usually designed to take advantage of views on the Med or the Gulf,
but are here facing each other like rows of daleks. Again, it shows
little interest in anything around it.”
R. Moore, Utopia on Thames, Observer, 11 Nov 2012
Image: Nine Elms, London (usembassylondon)

Open access repository interoperability

Confederation of Open Access Repositories (COAR)
Dublin Core, CRIS-CERIF
OpenAIRE, RepositoryNet+, Rioxx
RCUK: Research Outcomes System, Gateway to
Research, REF

Is there the same current debate about
interoperability of data repositories?

COAR on OA interoperability
Specific initiatives designed to support interoperability:
AuthorClaim, CRIS-OAR, DataCite, DINI Certificate for
Document and Publication Services, DOI, DRIVER,
Handle System, KE Usage Statistics Guidelines, OAI-ORE,
OAI-PMH, OA-Statistik, OA Repository Junction,
OpenAIRE, ORCID, PersID, PIRUS, SURE, SWORD, and
UK RepositoryNet+.
COAR, The Current State of Open Access Repository
Interoperability (2012), 26 Oct. 2012 v.02

MT @gknight2000 (Gareth Knight) Lincoln's CKan
instance impressive bit.ly/QQd1au Doesn't appear to
support OAIPMH or preservation function #jiscmrd
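
For readers unfamiliar with what OAI-PMH support involves, a harvesting request and response can be sketched as below. This is a minimal sketch: the base URL is a placeholder, the helper names are hypothetical, and the sample response is hand-written rather than taken from any repository named in these slides. The `ListRecords` verb and `oai_dc` metadata prefix are defined by the protocol.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def oai_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL, e.g. verb=ListRecords&metadataPrefix=oai_dc."""
    params = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(params)}"


def record_titles(response_xml: str) -> list:
    """Extract every Dublin Core title from an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [element.text for element in root.iter(f"{{{DC_NS}}}title")]


# Hand-written sample response, trimmed to the elements used here.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Placeholder endpoint; substitute a real repository's OAI-PMH base URL.
print(oai_request_url("https://repository.example.org/oai",
                      "ListRecords", metadataPrefix="oai_dc"))
print(record_titles(SAMPLE))
```

A platform without an endpoint of this kind cannot be harvested by aggregators that rely on the protocol, which is the gap the tweet points to.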

What next for DataPool repositories?
SharePoint
• User test and feedback sessions scheduled, will
direct further development
EPrints apps (1 or 2 of the following, initially)
• Develop app based on Essex data repository,
providing other repositories with a 1-click install of
this profile
• Build interoperability (I/O) apps:
e.g. Data Management Plans, Dropbox
• Automate record capture for producers of large-scale,
regular data outputs
