Criteria and evaluation of research data repository platforms @ the University of Pretoria, South Africa
Presented by Dr Heila Pienaar
Library Services
University of Pretoria
Project team
UP IT: Karin, Yzelle, Herman
UP Library: Isak, Johann, Heila
Agenda
• Project Scope & project team
• Research data lifecycle
• E-Research Framework
• Product Investigation
• Criteria & evaluation
• Recommendations
• Next Steps
• Documents produced
Project Scope
The scope of the project was to evaluate products (commercial and open source) that could be used as a Research Data Repository Platform as part of a total Research Data Management (RDM) solution at UP.
A total RDM solution includes all phases of the research data life cycle, but for the repository solution the focus was on identifying a potential solution for the “Dissemination” phase of the research data life cycle.
RDM Repository Project Team
Business Sponsor – Prof Stephanie Burton (VP: Research)
ITS Sponsor – Andre Kleynhans (Deputy Director: ITS)
Project Team members:
ITS Project Manager and Business Analyst – Karin Meyer
ITS Infrastructure Architect – Dr Yzelle Roets
ITS eResearch Support Manager – Herman Jacobs
Library Services: Senior IT Consultant – Isak van der Walt
Library Services: Assistant Director: RDM – Johann van Wyk
Library Services: Deputy Director: Strategic Innovation – Dr Heila Pienaar
Research Data Life Cycle
[Diagram: data flow and processes within the research data life cycle. Phases labelled in the diagram: Creating Data, Processing Data, Analysing Data, Preserving Data, Giving Access to Data, Re-using Data.]
Product Investigation Methodology
Finalisation of product evaluation criteria
• Consulted with various UP stakeholders to obtain their input (Library and ITS staff)
• Consulted with external stakeholders at the NEDICC workshop held at the CSIR
• Consulted with peer Universities, and
• Used selection criteria from other institutions (e.g. Leeds University, Texas Digital Library and the RDA RPRD IG Matrix, http://tinyurl.com/RPRD-matrix) as a basis and adapted them to UP-specific requirements.
Product Shortlisting
Products were shortlisted based on the following:
• Product scan of products being used internationally, and
• Most commonly used products at universities similar to UP (size and research activity).
Product Evaluation
• UP’s formal Request For Information (RFI) process was followed
• Product evaluation criteria list was compiled and sent to shortlisted vendors together with standard RFI documentation
• The requested information was received from the vendors and prepared for scoring, and
• Products were scored and evaluated.
Evaluation Criteria
• Functional / Business criteria: Deposit and
Upload; Re-Usability; Identity and Access
Management; Reporting; Discovery; Preservation
• Non-functional: Repository Architecture; Data
Management; Data Governance
• Technical aspects: Back-end Management;
Integration; Infrastructure
• Vendor specific: Support, Training, Usage of
Product
• Performance requirements
• Integration requirements
Unique ID Requirement Description Priority
DU-1 Offer customisable metadata schema as per research area or discipline (including mandatory fields). H
DU-2 Offer the indexing of metadata. H
DU-3 Offer sufficient support for geospatial and journal article metadata. Support association of single or multiple files with one metadata record. H
DU-4 Upload and store metadata at a data object level, where a data object is a folder that contains one or more files. M
DU-5 Support multiple file types and formats of data, e.g. MS Excel 2007, MySQL database, raw data file from a Campbell CR10 data logger, any multimedia, etc. H
DU-6 The system should have a simple process for uploading large (multi-TB) data sets, potentially consisting of thousands of files. Must have the ability to upload data sets of widely varying size (e.g. 2 MB, 2 GB, 1 TB). H
DU-7 Support controlled lists against some metadata fields, either held locally or drawn from an external source e.g. Subject vocabularies. H
DU-8 Support customisation of out-of-the-box help text and provide context-sensitive feedback for the depositor, e.g. highlight missing metadata fields, file upload failure alert. M
DU-9 Accommodate a workflow where data needs to be destroyed, with an approval process and audit trail. L
DU-10 Researchers must be able to submit data to repository themselves. H
DU-11 Process of submitting data to a repository from other systems/instruments. H
DU-12 Ability to batch upload data into a repository. H
DU-13 Third party must be able to upload dataset on behalf of researcher. H
DU-14 Support generation / labelling of persistent unique identifiers for datasets including DOIs. H
DU-15 Ability to support the submission of data at any research stage (i.e. Initial Data, Working Data, Final Data Stages) to the repository. M
DU-16 Explain how user interface customisation is achieved. H
DU-17 Out-of-the-box user interface intuitive (easy to use) to users. M
DU-18 Out-of-the-box user interface meets accessibility requirements, e.g. W3C WCAG 1. H
DU-19 Assignment of Intellectual Property (IP) rights and multiple content licensing options is possible, such as copyright and Creative Commons (CC), with terms and conditions exposed clearly to human and machine re-users. H
Table 1: Deposit and Upload functional criteria
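The slides give each criterion a priority (H, M or L) and later report a percentage fit per product, but they do not say how priorities were weighted or how the fit was calculated. As a minimal sketch only, assuming hypothetical weights (H=3, M=2, L=1) and a hypothetical 0 to 2 score per criterion, a priority-weighted fit and a list of zero-scored gaps could be computed as follows. The names `Criterion`, `percent_fit` and `gaps`, and the example scores, are illustrative and not taken from the UP evaluation.

```python
from dataclasses import dataclass

# Hypothetical weights: the slides list priorities (H/M/L) but not numeric weights.
PRIORITY_WEIGHTS = {"H": 3, "M": 2, "L": 1}
# Assumed scale: 0 = not met, 1 = partially met, 2 = fully met.
MAX_SCORE = 2


@dataclass(frozen=True)
class Criterion:
    uid: str       # e.g. "DU-1"
    priority: str  # "H", "M" or "L"


def percent_fit(criteria, scores):
    """Return the priority-weighted percentage fit of one product.

    `scores` maps a criterion ID to a score between 0 and MAX_SCORE.
    A criterion without a score counts as 0.
    """
    achieved = sum(PRIORITY_WEIGHTS[c.priority] * scores.get(c.uid, 0) for c in criteria)
    possible = sum(PRIORITY_WEIGHTS[c.priority] * MAX_SCORE for c in criteria)
    return 100.0 * achieved / possible


def gaps(criteria, scores):
    """List the IDs of criteria on which the product scored 0."""
    return [c.uid for c in criteria if scores.get(c.uid, 0) == 0]


# Three rows from Table 1 with made-up scores for an imaginary product.
criteria = [Criterion("DU-1", "H"), Criterion("DU-4", "M"), Criterion("DU-9", "L")]
example_scores = {"DU-1": 2, "DU-4": 1, "DU-9": 0}

print(round(percent_fit(criteria, example_scores), 1))  # 66.7
print(gaps(criteria, example_scores))                   # ['DU-9']
```

Weighting by priority means a missed H criterion lowers the fit more than a missed L criterion. Any real scoring scheme would need the weights and scale agreed by the evaluation team.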
Shortlisted Products & RFI Feedback
Product | Vendor / Implementation Partner | RFI Feedback
DSpace | Atmire | Received information on criteria list, proposed implementation options and associated cost.
Figshare | Digital Science | Received information on criteria list, proposed implementation options and associated cost.
Islandora | Discoverygarden | Received information on criteria list, proposed implementation options and associated cost.
Dataverse | Harvard University | Received insufficient information on criteria list, implementation options and cost.
PURR | Purdue University | Failed to respond to RFI.
Redbox | Queensland Cyber Infrastructure Foundation (QCIF) | Received information on criteria list, but Redbox is only a metadata repository and not a data repository.
Implementation options with most important advantages / disadvantages – Option 1
Option 1 - Locally hosted (both application and storage are locally hosted at UP)
Advantages:
• UP not dependent on internet for access to application
• UP able to manage own data
• Compliance with legal issues regarding data, i.e. POPI Act
• Risk of security is lower (control own storage)
Disadvantages:
• Resources to be provided (includes infrastructure and human resources for application and storage), which increases cost
• Required skills set (e.g. web skills) is limited or not currently available in ITS
• UP bandwidth will cause restrictions, i.e. indexing of site
• Open source product - no legal entity/responsible company for assistance, support, enhancements, new releases, etc.
Implementation options with most important advantages / disadvantages – Option 2
Option 2 - Hybrid (application is cloud hosted, while the storage is locally hosted)
Advantages:
• Collaboration with other institutions in future is easier
• No additional resources (HR or infrastructure) are required for the application
• A legal entity exists, i.e. for the application
• Geographic redundancy
• High availability on the UP front end – no bandwidth constraints
• Metadata as well as data will always be available, searchable and able to be indexed
• UP will be in control of their IP (control own storage)
• Risk of security will be lower (control own storage)
Disadvantages:
• Resources to be provided, which includes infrastructure and human resources for storage as well as RD, backups, access control, cooling, etc.
• Required skills set (e.g. web skills) is limited or not currently available in ITS
• Indexing of site dependent on UP’s bandwidth
Implementation options with most important advantages / disadvantages – Option 3
Option 3 - Fully cloud-based (both the application and storage are cloud hosted through the vendor)
Advantages:
• Collaboration with other institutions in future is easier
• No additional resources (HR or infrastructure) are required for the application
• A legal entity exists, i.e. for the application
• Geographic redundancy
• High availability on the UP front end – no bandwidth constraints
• Metadata as well as data will always be available, searchable and able to be indexed
• UP will be in control of their IP (control own storage)
• Risk of security will be lower (control own storage)
Disadvantages:
• UP does not have control of IP (governance and accessibility to UP’s data is in the hands of the vendor)
• Possible future sanctions against some countries may result in some users from other parts of the world not being able to reach UP’s repository
• Growing running cost, as UP will have to pay for uploading and downloading as well as storage of data
Product Evaluation Results
Criteria | Figshare | Islandora | DSpace
B-BBEE | All products and associated vendors/implementation partners are internationally based, therefore no weight was assigned in the scoring exercise.
Requirements criteria (incl. functional, non-functional, vendor) | 85% fit | 96% fit | 65% fit
Pricing |
Preferential criteria: Hybrid option (Option 2) | 100% fit | 10% fit – only available through huge custom development, which poses huge risks to UP | 0% fit
Preferential criteria: Consortial pricing | 100% fit | 0% fit | 0% fit
CONFIDENTIAL
Recommendations
The following is recommended for implementing a Research Data
Repository platform solution at UP:
• Figshare should be considered as the product of choice
• Implement the Hybrid implementation option with the application
being cloud hosted and local storage of 20 TB to start with
• Local storage can be supplemented in future with Cloud storage
• Storage should be investigated in line with the total eResearch initiative
and framework of UP
• A business owner needs to be identified to be responsible for a total
RDM implementation
• Implementation of a Research Data Repository platform requires a
significant increase in Human and Infrastructure Resource components,
and
• Consortial pricing can be kept in mind for the future and was not used
as a determining selection criterion.
Next Steps
• Survey & Interviews with researchers (3rd since 2009)
• Appoint a Business owner(s) for a total RDM solution
• Investigate tools that can support the Research-in-
Process phase, e.g. myTardis
• Finalise storage solution (survey results, African
Research Cloud)
• Business Case to secure resources (financial and
human)
• Implementation of repository solution
• Training of researchers
Gap analysis: Figshare (obtained 0 on these criteria)
Functional criteria:
• Must be able to change data formats, although most formats are agnostic.
• Auto-generate preservation metadata, e.g. PREMIS.
• Ability to migrate files in datasets to new/other formats over time.
• Be compliant with the OAIS reference model.
Non-functional criteria:
• Offer de-duplication of data and metadata.
Disadvantages:
• The annual subscription fee for Figshare is relatively high
• Customisation is not possible as it is a proprietary product
• The proprietary product aspect also limits the look and feel customisation of the
product to reflect more of UP’s footprint, and
• No local support exists within South Africa.
Context Diagram: Research Data Management
Documents
• UP Research Data Repository Evaluation
• UP Research Data Management Business
Requirements Specification
• Executive summary
• RDM Project Progress Feedback
• Context Diagram for RDM
• Islandora, Figshare, Redbox, DSpace, Dataverse,
PURR requirements criteria feedback documents
Fernihough, S., e-Research (An Implementation Framework for South African Organisations). 4th African Conference for Digital Scholarship and Digital Curation, Pretoria: 16 May 2011.
http://www.nedicc.ac.za/Conference/Upload%20Papaers/S%20Ferihough-eResearchImplementationFramework.pdf

 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptx
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 

Criteria and evaluation of research data repository platforms @ the University of Pretoria, South Africa

  • 1. Criteria and evaluation of research data repository platforms @ the University of Pretoria, South Africa Presented by Dr Heila Pienaar Library Services University of Pretoria Project team UP IT: Karin, Yzelle, Herman UP Library: Isak, Johann, Heila
  • 2. Agenda • Project Scope & project team • Research data lifecycle • E-Research Framework • Product Investigation • Criteria & evaluation • Recommendations • Next Steps • Documents produced
  • 3. Project Scope The scope of the project was to evaluate products (commercial and open source) which could be utilised as a Research Data Repository Platform as part of a total Research Data Management (RDM) solution at UP. A total RDM solution includes all phases of the research data life cycle, but for the repository solution the focus was on identifying a potential solution for the “Dissemination” phase of the research data life cycle.
  • 4. RDM Repository Project Team Business Sponsor – Prof Stephanie Burton (VP: Research) ITS Sponsor – Andre Kleynhans (Deputy Director: ITS) Project Team members: ITS Project Manager and Business Analyst – Karin Meyer ITS Infrastructure Architect - Dr Yzelle Roets ITS eResearch Support Manager – Herman Jacobs Library Services: Senior IT Consultant – Isak van der Walt Library Services: Assistant Director: RDM – Johann van Wyk Library Services: Deputy Director: Strategic Innovation – Dr Heila Pienaar
  • 5. DATA FLOW within the RESEARCH DATA LIFE CYCLE
  • 6. PROCESSES within the RESEARCH DATA LIFE CYCLE (Research Data Life Cycle diagram): Creating Data, Processing Data, Analysing Data, Preserving Data, Giving Access to Data, Re-using Data
  • 7.
  • 8. Product Investigation Methodology Finalisation of product evaluation criteria • Consulted with various UP stakeholders to obtain their input (Library and ITS staff) • Consulted with external stakeholders at the NEDICC workshop held at the CSIR • Consulted with peer universities, and • Used selection criteria from other institutions, e.g. Leeds University, Texas Digital Library and the RDA RPRD IG Matrix (http://tinyurl.com/RPRD-matrix), as a basis and adapted them to UP-specific requirements. Product Short Listing Products were short-listed based on the following: • Product scan of products being used internationally, and • Most commonly used products at universities similar to UP (size and research activity). Product Evaluation • UP’s formal Request For Information (RFI) process was followed • A product evaluation criteria list was compiled and sent to the short-listed vendors together with standard RFI documentation • The requested information was received from the vendors and prepared for scoring, and • Products were scored and evaluated.
  • 9. Evaluation Criteria • Functional / Business criteria: Deposit and Upload; Re-Usability; Identity and Access Management; Reporting; Discovery; Preservation • Non Functional: Repository Architecture; Data Management; Data Governance • Technical aspects: Back-end Management; Integration; Infrastructure • Vendor specific: Support, Training, Usage of Product • Performance requirements • Integration requirements
  • 10. Table 1: Deposit and Upload functional criteria
      Unique ID | Requirement Description | Priority
      DU-1 | Offer customisable metadata schema as per research area or discipline (including mandatory fields). | H
      DU-2 | Offer the indexing of metadata. | H
      DU-3 | Offer sufficient support for geospatial and journal article metadata. Support association of single or multiple files with one metadata record. | H
      DU-4 | Upload and store metadata at a data object level, where a data object is a folder that contains one or more files. | M
      DU-5 | Support multiple file types and formats of data, e.g. MS Excel 2007, MySQL database, raw data file from a Campbell CR10 data logger, any multimedia, etc. | H
      DU-6 | The system should have a simple process for uploading large (multi-TB) data sets, potentially consisting of thousands of files. Must have the ability to upload data sets of widely varying size (e.g. 2 MB, 2 GB, 1 TB). | H
      DU-7 | Support controlled lists against some metadata fields, either held locally or drawn from an external source, e.g. subject vocabularies. | H
      DU-8 | Support customisation of out-of-the-box help text and provide context-sensitive feedback for the depositor, e.g. highlight missing metadata fields, file upload failure alert. | M
      DU-9 | Accommodate a workflow where data needs to be destroyed, with an approval process and audit trail. | L
      DU-10 | Researchers must be able to submit data to the repository themselves. | H
      DU-11 | Process of submitting data to a repository from other systems/instruments. | H
      DU-12 | Ability to batch upload data into a repository. | H
      DU-13 | A third party must be able to upload a dataset on behalf of a researcher. | H
      DU-14 | Support generation / labelling of persistent unique identifiers for datasets, including DOIs. | H
      DU-15 | Ability to support the submission of data at any research stage (i.e. Initial Data, Working Data, Final Data stages) to the repository. | M
      DU-16 | Explain how user interface customisation is achieved. | H
      DU-17 | Out-of-the-box user interface is intuitive (easy to use) for users. | M
      DU-18 | Out-of-the-box user interface meets accessibility requirements, e.g. W3C WCAG 1. | H
      DU-19 | Assignment of Intellectual Property (IP) rights and multiple content licensing options, such as copyright and Creative Commons (CC), is possible, with terms and conditions exposed clearly to human and machine re-users. | H
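The slides list each requirement with a priority (H, M or L) and later report a percentage fit per product, but they do not say how the scores were combined. The following is a minimal sketch of one plausible approach, a priority-weighted percentage. The weights, the 0 to 2 scoring scale, the example scores and the names `Requirement` and `percent_fit` are all assumptions made for illustration and are not taken from the UP evaluation.

```python
from dataclasses import dataclass

# Hypothetical weights: the slides give priorities but not how they were weighted.
PRIORITY_WEIGHTS = {"H": 3, "M": 2, "L": 1}


@dataclass(frozen=True)
class Requirement:
    uid: str          # e.g. "DU-2"
    description: str
    priority: str     # "H", "M" or "L"


def percent_fit(requirements, scores, max_score=2):
    """Return the priority-weighted share of the maximum score, as a percentage.

    `scores` maps a requirement ID to a score between 0 and `max_score`;
    requirements without a score count as 0.
    """
    achieved = sum(PRIORITY_WEIGHTS[r.priority] * scores.get(r.uid, 0)
                   for r in requirements)
    possible = sum(PRIORITY_WEIGHTS[r.priority] * max_score
                   for r in requirements)
    return 100.0 * achieved / possible if possible else 0.0


# Three rows from Table 1; descriptions shortened.
requirements = [
    Requirement("DU-2", "Offer the indexing of metadata.", "H"),
    Requirement("DU-4", "Upload and store metadata at a data object level.", "M"),
    Requirement("DU-9", "Data destruction workflow with approval and audit trail.", "L"),
]

# Illustrative scores only (0 = not met, 1 = partly met, 2 = fully met).
example_scores = {"DU-2": 2, "DU-4": 1, "DU-9": 0}
fit = percent_fit(requirements, example_scores)
print(f"{fit:.1f}% fit")
```

With these made-up scores, the high-priority requirement dominates the result, which is the point of weighting by priority; a flat average would treat DU-2 and DU-9 as equally important.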
  • 11. Shortlisted Products & RFI Feedback
      Product | Vendor / Implementation Partner | RFI Feedback
      DSpace | Atmire | Received information on criteria list, proposed implementation options and associated cost.
      Figshare | Digital Science | Received information on criteria list, proposed implementation options and associated cost.
      Islandora | Discoverygarden | Received information on criteria list, proposed implementation options and associated cost.
      Dataverse | Harvard University | Received insufficient information on criteria list, implementation options and cost.
      PURR | Purdue University | Failed to respond to RFI.
      Redbox | Queensland Cyber Infrastructure Foundation (QCIF) | Received information on criteria list, but Redbox is only a metadata repository and not a data repository.
  • 12. Implementation options with most important advantages / disadvantages: Option 1
      Option 1 - Locally hosted (both application and storage are locally hosted at UP)
      Advantages:
        • UP not dependent on internet for access to application
        • UP able to manage own data
        • Compliance with legal requirements regarding data, i.e. the POPI Act
        • Risk of security is lower (control own storage)
      Disadvantages:
        • Resources to be provided (includes infrastructure and human resources for application and storage), which increases cost
        • Required skill set (e.g. web skills) is limited or not currently available in ITS
        • UP bandwidth will cause restrictions, e.g. on indexing of the site
        • Open source product: no legal entity/responsible company for assistance, support, enhancements, new releases, etc.
  • 13. Implementation options with most important advantages / disadvantages: Option 2
      Option 2 - Hybrid (application is cloud hosted, while the storage is locally hosted)
      Advantages:
        • Collaboration with other institutions in future is easier
        • No additional resources (HR or infrastructure) are required for the application
        • Legal entity exists, i.e. for the application
        • Geographic redundancy
        • High availability on the UP front end, with no bandwidth constraints
        • Metadata as well as data will always be available, searchable and able to be indexed
        • UP will be in control of their IP (control own storage)
        • Risk of security will be lower (control own storage)
      Disadvantages:
        • Resources to be provided, which include infrastructure and human resources for storage as well as RD, backups, access control, cooling, etc.
        • Required skill set (e.g. web skills) is limited or not currently available in ITS
        • Indexing of site dependent on UP’s bandwidth
  • 14. Implementation options with most important advantages / disadvantages: Option 3
      Option 3 - Fully cloud-based (both the application and storage are cloud hosted through the vendor)
      Advantages:
        • Collaboration with other institutions in future is easier
        • No additional resources (HR or infrastructure) are required for the application
        • Legal entity exists, i.e. for the application
        • Geographic redundancy
        • High availability on the UP front end, with no bandwidth constraints
        • Metadata as well as data will always be available, searchable and able to be indexed
        • UP will be in control of their IP (control own storage)
        • Risk of security will be lower (control own storage)
      Disadvantages:
        • UP does not have control of IP (governance and accessibility of UP’s data are in the hands of the vendor)
        • Possible future sanctions against some countries may result in some users from other parts of the world not being able to reach UP’s repository
        • Growing running cost, as UP will have to pay for uploading and downloading as well as storage of data
  • 15. Product Evaluation Results
      Criteria | Figshare | Islandora | DSpace
      B-BBEE | All products and associated vendors/implementation partners are internationally based, therefore no weight was assigned in the scoring exercise.
      Requirements criteria (incl. functional, non-functional, vendor) | 85% fit | 96% fit | 65% fit
      Pricing | CONFIDENTIAL
      Preferential criteria: Hybrid option (Option 2) | 100% fit | 10% fit (only available through huge custom development, which poses huge risks to UP) | 0% fit
      Preferential criteria: Consortial pricing | 100% fit | 0% fit | 0% fit
  • 16. Recommendations The following is recommended for implementing a Research Data Repository platform solution at UP: • Figshare should be considered as the product of choice • Implement the Hybrid implementation option, with the application being cloud hosted and local storage of 20 TB to start with • Local storage can be supplemented in future with cloud storage • Storage should be investigated in line with the total eResearch initiative and framework of UP • A business owner needs to be identified to be responsible for a total RDM implementation • Implementation of a Research Data Repository platform requires a significant increase in human and infrastructure resource components, and • Consortial pricing can be kept in mind for the future and was not used as a determining selection criterion.
  • 17. Next Steps • Survey & Interviews with researchers (3rd since 2009) • Appoint a Business owner(s) for a total RDM solution • Investigate tools that can support the Research-in- Process phase, e.g. myTardis • Finalise storage solution (survey results, African Research Cloud) • Business Case to secure resources (financial and human) • Implementation of repository solution • Training of researchers
  • 18. Gap analysis: Figshare (obtained 0 on these criteria) Functional criteria: • Must be able to change data formats, although most formats are agnostic. • Auto-generate preservation metadata, e.g. PREMIS. • Ability to migrate files in datasets to new/other formats over time. • Be compliant with the OAIS reference model. Non-functional criteria: • Offer de-duplication of data and metadata. Disadvantages: • The annual subscription fee for Figshare is relatively high • Customisation is not possible as it is a proprietary product • The proprietary nature of the product also limits customisation of its look and feel to reflect more of UP’s footprint, and • No local support exists within South Africa.
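A gap analysis of this kind amounts to filtering the scored criteria for those that obtained 0 and grouping them by category. The sketch below shows that step under stated assumptions: the function name `gap_analysis`, the `(category, criterion)` key layout and the single non-zero score are hypothetical, while the zero-scored criteria are the ones named on the slide.

```python
from collections import defaultdict


def gap_analysis(scores):
    """Group all criteria that scored 0 by their category.

    `scores` maps (category, criterion) pairs to a numeric score.
    """
    gaps = defaultdict(list)
    for (category, criterion), score in scores.items():
        if score == 0:
            gaps[category].append(criterion)
    return dict(gaps)


example_scores = {
    ("Functional", "Auto-generate preservation metadata, e.g. PREMIS"): 0,
    ("Functional", "Be compliant with the OAIS reference model"): 0,
    # Illustrative non-zero score, not taken from the evaluation.
    ("Functional", "Researchers can submit data themselves"): 2,
    ("Non-functional", "Offer de-duplication of data and metadata"): 0,
}

gaps = gap_analysis(example_scores)
for category, criteria in gaps.items():
    print(category, "->", criteria)
```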
  • 19. Context Diagram: Research Data Management
  • 20. Documents • UP Research Data Repository Evaluation • UP Research Data Management Business Requirements Specification • Executive summary • RDM Project Progress Feedback • Context Diagram for RDM • Islandora, Figshare, Redbox, DSpace, Dataverse, PURR requirements criteria feedback documents
  • 21. Fernihough, S., e-Research (An Implementation Framework for South African Organisations). 4th African Conference for Digital Scholarship and Digital Curation, Pretoria: 16 May 2011. http://www.nedicc.ac.za/Conference/Upload%20Papaers/S%20Ferihough-eResearchImplementationFramework.pdf