“Filling the digital preservation gap”
an update from the Jisc Research
Data Spring project at York and Hull
Jenny Mitcham
Digital Archivist
Borthwick Institute for Archives
University of York
13 August 2015
Project aim
“…to investigate
Archivematica and explore
how it might be used to
provide digital preservation
functionality within a wider
infrastructure for Research
Data Management.”
What about Hydra?
• Hydra is not mentioned much in our project
report ...this is deliberate!
• We wanted to keep our findings generic to
make them useful to a wide range of
institutions that may be interested in digital
preservation...
• ...this means we are more likely to get further
funding
• However...
Project team
University of Hull:
• Chris Awre – Head of Information Services,
Library and Learning Innovation
• Richard Green – Independent Consultant
• Simon Wilson – University Archivist
University of York:
• Julie Allinson – Manager, Digital York
• Jen Mitcham – Digital Archivist
About the project
• Funded as part of Jisc Research Data Spring
• Started 30th March 2015
• Phase 1 is complete
• Phase 2 has just started and will run until
November
• …and we hope phase 3 will
be funded
Project structure
• Phase 1 – explore: testing, research, thinking -
produce a report (3 months)
• Phase 2 – develop: make Archivematica better
for RDM, plan implementation (4 months)
• Phase 3 – implement: set up proofs of
concept at York and Hull (6 months)
Phase 1 - The key questions
• Why? Why are we bothering to 'preserve' research data? What are
the drivers here and what are the risks if we don't? Why are we
looking at Archivematica?
• What? What are the characteristics of research data and how might
it differ from other born digital data that memory institutions are
establishing digital archives to manage and preserve? What types of
files are our researchers producing and how would Archivematica
handle these? What does Archivematica offer us and what benefits
does it bring?
• How? How would we incorporate Archivematica into a wider
technical infrastructure for research data management and what
workflows would we put in place? Where would it sit and what
other systems would it need to talk to? How can we improve
Archivematica for RDM?
• Who? Who else is using Archivematica (or other digital preservation
systems) to do similar things and what can we learn from them?
What staff resource is needed to preserve research data with
Archivematica?
http://digital-archiving.blogspot.co.uk/
Why Archivematica?
“The goal of the Archivematica project is to give
archivists and librarians with limited technical
and financial capacity the tools, methodology
and confidence to begin preserving digital
information today.”
Why Archivematica?
• Standards-based
• Open Source
• Flexible and customisable
• Compatible with hundreds of file formats
• Advanced search and storage management
• Integrated with third-party systems
From https://www.archivematica.org/en/
What does research data look like?
York RDM questionnaire
2013: Please select the main
types of electronic research
data you generate
Top research data applications at York
The importance of identification
How well are these
formats identified by
digital preservation
tools?
• Better than expected!
• Sometimes partial
• Sometimes quite
generic (without a
version number)
Application            Identified?
MATLAB                 N
SPSS                   Partial
Stata                  N
R                      N
EndNote                Partial
NVivo                  N
LaTeX                  Partial
Python                 N
Wolfram Mathematica    Partial
Gaussian               N
ChemDraw               Partial
SAS                    Partial
ArcGIS                 Partial
GraphPad Prism         Partial
Adobe Photoshop        Partial
ATLAS.ti               N
C++                    N
Eclipse                NA? (no native file formats)
MS Excel               Y
RSB - ImageJ           Partial
What does research data look like?
• Potentially quite big
• Wide range of file formats (some well understood
but a long tail of more specialist/obscure formats)
• Sometimes sensitive and/or confidential
• Ever changing (new software and techniques are
used for dynamic and cutting edge research)
• May be different versions of the data (as new
publications are released)
• Value not well understood at the point of deposit
What does Archivematica do?
The short answer:
“It packages data up in a standards compliant
way and prepares it to be stored for the long
term”
What does Archivematica do?
The longer answer:
• Assigns unique identifiers
• Creates a checksum for each object
• Creates a text file with a directory tree of the transfer
• Option to quarantine data for a specified period
• Runs virus checks
• Cleans up file and directory names (removing characters that may cause
problems)
• Runs identification tools so you can find out what file formats you have
• Extracts data from zip files (or not if you would rather not)
• Extracts metadata embedded in the files (if you want)
• Normalises files (if a migration path exists)
• ...
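A rough illustration of a few of these steps (unique identifiers, checksums, cleaned-up file names, a listing of the transfer). This is only a sketch using the Python standard library and is not how Archivematica itself implements them; the function names, the choice of SHA-256 and the "replace anything unusual with an underscore" rule are all assumptions made for the example.

```python
import hashlib
import re
import uuid
from pathlib import Path


def sha256_checksum(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def clean_name(name: str) -> str:
    """Replace characters that may cause problems with underscores.

    Hypothetical rule: keep ASCII letters, digits, dot, underscore, hyphen.
    """
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def describe_transfer(root: Path) -> list[dict]:
    """Walk a transfer directory and record one entry per file."""
    records = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            records.append({
                "uuid": str(uuid.uuid4()),  # unique identifier for the object
                "original_path": path.relative_to(root).as_posix(),
                "clean_name": clean_name(path.name),
                "sha256": sha256_checksum(path),
            })
    return records
```

Calling describe_transfer(Path("my_transfer")) on a folder of research data (the folder name here is made up) would give one record per file, which could then be written out as a simple text manifest.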
What does Archivematica do?
The really really long answer (if you have time):
• Read the manual
https://www.archivematica.org/en/docs/archivematica-1.4/
What does Archivematica do?
One final answer (honest):
It gives us a greater level of confidence that we
will be able to continue to provide access to
usable copies of research data over the longer
term
What are the downsides?
• It isn’t a magic bullet
• There is no guarantee your data will be
readable in the future
• It can only be as good as current digital
preservation practice
• It can be fiddly to install correctly
• The GUI isn’t that intuitive
• You need staff who understand it
Phase 2: ‘develop’
1. Enabling better workflows for RDM (producing a
DIP on request)
2. Allowing the DIP (access copy of data) to be
usable by different repository systems
3. Helping reduce bottlenecks for big data
4. Workflows for unidentified files
5. Enabling easier querying of data within
Archivematica by third party applications
6. Better documentation
Phase 2: RDM Workflows at York
• We get a copy of the data from the researcher
• We transfer it to Archivematica
• Archivematica packages it up for storage and
creates the Archival Information Package (AIP)
• Archivematica sends the AIP to archival storage
• Metadata is published in the data catalogue
• If someone requests the data, Archivematica will
create a Dissemination Information Package (DIP)
• The DIP will be uploaded to the Digital Library for access
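The workflow above could be sketched roughly as follows. This is a toy model of the order of events only: it does not call Archivematica or any real system, and the class, function and field names are invented for the illustration. The point it tries to show is that the AIP is always created at ingest, while the DIP is only created the first time someone asks for the data.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical record tracking one deposit through the workflow."""
    title: str
    events: list[str] = field(default_factory=list)
    has_aip: bool = False
    has_dip: bool = False


def ingest(dataset: Dataset) -> Dataset:
    """Steps that happen for every deposit."""
    dataset.events.append("received from researcher")
    dataset.events.append("transferred to Archivematica")
    dataset.has_aip = True
    dataset.events.append("AIP created and sent to archival storage")
    dataset.events.append("metadata published in data catalogue")
    return dataset


def request_access(dataset: Dataset) -> Dataset:
    """Steps that only happen when someone requests the data."""
    if not dataset.has_aip:
        raise ValueError("no AIP in archival storage; ingest the dataset first")
    if not dataset.has_dip:  # create the DIP once, on the first request
        dataset.has_dip = True
        dataset.events.append("DIP created on request")
        dataset.events.append("DIP uploaded to Digital Library")
    return dataset
```

Creating the DIP on request rather than at ingest means access copies are only produced and stored for data that somebody actually wants.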
How do York plan to use Archivematica?
How do York plan to use Archivematica?
Where to find out more
http://www.york.ac.uk/borthwick/
Where to find out more
http://digital-archiving.blogspot.co.uk/2015/07/archivematica-fills-digital.html
Where to find out more
Thanks for listening
• You can contact me on:
– jenny.mitcham@york.ac.uk

“Filling the digital preservation gap” an update from the Jisc Research Data Spring project at York and Hull

  • 1. “Filling the digital preservation gap” an update from the Jisc Research Data Spring project at York and Hull Jenny Mitcham Digital Archivist Borthwick Institute for Archives University of York 13 August 2015
  • 2. Project aim “…to investigate Archivematica and explore how it might be used to provide digital preservation functionality within a wider infrastructure for Research Data Management.”
  • 3. What about Hydra? • Hydra is not mentioned much in our project report ...this is deliberate! • We wanted to keep our findings generic to make it most useful to a wide range of institutions who may be interested in digital preservation... • ...this means we are more likely to get further funding • However...
  • 4. Project team University of Hull: • Chris Awre – Head of Information Services, Library and Learning Innovation • Richard Green – Independent Consultant • Simon Wilson – University Archivist University of York: • Julie Allinson – Manager, Digital York • Jen Mitcham – Digital Archivist
  • 5. About the project • Funded as part of Jisc Research Data Spring • Started 30th March 2015 • Phase 1 is complete • Phase 2 has just started and will run until November • …and we hope phase 3 will be funded
  • 6. Project structure • Phase 1 – explore: testing, research, thinking - produce a report (3 months) • Phase 2 – develop: make Archivematica better for RDM, plan implementation (4 months) • Phase 3 – implement: set up proof of concepts at York and Hull (6 months)
  • 7. Phase 1 -The key questions • Why? Why are we bothering to 'preserve' research data. What are the drivers here and what are the risks if we don't? Why are we looking at Archivematica? • What? What are the characteristics of research data and how might it differ from other born digital data that memory institutions are establishing digital archives to manage and preserve? What types of files are our researchers producing and how would Archivematica handle these? What does Archivematica offer us and what benefits does it bring? • How? How would we incorporate Archivematica into a wider technical infrastructure for research data management and what workflows would we put in place? Where would it sit and what other systems would it need to talk to? How can we improve Archivematica for RDM? • Who? Who else is using Archivematica (or other digital preservation systems) to do similar things and what can we learn from them? What staff resource is needed to preserve research data with Archivematica?
  • 9. Why Archivematica? “The goal of the Archivematica project is to give archivists and librarians with limited technical and financial capacity the tools, methodology and confidence to begin preserving digital information today.”
  • 10. Why Archivematica? • Standards-based • Open Source • Flexible and customisable • Compatible with hundreds of file formats • Advanced search and storage management • Integrated with third-party systems From https://www.archivematica.org/en/
  • 11. What does research data look like? York RDM questionnaire 2013: Please select the main types of electronic research data you generate
  • 12. Top research data applications at York
  • 13. The importance of identification How well are these formats identified by digital preservation tools? • Better than expected! • Sometimes partial • Sometimes quite generic (without a version number) Identification by application: MATLAB: N; SPSS: Partial; Stata: N; R: N; EndNote: Partial; NVivo: N; LaTeX: Partial; Python: N; Wolfram Mathematica: Partial; Gaussian: N; ChemDraw: Partial; SAS: Partial; ArcGIS: Partial; GraphPad Prism: Partial; Adobe Photoshop: Partial; ATLAS.ti: N; C++: N; Eclipse: NA? (no native file formats); MS Excel: Y; RSB - ImageJ: Partial
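The identification results on this slide can be tallied with a few lines of Python. This is only an illustrative sketch: the values are copied from the slide, the variable names are made up here, and reading Y / Partial / N as "identified", "partially identified" and "not identified" is an assumption based on the slide's wording.

```python
from collections import Counter

# Results as listed on the slide (application -> identification result).
identification = {
    "MATLAB": "N", "SPSS": "Partial", "Stata": "N", "R": "N",
    "EndNote": "Partial", "NVivo": "N", "LaTeX": "Partial", "Python": "N",
    "Wolfram Mathematica": "Partial", "Gaussian": "N", "ChemDraw": "Partial",
    "SAS": "Partial", "ArcGIS": "Partial", "GraphPad Prism": "Partial",
    "Adobe Photoshop": "Partial", "ATLAS.ti": "N", "C++": "N",
    "Eclipse": "NA?", "MS Excel": "Y", "RSB - ImageJ": "Partial",
}

# Count how many applications fall under each result.
tally = Counter(identification.values())

for result, count in tally.most_common():
    print(f"{result}: {count} of {len(identification)}")
```

Only one of the twenty applications listed is marked Y, which is the backdrop for the "workflows for unidentified files" item in phase 2.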
  • 14. What does research data look like? • Potentially quite big • Wide range of file formats (some well understood but a long tail of more specialist/obscure formats) • Sometimes sensitive and/or confidential • Ever changing (new software and techniques are used for dynamic and cutting edge research) • May be different versions of the data (as new publications are released) • Value not well understood at the point of deposit
  • 15. What does Archivematica do? The short answer: “It packages data up in a standards compliant way and prepares it to be stored for the long term”
  • 16. What does Archivematica do? The longer answer: • Assigns unique identifiers • Creates a checksum for each object • Creates a text file with a directory tree of the transfer • Option to quarantine data for a specified period • Runs virus checks • Cleans up file and directory names (removing characters that may cause problems) • Runs identification tools so you can find out what file formats you have • Extracts data from zip files (or not if you would rather not) • Extracts metadata embedded in the files (if you want) • Normalises files (if a migration path exists) • ...
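To make three of these steps concrete (unique identifiers, checksums, and cleaning up file names), here is a minimal Python sketch. It is not Archivematica's own code: the function names, the choice of SHA-256 and the replacement rule for problem characters are all assumptions made for illustration.

```python
import hashlib
import re
import uuid
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def clean_name(name: str) -> str:
    """Replace characters outside a conservative safe set with '_'.

    The safe set here (letters, digits, dot, underscore, hyphen) is an
    illustrative assumption, not Archivematica's actual rule.
    """
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def describe(path: Path) -> dict:
    """Build a small record for one file: identifier, names, checksum."""
    return {
        "identifier": str(uuid.uuid4()),
        "original_name": path.name,
        "clean_name": clean_name(path.name),
        "sha256": sha256_of(path),
    }
```

Calling `describe(Path("some file.csv"))` on an existing file returns a dictionary that could later be used to check that the stored copy has not changed since deposit.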
  • 17. What does Archivematica do? The really really long answer (if you have time): • Read the manual https://www.archivematica.org/en/docs/archivematica-1.4/
  • 18. What does Archivematica do? One final answer (honest): It gives us a greater level of confidence that we will be able to continue to provide access to usable copies of research data over the longer term
  • 19. What are the downsides? • It isn’t a magic bullet • There is no guarantee your data will be readable in the future • It can only be as good as current digital preservation practice • It can be fiddly to install correctly • The GUI isn’t that intuitive • You need staff who understand it
  • 20. Phase 2: ‘develop’ 1. Enabling better workflows for RDM (producing a DIP on request) 2. Allowing the DIP (access copy of data) to be usable by different repository systems 3. Helping reduce bottlenecks for big data 4. Providing workflows for unidentified files 5. Enabling easier querying of data within Archivematica by third party applications 6. Writing better documentation
  • 21. Phase 2: RDM Workflows at York • We get a copy of data from researcher • We transfer it to Archivematica • Archivematica packages it up for storage and creates the Archival Information Package (AIP) • Archivematica sends the AIP to archival storage • Metadata is published in data catalogue • If someone requests the data Archivematica will create a Dissemination Information Package (DIP) • DIP will be uploaded to Digital Library for access
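The workflow above can be sketched as a small Python model. The key point it illustrates is that the AIP is always created at ingest, while the DIP is only created when someone requests the data. All class and function names here are hypothetical; nothing in this sketch calls Archivematica itself.

```python
from dataclasses import dataclass, field


@dataclass
class Deposit:
    """One research data deposit and the steps it has been through."""
    title: str
    events: list = field(default_factory=list)
    has_aip: bool = False
    has_dip: bool = False


def ingest(deposit: Deposit) -> Deposit:
    """Steps that happen for every deposit."""
    deposit.events.append("transferred to Archivematica")
    deposit.events.append("AIP created")
    deposit.has_aip = True
    deposit.events.append("AIP sent to archival storage")
    deposit.events.append("metadata published in data catalogue")
    return deposit


def request_access(deposit: Deposit) -> Deposit:
    """Steps that only happen when someone asks for the data."""
    if not deposit.has_aip:
        raise ValueError("no AIP to build a DIP from")
    if not deposit.has_dip:  # create the DIP once, on first request
        deposit.events.append("DIP created")
        deposit.events.append("DIP uploaded to Digital Library")
        deposit.has_dip = True
    return deposit
```

A deposit that is never requested stops after `ingest`, so no access copy is generated or stored for it.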
  • 22. How do York plan to use Archivematica?
  • 23. How do York plan to use Archivematica?
  • 24. Where to find out more http://www.york.ac.uk/borthwick/
  • 25. Where to find out more http://digital-archiving.blogspot.co.uk/2015/07/archivematica-fills-digital.html
  • 26. Where to find out more
  • 27. Thanks for listening • You can contact me on: – jenny.mitcham@york.ac.uk