The Oxford Common File Layout
Understanding the specification, institutional use cases & implementations
Arran Griffith - Fedora Program Manager
Stefano Cossu - Harvard University Libraries
Thomas Wrobel - Oxford University, Bodleian Libraries
The Oxford Common File Layout (OCFL)
“A simple, non-proprietary, specified, open-standards approach to the layout of preservation persistence.”
● Purpose: provide a preservation-centric, common approach to filesystem layout for digital repositories
● Developed and maintained by the OCFL Editorial Board
● Several implementations of the specification are in active use
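To make "filesystem layout" concrete, here is a minimal sketch of what a one-version object could look like on disk. It is not taken from the slides: the file names follow the OCFL 1.1 specification as we understand it, while the function name, object id, and timestamp are hypothetical, and the result has not been run through an OCFL validator.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha512_hex(data: bytes) -> str:
    """Hex-encoded SHA-512 digest of a byte string."""
    return hashlib.sha512(data).hexdigest()


def write_minimal_object(root: Path, object_id: str, filename: str, payload: bytes) -> dict:
    """Write a one-version, one-file OCFL-style object under `root` (hypothetical helper)."""
    content_dir = root / "v1" / "content"
    content_dir.mkdir(parents=True)
    (content_dir / filename).write_bytes(payload)

    # Conformance declaration file at the object root.
    (root / "0=ocfl_object_1.1").write_text("ocfl_object_1.1\n")

    digest = sha512_hex(payload)
    inventory = {
        "id": object_id,
        "type": "https://ocfl.io/1.1/spec/#inventory",
        "digestAlgorithm": "sha512",
        "head": "v1",
        # manifest: digest -> content paths relative to the object root
        "manifest": {digest: [f"v1/content/{filename}"]},
        "versions": {
            "v1": {
                "created": "2024-01-01T00:00:00Z",  # placeholder timestamp
                "message": "Initial version",
                # state: digest -> logical paths as seen by the user
                "state": {digest: [filename]},
            }
        },
    }
    raw = json.dumps(inventory, indent=2).encode("utf-8")
    # The inventory and its digest sidecar sit at the root and in the version directory.
    for directory in (root, root / "v1"):
        (directory / "inventory.json").write_bytes(raw)
        (directory / "inventory.json.sha512").write_text(f"{sha512_hex(raw)} inventory.json\n")
    return inventory


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        obj = Path(tmp) / "example-object"
        write_minimal_object(obj, "urn:example:1", "hello.txt", b"hello")
        for path in sorted(obj.rglob("*")):
            print(path.relative_to(obj))
```

Everything needed to interpret the object lives in plain files and JSON inside one directory, which is what makes the layout readable without the application that wrote it.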
Benefits of the OCFL
Parsability
● Readable by both humans and machines
Storage Diversity
● Ensures content can be stored on any type of infrastructure, including conventional systems or cloud systems
Robustness
● Checksums to protect against corruption and errors between storage technologies
Versioning
● All changes are versioned, allowing a repository’s history to persist
Completeness
● So the repository can be rebuilt from the files it stores
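The robustness point can be illustrated with a short fixity check. This is a sketch under assumptions: it relies on the `digestAlgorithm` and `manifest` keys of an OCFL `inventory.json` as we understand the specification, and `check_fixity` is a hypothetical helper, not part of any named library.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path, algorithm: str) -> str:
    """Stream a file through hashlib and return the hex digest."""
    hasher = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def check_fixity(object_root: Path) -> list[str]:
    """Return the content paths that are missing or no longer match the manifest."""
    inventory = json.loads((object_root / "inventory.json").read_text())
    algorithm = inventory["digestAlgorithm"]  # e.g. "sha512"
    failures = []
    for expected, content_paths in inventory["manifest"].items():
        for rel_path in content_paths:
            target = object_root / rel_path
            if not target.is_file() or file_digest(target, algorithm) != expected.lower():
                failures.append(rel_path)
    return failures
```

Because the expected digests travel with the object, the same check can be run after a copy to different storage without consulting a database.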
Fedora
OCFL-java Implementation
OCFL for Digital Preservation in Fedora
● OCFL was incorporated to enhance long-term digital preservation for Fedora repositories
● Fedora 6.x writes all data to disk in OCFL format using the OCFL-java library
○ The purpose was to take advantage of the transparency offered by the OCFL file structure
Benefits:
● Application-independent persistence
● Human- and machine-readable data
● Ability to rebuild the repository from contents on disk
● Fewer migrations in the future
Standards = What to do
Fedora + OCFL = How to do it
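As an illustration of rebuilding from contents on disk, the sketch below walks a storage root and reconstructs a simple object index from the files alone. It is a generic example, not how Fedora rebuilds its own indexes: it assumes object roots are marked by a `0=ocfl_object_*` file and hold an `inventory.json`, and `rebuild_index` is a hypothetical name.

```python
import json
import os
from pathlib import Path


def rebuild_index(storage_root: Path) -> dict[str, dict]:
    """Map each object id to its location, head version, and version list."""
    index = {}
    for dirpath, dirnames, filenames in os.walk(storage_root):
        # An object root is recognised by its conformance declaration file.
        if any(name.startswith("0=ocfl_object_") for name in filenames):
            object_root = Path(dirpath)
            inventory = json.loads((object_root / "inventory.json").read_text())
            index[inventory["id"]] = {
                "path": object_root.relative_to(storage_root).as_posix(),
                "head": inventory["head"],
                # Sort numerically so that v10 comes after v2.
                "versions": sorted(inventory["versions"], key=lambda v: int(v[1:])),
            }
            dirnames.clear()  # do not descend into the object itself
    return index
```

No external database is consulted at any point; the index is derived entirely from what is stored.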
Harvard
Digital Repository Services (DRS)
DRS at a glance
Scale
➜ >10M objects, >100M files, 2 PB replicated
➜ Exponential growth foreseen in the future
Long-standing legacy
➜ In operation and continuously maintained for 22 years
➜ In need of a complete re-engineering
Migration time & costs
➜ More than 1 year to migrate from POSIX to OCFL
➜ Don’t want to do that again
DRS Futures project
➜ 3-year capital-funded project to replace current DRS
➜ Design a new repository without migrating data
Value of OCFL for DRS
Approach What OCFL provides
Assumptions Challenges
➜ A file layout standard specifically
designed for long-term preservation
➜ A software-agnostic data layer
➜ A community dedicated to resolving
digital preservation problems
➜ How do we guarantee performance?
➜ What about backward compatibility?
➜ Do we have enough OCFL-compatible
software choices?
➜ A standard adopted by a sufficiently
large & diverse community that
guarantees the promised stability
➜ A healthy community of implementers
and service providers to implement &
maintain the required tools
➜ Maintain a “storage fabric” separated
from the application layer
➜ Replace current DRS without migrating
or rearranging the data layer
Bodleian
Oxford University Research Archive
University of Oxford - Bodleian Libraries
Oxford University Research Archive (ORA)
● 300,000+ works
● 100,000 of these have public binary files of which ORA holds the only digital copy
● Works include:
○ Articles, conference papers, theses, research data, working papers, posters, and more…
Digital Preservation Service (DPS)
Purpose: preserve a versioned copy of a digital object, which will allow the DPMS to monitor, analyse and support the system
Digital Preservation Microservices (DPMS)
Purpose: monitor and support the preservation of binary content and metadata
SLIDESMANIA.
Advantages of OCFL Advantages of Fedora
University of Oxford - Bodleian Libraries
● Platform & application agnostic
● DPS OCFL layer decreases
migrations
● Back-up & monitoring more
simplified
● Parsability
● Single parent directory = no need
for index or management
application to analyse a given
object
● Well documented RESTful API
● Transaction management
● Authentication & Authorization
● Community support and continued
engagement
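The "single parent directory" point can be sketched as follows: given only an object's directory, its inventory is enough to list the files in any version and locate their bytes. The example assumes the `head`, `manifest`, and `versions[...]["state"]` keys of an OCFL inventory as we understand the specification; `resolve_version` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Optional


def resolve_version(object_root: Path, version: Optional[str] = None) -> dict[str, str]:
    """Map each logical file name in a version to its stored content path."""
    inventory = json.loads((object_root / "inventory.json").read_text())
    version = version or inventory["head"]  # default to the latest version
    state = inventory["versions"][version]["state"]
    manifest = inventory["manifest"]
    return {
        logical_path: manifest[digest][0]  # first stored copy for that digest
        for digest, logical_paths in state.items()
        for logical_path in logical_paths
    }
```

A file that is unchanged between versions resolves to the content path where it was first stored, so later versions reference earlier bytes instead of duplicating them.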
What’s Next…
University of Oxford - Bodleian Libraries
● Performance and scale testing of Fedora 6.x + OCFL
● Export ORA repository into DPS and integrate with day-to-day operations
● Expand to other services within the Bodleian Libraries
Reach Us:
Thomas Wrobel - thomas.wrobel@bodleian.ox.ac.uk
ORA Team - ora-dev@bodleian.ox.ac.uk
Resources
The Oxford Common File Layout
www.ocfl.io
Fedora Program Info
Wiki: https://wiki.lyrasis.org/display/FF/Fedora+Repository+Home
Documentation: https://wiki.lyrasis.org/display/FEDORA6x
Get Connected: https://wiki.lyrasis.org/display/FF/Mailing+Lists+etc
Harvard DRS Futures
https://sites.harvard.edu/drs-futures/
Thank You
Arran Griffith - arran.griffith@lyrasis.org
Stefano Cossu - stefano_cossu@harvard.edu
Thomas Wrobel - thomas.wrobel@bodleian.ox.ac.uk

The Oxford Common File Layout

  • 1. SLIDESMANIA. The Oxford Common File Layout Understanding the specification, institutional use cases & implementations Arran Griffith - Fedora Program Manager Stefano Cossu - Harvard University Libraries Thomas Wrobel - Oxford University, Bodleian Libraries
  • 2. SLIDESMANIA. The Oxford Common File Layout (OCFL) “A simple, non- proprietary, specified, open-standards approach to the layout of preservation persistence.” ● Purpose: provide a preservation-centric, common approach to filesystem layout for digital repositories ● Developed and maintained by the OCFL Editorial Board ● Several implementations of the specification are in active use
  • 3. SLIDESMANIA. Benefits of the OCFL Parsability Storage Diversity Robustness ● Readable by both humans and machines Versioning ● Checksums to protect against corruption and errors between storage technologies Completeness ● Ensures content can be stored on any type of infrastructure including conventional systems or cloud systems ● So the repository can be rebuilt from the files it stores ● All changes are versioned, allowed a repository’s history to persist
  • 5. SLIDESMANIA. OCFL for Digital Preservation in ● OCFL was incorporated to enhanced long-term digital preservation for Fedora repositories ● Fedora 6.x writes all data to OCFL formatting using the OCFL-java library ○ Purpose of this was to take advantage of the transparency offered by the OCFL file structure Benefits: ● Application-independent persistence ● Human and machine readable data ● Ability to rebuild repository from contents on disk ● Fewer migrations in the future Standards = What to do Fedora + OCFL = How to do it
  • 7. SLIDESMANIA. DRS at a glance Scale Long-standing legacy Migration time & costs DRS Futures project ➜ 3-year capital-funded project to replace current DRS ➜ Design a new repository without migrating data ➜ >10M objects, >100M files, 2Pb replicated ➜ Exponential growth foreseen in the future ➜ More than 1 year to migrate from POSIX to OCFL ➜ Don’t want to do that again ➜ In operation and continuously maintained for 22 years ➜ In need of a complete re-engineering
  • 8. SLIDESMANIA. Value of OCFL for DRS Approach What OCFL provides Assumptions Challenges ➜ A file layout standard specifically designed for long-term preservation ➜ A software-agnostic data layer ➜ A community dedicated to resolving digital preservation problems ➜ How do we guarantee performance? ➜ What about backward compatibility? ➜ Do we have enough OCFL-compatible software choices? ➜ A standard adopted by a sufficiently large & diverse community that guarantees the promised stability ➜ A healthy community of implementers and service providers to implement & maintain the required tools ➜ Maintain a “storage fabric” separated from the application layer ➜ Replace current DRS without migrating or rearranging the data layer
  • 10. SLIDESMANIA. University of Oxford - Bodleian Libraries ● 300,000+ works ● 100,000 of these have public binary files of which ORA holds the only digital copy ● Works include: ○ Articles, conference papers, theses, research data, working papers, posters, and more… Digital Preservation Microservices (DPMS) Oxford University Research Archive (ORA) Digital Preservation Service Purpose: preserve a versioned copy of a digital object which will allow the DPMS to monitor, analyse and support the system Purpose: monitor and support the preservation of binary content and metadata
  • 11. SLIDESMANIA. Advantages of OCFL Advantages of Fedora University of Oxford - Bodleian Libraries ● Platform & application agnostic ● DPS OCFL layer decreases migrations ● Back-up & monitoring more simplified ● Parsability ● Single parent directory = no need for index or management application to analyse a given object ● Well documented RESTful API ● Transaction management ● Authentication & Authorization ● Community support and continued engagement
  • 12. SLIDESMANIA. What’s Next… University of Oxford - Bodleian Libraries ● Performance and scale testing of Fedora 6.x + OCFL ● Export ORA repository into DPS and integrate with day-to-day operations ● Expand to other services with the Bodleian Libraries Reach Us: Thomas Wrobel - thomas.wrobel@bodleian.ox.ac.uk ORA Team - ora-dev@bodleian.ox.ac.uk
  • 13. Resources. The Oxford Common File Layout: www.ocfl.io. Fedora Program Info: Wiki: https://wiki.lyrasis.org/display/FF/Fedora+Repository+Home Documentation: https://wiki.lyrasis.org/display/FEDORA6x Get Connected: https://wiki.lyrasis.org/display/FF/Mailing+Lists+etc. Harvard DRS Futures: https://sites.harvard.edu/drs-futures/
  • 14. Thank You. Arran Griffith - arran.griffith@lyrasis.org Stefano Cossu - stefano_cossu@harvard.edu Thomas Wrobel - thomas.wrobel@bodleian.ox.ac.uk

Editor's Notes

  1. Hello. My name is Arran Griffith and I am the program manager for the Fedora Program. I am joined today in person by my colleague Stefano Cossu from Harvard University Libraries. Our other co-presenter, Thomas Wrobel, sends his regrets that he wasn’t able to make it, but he’s given us some info to share with you on the work they are doing at the Bodleian. Stefano and I will be talking generally about the Oxford Common File Layout specification, and sharing how each of us is incorporating the OCFL as a component of our systems to take advantage of what the specification offers.
  2. The Oxford Common File Layout (OCFL) is a specification that describes an application-independent approach to the storage of digital information in a structured, transparent, and predictable manner. It was developed to provide a standardized approach to filesystem layout within a digital repository that would also promote preservation and support long-term object management best practices within the repositories. It is defined and developed by an editorial board that is responsible for the upkeep, continued development and maintenance of the spec. Currently there are several implementations of the OCFL in active use around the globe, all of which can be found on the OCFL website, ocfl.io. Today, though, we are here to share with you our individual use cases involving the OCFL and talk about why we’ve opted to use it and how we’re doing that. If you have any questions about the OCFL specifically, Stefano and I are more than happy to try to answer them, but we are by no means the OCFL experts, so we will defer to them and encourage you to join the #ocfl channel on the Code4Lib Slack or reach out via the website, which I’ve linked to at the end.
  3. As I mentioned, the purpose of OCFL is to provide an application-independent approach to storing digital content. The specification dictates the way files are structured and written, and this offers many benefits in terms of digital preservation. These are the 5 main benefits offered by the specification. The first is parsability: files written to OCFL are stored in a simple, plain-text structure that is readable by both machines and humans, which means they can be understood in the absence of the original software. Next is robustness. OCFL provides checksums for both the content and metadata to ensure robustness against errors and data corruption between storage technologies. OCFL also offers native versioning. This is part of its core DNA. It uses a forward-delta algorithm, which eliminates unnecessary duplication between versions. Built into the specification is the principle of immutable versioning: everything is there and exists as versions, which allows an object’s history to persist. As I mentioned before, by nature OCFL allows for storage diversity. You can use any type of storage system you’d like, because the simple file system metaphor with its basic files and directories allows you to operate on disk or in the cloud. And lastly, OCFL offers completeness. What this means is that everything is preserved in the structure of the spec, including all the data and associated provenance, which allows you to theoretically rebuild your repository from the files you have. Should the unspeakable ever happen and the hardware fail, you can simply take your OCFL repo and stand it up again elsewhere, because it would be complete and preserved as such.
  4. As you can see, there is a lot to gain from implementing the OCFL specification. Now we are going to share a little bit about why each of our programs and institutions has chosen to incorporate this standard into our systems and software.
  5. Fedora is here to represent how we, as a software project, are taking advantage of OCFL and incorporating it into our core. We use the OCFL-java implementation of the spec, and this was the major feature improvement in Fedora 6. The community made the decision to use the OCFL standard within the persistence layer of Fedora in order to give our users back the transparency they were looking for and were used to from Fedora 3. OCFL replaced the ModeShape back end of Fedora 4, which was something of a black box. Making this decision required a major rewrite of the core software, but Fedora 6 + OCFL now gives us a very transparent and greatly enhanced long-term digital preservation tool. Fedora benefits from using OCFL for preservation for several reasons. Fedora itself provides a means of reading, writing and delivering digital files to your users, and OCFL provides the standard by which those files are preserved. If we consider long-term preservation, if Fedora were ever to go away, you have all the info within OCFL to stand up your repository again simply from the files on disk. The metadata is still intact, as well as all of the provenance required to meet preservation standards. And because of the standardized way that OCFL dictates the file system layout, migrations should be simpler going forward. There should be no need to reformat data in any way to move to newer versions of the software, as was the case with previous Fedora migrations. OCFL and Fedora together provide preservation that is also independent of the storage medium. This gives Fedora users more options for storing their objects. Since OCFL stores plain files, you can use whatever storage medium you choose, whether that be local storage or cloud storage. There is support within Fedora via the Java client for cloud storage. So to sum it all up: standards equal WHAT to do, and this combination of Fedora and OCFL provides the HOW.
It’s the combination of the two that provides the best possible software solution for long-term digital preservation.
  6. General info about LTS and DRS
  7. Mention downsides of the approach: restricted choice of solutions that conform to OCFL
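
As a rough illustration of the robustness and completeness points in notes 3 and 5, the sketch below checks an object's files against the checksums recorded in its `inventory.json` and rediscovers objects by walking a storage root. It is a minimal sketch and not the OCFL-java or Fedora code: the function names `file_digest`, `verify_object` and `rebuild_index` are hypothetical, and it assumes each object root holds a marker file whose name starts with `0=ocfl_object_` plus an `inventory.json` with the `id`, `head`, `digestAlgorithm` and `manifest` keys described in the OCFL specification.

```python
import hashlib
import json
import os
from pathlib import Path


def file_digest(path: Path, algorithm: str) -> str:
    """Return the lowercase hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_object(object_root: Path) -> list[str]:
    """Compare every file listed in the manifest with its recorded digest."""
    inventory = json.loads((object_root / "inventory.json").read_text(encoding="utf-8"))
    algorithm = inventory["digestAlgorithm"]  # e.g. "sha512"
    problems = []
    # The manifest maps each digest to the content paths that hold those bytes.
    for expected, content_paths in inventory["manifest"].items():
        for rel in content_paths:
            path = object_root / rel
            if not path.is_file():
                problems.append(f"missing: {rel}")
            elif file_digest(path, algorithm) != expected.lower():
                problems.append(f"digest mismatch: {rel}")
    return problems


def rebuild_index(storage_root: Path) -> dict[str, dict]:
    """Rediscover objects purely from the files on disk, with no database."""
    index = {}
    for dirpath, dirnames, filenames in os.walk(storage_root):
        # Assumption: an object root is marked by a file named 0=ocfl_object_<version>.
        if any(name.startswith("0=ocfl_object_") for name in filenames):
            root = Path(dirpath)
            inventory = json.loads((root / "inventory.json").read_text(encoding="utf-8"))
            index[inventory["id"]] = {
                "path": root.relative_to(storage_root).as_posix(),
                "head": inventory["head"],
                "problems": verify_object(root),
            }
            dirnames.clear()  # version directories are not separate objects
    return index
```

Calling `rebuild_index(Path("/path/to/storage-root"))` (a placeholder path) would return one entry per object found, keyed by object identifier, with an empty `problems` list when every file matches its recorded digest. Only plain files and JSON are read, which is the point of the parsability and completeness benefits: nothing beyond the files themselves is needed to find and check the objects.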