Fedora migration considerations
Juliet L. Hardesty
Metadata Analyst, Indiana University
Open Repositories, June 15, 2016
Fedora 3 diagram key
Repository object
Structural metadata datastream
Descriptive metadata datastream
Other metadata datastream
Indiana University President's Office records, 1937-1962.
Subject files, 1937-1962. Aeons, Board of, 1939-1940.
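Fedora 3 - documents
[Diagram: three linked repository objects and their datastreams]
Folder: VAA8877-06169 (Fedora object; datastreams: DC, RELS-EXT, METS)
Document in folder: VAA8877-U-03624 (Fedora object; datastreams: DC, RELS-EXT, PDF)
Page in document: VAA8877-U-03624-001 (Fedora object; datastreams: DC, RELS-EXT, MASTER, Derivatives, MASTER-MIX)
Relationships shown: isMemberOfCollection, isPartOf, isPartOf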
Portable soft drink stand at Bowling Green
Fedora 3 - images
[Diagram: one repository object and its datastreams]
Image: P02668 (Fedora object; labels: DC, RELS-EXT, METS, PURL Redirect, DCMODS)
Relationship shown: isMemberOfCollection
[Program, 2013-2014, no. 117]
Fedora 3 – time-based media
[Diagram: one item object with two section objects]
Item: avalon:22187 (Fedora object; datastreams: DC, RELS-EXT, MODS, RIGHTS, SECTIONS, TECH/DISPLAY, WORKFLOW)
CD 1 and CD 2: avalon:22189 and avalon:22191 (Fedora objects; datastreams on each: DC, RELS-EXT, STRUCTURE, MATTERHORN, TECH/DISPLAY)
Relationships shown: isMemberOfCollection, isPartOf, isPartOf
Focusing on Fedora 3 structure
• RELS-EXT defines structure going up
• METS or special datastreams required to define structure going down
• All in XML as datastreams on objects that connect together down to the digital file
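The "structure going up" point can be illustrated with a minimal sketch that reads the relationships out of a RELS-EXT datastream. The sample XML and the PIDs in it (demo:page1, demo:doc1, demo:collection1) are hypothetical; only the RDF and Fedora relations-external namespaces are standard.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
REL_NS = "info:fedora/fedora-system:def/relations-external#"

# Hypothetical RELS-EXT datastream for a page object (all PIDs are made up).
RELS_EXT = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rel="info:fedora/fedora-system:def/relations-external#">
  <rdf:Description rdf:about="info:fedora/demo:page1">
    <rel:isPartOf rdf:resource="info:fedora/demo:doc1"/>
    <rel:isMemberOfCollection rdf:resource="info:fedora/demo:collection1"/>
  </rdf:Description>
</rdf:RDF>"""


def upward_links(rels_ext_xml):
    """Return (subject, relationship, target) for each RELS-EXT relationship."""
    root = ET.fromstring(rels_ext_xml)
    links = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for rel in desc:
            # Keep only elements from the relations-external namespace.
            if rel.tag.startswith(f"{{{REL_NS}}}"):
                name = rel.tag.split("}", 1)[1]
                links.append((subject, name, rel.get(f"{{{RDF_NS}}}resource")))
    return links


links = upward_links(RELS_EXT)
for subject, name, target in links:
    print(subject, name, target)
```

Each link points from a child to its parent, so finding the children of an object needs either a search across all objects or a separate structural datastream such as METS.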
Portland Common Data Model (PCDM)
https://github.com/duraspace/pcdm/wiki
Document example in PCDM
Wilcox, David and Andrew Woods. “Hands-On: Seeing Fedora 4 Firsthand.” Fedora 4 Training Workshop. Open Repositories 2015.
METS: fileSec
METS: structMap
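Fedora 4 structure diagram
[Diagram: PCDM objects linked by pcdm:hasMember]
Objects: VAA8877-06169, VAA8877-U-03626, VAA8877-U-03626-page1
Proxies: VAA8877-U-03626Proxy, VAA8877-U-03626-page1Proxy
Files: thumb.jpg, screen.jpg, large.jpg
Other labels: files, documents, pages
Relationships shown: pcdm:hasMember, pcdm:hasMember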
http://localhost:8080/fcrepo/rest/VAA8877-U-03625/pages/VAA8877-U-03625-page1Proxy
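METS as PCDM in Fedora – SPARQL query
# Prefix declarations added so the query is self-contained
PREFIX pcdm: <http://pcdm.org/models#>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
PREFIX iana: <http://www.iana.org/assignments/relation/>
PREFIX ebucore: <http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?document ?firstPage ?nextPage WHERE {
  {
    # First page of each document that is a member of the folder
    <http://localhost:8080/fcrepo/rest/VAA8877-06169> pcdm:hasMember ?document .
    ?document iana:first ?firstPageProxy .
    ?firstPageProxy ore:proxyFor ?firstPage .
  }
  UNION
  {
    # Page that follows each JPEG-bearing page anywhere below the folder
    <http://localhost:8080/fcrepo/rest/VAA8877-06169> pcdm:hasMember+ ?page .
    ?page pcdm:hasFile ?f .
    ?f ebucore:hasMimeType "image/jpeg"^^xsd:string .
    ?pageProxy ore:proxyFor ?page .
    ?pageProxy iana:next ?nextPageProxy .
    ?nextPageProxy ore:proxyFor ?nextPage .
  }
}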
SPARQL query results for structure of folder VAA8877-06169
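The proxy pattern the query relies on (iana:first, iana:next, ore:proxyFor) can also be walked directly to list pages in order. The following is a minimal in-memory sketch, not the presenter's code: the triples and resource names (doc1, page1, and so on) are hypothetical, and only the localhost base URL comes from the slides.

```python
BASE = "http://localhost:8080/fcrepo/rest/"

# Hypothetical proxy triples in the shape the query walks (names are made up).
triples = [
    (BASE + "doc1", "iana:first", BASE + "doc1/pages/page1Proxy"),
    (BASE + "doc1/pages/page1Proxy", "ore:proxyFor", BASE + "page1"),
    (BASE + "doc1/pages/page1Proxy", "iana:next", BASE + "doc1/pages/page2Proxy"),
    (BASE + "doc1/pages/page2Proxy", "ore:proxyFor", BASE + "page2"),
    (BASE + "doc1/pages/page2Proxy", "iana:next", BASE + "doc1/pages/page3Proxy"),
    (BASE + "doc1/pages/page3Proxy", "ore:proxyFor", BASE + "page3"),
]


def ordered_pages(document, triples):
    """Follow iana:first, then iana:next, collecting each proxied page."""
    lookup = {(s, p): o for s, p, o in triples}
    pages = []
    seen = set()  # guards against a cycle in the proxy chain
    proxy = lookup.get((document, "iana:first"))
    while proxy is not None and proxy not in seen:
        seen.add(proxy)
        page = lookup.get((proxy, "ore:proxyFor"))
        if page is not None:
            pages.append(page)
        proxy = lookup.get((proxy, "iana:next"))
    return pages


pages = ordered_pages(BASE + "doc1", triples)
print(pages)
```

Order lives on the proxies rather than on the pages themselves, which is why the query has to go through ore:proxyFor at every step.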
Transforming METS to PCDM
• Consider structure for collections using METS
• fileSec and structMap
– @GROUPID, @ID, @FILEID (unique identifiers from both sections), in combination with the <div> structure within structMap, can point to the grouping and ordering structure for objects
• Migration is easier if identifiers are brought over
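One way to picture the mapping is a minimal sketch that reads a METS fileSec and structMap and emits PCDM-style triples. Everything specific here is an assumption for illustration: the sample METS, its identifiers, the two-level div layout with an ORDER attribute on every page, and the /pages/ and /files/ URI patterns. It is not a standard mapping.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
NS = {"mets": METS_NS}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
BASE = "http://localhost:8080/fcrepo/rest/"

# Hypothetical METS: one document div containing two page divs.
METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="master">
      <mets:file ID="F1" GROUPID="G1">
        <mets:FLocat LOCTYPE="URL" xlink:href="page1.tif"/>
      </mets:file>
      <mets:file ID="F2" GROUPID="G2">
        <mets:FLocat LOCTYPE="URL" xlink:href="page2.tif"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap TYPE="physical">
    <mets:div TYPE="document" ID="doc1">
      <mets:div TYPE="page" ORDER="2" ID="page2"><mets:fptr FILEID="F2"/></mets:div>
      <mets:div TYPE="page" ORDER="1" ID="page1"><mets:fptr FILEID="F1"/></mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def mets_to_pcdm(mets_xml, base):
    """Map document/page divs to pcdm:hasMember plus an ordered proxy chain."""
    root = ET.fromstring(mets_xml)
    # fileSec: file ID -> location, so structMap FILEID pointers can be resolved.
    files = {}
    for f in root.iter(f"{{{METS_NS}}}file"):
        files[f.get("ID")] = f.find("mets:FLocat", NS).get(XLINK_HREF)
    triples = []
    for doc in root.findall("mets:structMap/mets:div", NS):
        doc_uri = base + doc.get("ID")
        # Assumes every page div carries a numeric ORDER attribute.
        pages = sorted(doc.findall("mets:div", NS), key=lambda d: int(d.get("ORDER")))
        previous_proxy = None
        for page in pages:
            page_uri = base + page.get("ID")
            proxy_uri = f"{doc_uri}/pages/{page.get('ID')}Proxy"
            triples.append((doc_uri, "pcdm:hasMember", page_uri))
            triples.append((proxy_uri, "ore:proxyFor", page_uri))
            if previous_proxy is None:
                triples.append((doc_uri, "iana:first", proxy_uri))
            else:
                triples.append((previous_proxy, "iana:next", proxy_uri))
            previous_proxy = proxy_uri
            for fptr in page.findall("mets:fptr", NS):
                file_name = files[fptr.get("FILEID")]
                triples.append((page_uri, "pcdm:hasFile", f"{page_uri}/files/{file_name}"))
        if previous_proxy is not None:
            triples.append((doc_uri, "iana:last", previous_proxy))
    return triples


triples = mets_to_pcdm(METS, BASE)
for triple in triples:
    print(triple)
```

Because the sketch reuses the METS @ID values in the new URIs, objects stay traceable across the migration, which is the point of bringing identifiers over.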
Considerations for structural metadata
• RELS-EXT in Fedora 3 migrated to Fedora 4
• Other structure datastreams are on you to map
• METS – possible to map programmatically but maybe not in a standard way
Descriptive Metadata - IU
MODS
Fedora 4 options for descriptive metadata: Option 1
• Migration tools
– migration-utils
– fedora-migrate gem
Cons: Neither really takes advantage of Fedora 4/external triplestore unless the original is already RDF; fedora-migrate requires Hydra
Pros: Available now, keeps all metadata, nothing lost; can have RDF statements on the object if already in RDF (fedora-migrate)
Descriptive metadata to Fedora 4: Option 2
• Map only simple statements to RDF
– Minimal descriptive metadata (title, date) or descriptive metadata indexed for discovery (title, date, creator, type, subject, genre, language)
– Use ontologies that allow for simple statements
Cons: Not all metadata is in RDF; changes in ontologies/standards might not be a 1:1 match with the original
Pros: Creates RDF statements in Fedora 4/external triplestore; great option if metadata is DC or another non-hierarchical standard
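Option 2 can be sketched as a small lookup from flat MODS elements to single RDF statements. The MODS record, the subject URI, and the choice of DCMI Terms properties below are hypothetical choices for illustration, not the mapping used at IU.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical mapping from simple MODS paths to DCMI Terms properties.
SIMPLE_MAP = [
    ("mods:titleInfo/mods:title", DCTERMS + "title"),
    ("mods:originInfo/mods:dateCreated", DCTERMS + "created"),
    ("mods:genre", DCTERMS + "type"),
]

# Hypothetical MODS record; the nested subject is deliberately left unmapped.
MODS = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:titleInfo><mods:title>Example photograph</mods:title></mods:titleInfo>
  <mods:originInfo><mods:dateCreated>1940</mods:dateCreated></mods:originInfo>
  <mods:genre>Photographs</mods:genre>
  <mods:subject>
    <mods:topic>Example topic</mods:topic>
    <mods:geographic>Example place</mods:geographic>
  </mods:subject>
</mods:mods>"""


def escape(value):
    """Escape a string for use as an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def simple_statements(mods_xml, subject_uri):
    """Emit one N-Triples line per mapped, non-empty MODS element."""
    root = ET.fromstring(mods_xml)
    lines = []
    for path, predicate in SIMPLE_MAP:
        for element in root.findall(path, MODS_NS):
            text = (element.text or "").strip()
            if text:
                lines.append(f'<{subject_uri}> <{predicate}> "{escape(text)}" .')
    return lines


lines = simple_statements(MODS, "http://localhost:8080/fcrepo/rest/example")
print("\n".join(lines))
```

The nested subject produces no statement at all, which shows the trade-off: anything hierarchical stays behind in the original XML unless that is kept alongside.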
Descriptive metadata to Fedora 4: Option 3
• Map complex/hierarchical information into external triplestore
– All RDF statements (simple and complex) go into external triplestore but only simple statements are on Fedora 4 object
Cushman photograph – Fedora 4
Cushman photograph – external triplestore
Cons: Separates metadata between repository and triplestore; problematic if original metadata is not kept in the repository
Pros: More/all metadata available as RDF; updates easier to manage through triplestore/Fedora 4 functionality
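The split in Option 3 can be sketched as a filter over a set of statements. This is a minimal sketch under stated assumptions: blank nodes (terms starting with "_:") stand in for nested MODS structure, and the ex: properties, the resource URI, and the sample values are all hypothetical.

```python
RESOURCE = "http://localhost:8080/fcrepo/rest/example"

# Hypothetical statements; the blank node _:s1 models a nested subject.
statements = [
    (RESOURCE, "dcterms:title", '"Example photograph"'),
    (RESOURCE, "dcterms:created", '"1940"'),
    (RESOURCE, "ex:subject", "_:s1"),
    ("_:s1", "ex:topic", '"Example topic"'),
    ("_:s1", "ex:geographic", '"Example place"'),
]


def is_blank(term):
    """True for blank node identifiers such as _:s1."""
    return term.startswith("_:")


def split_statements(resource, statements):
    """Return (statements for the Fedora 4 object, statements for the triplestore)."""
    # Every statement, simple or complex, goes to the external triplestore.
    triplestore = list(statements)
    # Only flat statements about the resource itself stay on the Fedora 4 object.
    on_object = [
        (s, p, o) for s, p, o in statements
        if s == resource and not is_blank(o)
    ]
    return on_object, triplestore


on_object, triplestore = split_statements(RESOURCE, statements)
print(len(on_object), "on the Fedora 4 object;", len(triplestore), "in the triplestore")
```

Here the nested subject exists only in the triplestore, which is the separation the slide warns about when the original metadata is not also kept in the repository.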
Descriptive metadata to Fedora 4: Almost Option 4
• MODS “unofficial standard” mapping to RDF
– MODS and RDF Group’s mapping/transformation scenario (http://mods2rdf.xyz/)
– Available as conversion code to see MODS mapped into Fedora 4
Cons: Work in progress (not always up, transforming few elements); you might not agree with suggested mappings
Pros: Available now to try; might offer a standard way to transform MODS similarly across institutions
Considerations for descriptive metadata
• Current state of descriptive metadata
• Management needs of Fedora 4
• Transform to RDF statements or also to Linked Data
• Decision: keep original descriptive metadata or not?
– At IU, keep original descriptive metadata (for full item view)
Thank you!
• Julie Hardesty
• jlhardes@iu.edu
• @jlhardes

Editor's Notes

  • #20 HyBox idea - https://github.com/projecthydra-labs/hybox-ideas/issues/19 Ordering question on Fedora-tech - https://groups.google.com/forum/#!topic/fedora-tech/1dlfy9Nx76Q
  • #25 Can add the MODSRDF namespace and add properties using that namespace. Tried using mods:genre since it is a flat, non-hierarchical element in XML; even brought in a URI for the genre "Documentary films" from the LC Genre/Form Terms vocabulary. Problem: this is not MODSRDF. MODSRDF is often complex (nested) RDF. How does it work to add a complex MODS field to an object?