Hot Topics: The DuraSpace Community Webinar Series
Series 9: Early Advantage: Introducing New Fedora 4.0 Repositories
Curated by David Wilcox, Fedora Product Manager, DuraSpace
“Fedora 4.0 in Action at Penn State and Stanford”
Wednesday, November 5, 1:00-2:00pm ET
Presented by:
David Wilcox, Fedora Product Manager, DuraSpace
Adam Wead, Developer, Pennsylvania State University and Tom Cramer, Chief Technology Strategist and Associate Director of Digital Library Systems and Services, Stanford University
11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”
1. Hot Topics: The DuraSpace
Community Webinar Series
Series Nine:
“Early Advantage: Introducing New
Fedora 4.0 Repositories”
Curated by David Wilcox,
Fedora Product Manager, DuraSpace
November 5, 2014 Hot Topics: DuraSpace Community Webinar Series
2. Webinar 2:
Fedora 4.0 in Action at
Penn State and Stanford
Presented by:
Adam Wead, Developer,
Pennsylvania State University
Tom Cramer, Chief Technology Strategist and Associate
Director of Digital Library Systems and Services,
Stanford University
3. Fedora 4.0 Status
• Wrapping up development this week
• Focus on testing and bug fixing
• Production release by end of year
• Next: F3 to F4 migrations
4. Beta Pilot Goals
• Test 4.0 features in a production-like
environment
• Gather feedback for 4.0 release
• Demonstrate diverse use cases
• Encourage early adoption of Fedora 4
5. Beta Pilot Outcomes
• Outcomes reported on the wiki
• Feedback rolled into 4.0 release
• Panel at CNI Fall Meeting
• Next round of Beta Pilots: F3 to F4
migrations
6. Fedora 4 Beta Pilot
Adam Wead, Analyst and Programmer
Penn State University
awead@psu.edu / @amsterdamos
7. Why do a beta pilot?
• currently use Fedora3 via Hydra
• Fedora is central to our mission to provide
repository services at Penn State
• further community development of Hydra and
related Fedora applications
• work with DuraSpace while development is
still active
8. Fedora at Penn State
ScholarSphere
• the institutional repository at Penn State
• version 2 released in September
• 3 years in production
• 4775 objects / 37GB data
• comprises academic publications and research
data from Penn State's faculty and students
9. Fedora at Penn State
ArchiveSphere
• archival collection management
• 72262 objects / 186GB data
• supports the efforts of the University’s
archivists
10. Fedora at Penn State
ETDFlow
• electronic theses and dissertations
• supports submission, approval, and
publication workflows
• assets are deposited into ScholarSphere upon
publication
• forthcoming application still under
development
11. Fedora at Penn State
Sufia
• core "engine" for all of Penn State's Hydra
applications
• began as the original ScholarSphere and was
extracted into a separate gem
• developed by the Hydra community
• enables intra-institutional use, development,
and support
12. Why Fedora 4?
• proven track record
• vested interest: Sufia, ScholarSphere, et al.
• continued community development with
Hydra
• new features!
13. Looking Forward to…
• native RDF support
• better support for large files
• clustering capabilities
• more flexible modeling of content and
metadata
• fixity checking
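Fixity checking, the last item above, generally means recomputing a checksum over the stored bytes and comparing it with the digest recorded at deposit. The following is a minimal sketch of that idea, assuming SHA-1 as the algorithm; `digest` and `check_fixity` are hypothetical helper names and this is not Fedora 4's own implementation.

```python
import hashlib


def digest(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hex digest of data using the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


def check_fixity(data: bytes, recorded: str, algorithm: str = "sha1") -> bool:
    """Return True if the recomputed digest matches the recorded one."""
    return digest(data, algorithm) == recorded.lower()


# Hypothetical content and the digest recorded when it was deposited.
original = b"thesis-final.pdf contents"
recorded_digest = digest(original)

print(check_fixity(original, recorded_digest))          # unchanged bytes
print(check_fixity(original + b"x", recorded_digest))   # altered bytes
```

A repository would typically run such a check on a schedule and report any mismatch rather than silently repairing it.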
14. Pilot Goals
• content "remodeling"
• compatibility with ActiveFedora
• migration
15. Sufia Models: Fedora3
• RDF triples for all
descriptive metadata
• must be stored as text
file in a datastream
• Hydra handles the
CRUD operations
• Fedora only sees a
related datastream
16. Sufia Models: Fedora4
• native RDF for any object
or resource
• no attached file of triples
• persisted in the Fedora
object as RDF
• binary content and
related files are child
resources
• child resources can have
RDF too, just like their
parent objects
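The contrast between the two modeling slides can be sketched as plain data structures. This is only an illustration of the shapes described above, not Sufia or ActiveFedora code; the class names, the `descMetadata` datastream name, and all identifiers are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
DC_TITLE = "http://purl.org/dc/terms/title"


@dataclass
class Fedora3Object:
    """Fedora 3 style: RDF is opaque text inside a named datastream."""
    pid: str
    datastreams: Dict[str, bytes] = field(default_factory=dict)


@dataclass
class Fedora4Resource:
    """Fedora 4 style: RDF properties belong to the resource itself."""
    path: str
    properties: List[Triple] = field(default_factory=list)
    binary: Optional[bytes] = None
    children: List["Fedora4Resource"] = field(default_factory=list)


def count_resources(resource: Fedora4Resource) -> int:
    """Count a resource plus all of its descendants."""
    return 1 + sum(count_resources(child) for child in resource.children)


# Hypothetical identifiers, for illustration only.
triples_text = '<info:fedora/example:1> <%s> "My thesis" .\n' % DC_TITLE
f3 = Fedora3Object(
    pid="example:1",
    datastreams={"descMetadata": triples_text.encode("utf-8")},
)

f4 = Fedora4Resource(
    path="/example/1",
    properties=[("/example/1", DC_TITLE, "My thesis")],
    children=[
        Fedora4Resource(
            path="/example/1/content",
            properties=[("/example/1/content", DC_TITLE, "thesis.pdf")],
            binary=b"%PDF-1.4",
        )
    ],
)

print(count_resources(f4))
```

In the first shape the repository only holds bytes and the application must parse them; in the second, both the parent and the child carry their own triples.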
17. ActiveFedora
• integration point between Hydra and Fedora4
• code sprint underway to finish outstanding
issues
• alpha release targeted for this month
• Sufia+Fedora4 work following concurrently
18. Migration
• currently have a working proof-of-concept
• uses Hydra stack component to move content
• waiting on ActiveFedora and Fedora4 release
• migration testing in early December
• deploy migrated content to production in
January
19. Current Status
• working with DuraSpace for 4.0 release
• code sprinting
• communicating progress to the community
• always seeking feedback
• willing to share
20. Thank You
Adam Wead
Penn State University
awead@psu.edu / @amsterdamos
21. Links
• Pilot project information
• Sufia
• ActiveFedora
• Migration project
22. Exercising Fedora as a
Linked Data Repository
Introducing Triannon and
Stanford’s Fedora 4 Beta Pilot
November 2014
Tom Cramer
Chief Technology Strategist
Stanford University Libraries
@tcramer
23. Use Case 1: Digital Manuscript Annotations
Parker on the Web
24. Use Case 1: Digital Manuscript Annotations
Parker on the Web
Image annotation & transcription tools
25. Use Case 1: Digital Manuscript Annotations
Image annotation & transcription tools
Parker on the Web
Open Annotation RDF (AKA Linked Data)
26. Use Case 1: Digital Manuscript Annotations
Parker on the Web
Image annotation & transcription tools
Open Annotation RDF (AKA Linked Data)
We have tens of thousands of scholarly annotations expressed
as RDF triples, enriching digital resources in our repositories.
• Where can we store it?
• How can we manage it?
• How can we retrieve it for visualization in new environments?
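To make "annotations expressed as RDF triples" concrete, here is a minimal sketch of one Open Annotation written as JSON-LD with fully expanded IRIs. The `oa:` and `cnt:` terms come from the Open Annotation data model; the `make_annotation` helper and every `example.org` identifier are assumptions for illustration and are not real Parker on the Web URLs.

```python
import json

OA = "http://www.w3.org/ns/oa#"
CNT = "http://www.w3.org/2011/content#"


def make_annotation(anno_id: str, target: str, text: str) -> dict:
    """Build a JSON-LD node for a commenting annotation with a text body."""
    return {
        "@id": anno_id,
        "@type": OA + "Annotation",
        OA + "motivatedBy": {"@id": OA + "commenting"},
        OA + "hasTarget": {"@id": target},
        OA + "hasBody": {
            "@type": CNT + "ContentAsText",
            CNT + "chars": text,
        },
    }


# Hypothetical identifiers.
anno = make_annotation(
    "http://example.org/annos/1",
    "http://example.org/manuscripts/ms-001/page/12",
    "Marginal note in a later hand.",
)

print(json.dumps(anno, indent=2))
```

Each key/value pair corresponds to one triple about the annotation, which is why a store that handles RDF natively is attractive for this use case.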
27. Use Case 2: Linked Data for Libraries (LD4L)
Bibliographic
Data
• MARC
• MODS
• EAD
Person Data
• VIVO
• ORCID
• ISNI
• VIAF
Usage Data
• Circulation
• Citation
• Curation
• Exhibits
• Research
Guides
• Syllabi
• Tags
28. Use Case 2: Linked Data for Libraries (LD4L)
Use Case 1.1: Build a virtual collection
As a faculty member or librarian, I want to create a
virtual collection or exhibit from multiple collections,
so that I can share a focused collection with a <class,
set of researchers, set of students in a disciplinary
area>.
Use Case 1.2: Tag scholarly information resources
to support reuse
As a librarian, I would like to be able to tag
scholarly information resources into curated lists,
so that I can feed these lists into subject
guides, course reserves, or reference collections.
https://wiki.duraspace.org/display/ld4l/LD4L+Use+Cases
29. [Diagram] Areas: 1. Annotations, 2. Authorities, 3. Bibliographic, 4. Linked Open Data
Labels: Annos, MARC (Auth), MARC (Bib), Annotator, GeoBL, Application, SDR, Graph, Search, OCLC, HighWire, CAP
Labels: Work, Instance, Holding
Labels: Person, Organization, Place, Subject, Classification
Labels: Comment, Tag, Review
Labels: Agents, Places, Topics, Biblio
30. [Diagram] Areas: 1. Annotations, 2. Authorities, 3. Bibliographic, 4. Linked Open Data
• Circulation data
• Citation data
• Curation data: virtual collections, exhibits, reading lists, tags, etc.
• Require a store for RDF annos and body of annotation (any arbitrary bitstream)
• Need to persist, manage, index
• NOT the ILS nor core repository
• All RDF / linked data
Labels: Annos, MARC (Auth), MARC (Bib), Annotator, GeoBL, Application, SDR, Graph, Search, OCLC, HighWire, CAP
Labels: Work, Instance, Holding
Labels: Person, Organization, Place, Subject, Classification
Labels: Comment, Tag, Review
Labels: Agents, Places, Topics, Biblio
31. What should we use for these two use cases?
1.) Digital Manuscript annotations
2.) Linked data for libraries
33. What should we use for these two use cases?
1.) Digital Manuscript annotations
2.) Linked data for libraries
Native RDF store
Manage assets (bitstreams)
Built in service framework
Versioning, indexing, APIs
Easy to deploy
Looking for real world use cases!
34. A note about LDP: Linked Data Platform
• W3C draft specification
• Enables read-write operations of linked data via HTTP
• Developed at same time as Fedora 4
• Fedora 4 one of a handful of current LDP
implementations
• See http://www.w3.org/TR/ldp/
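As an illustration of "read-write operations of linked data via HTTP", the sketch below builds, but does not send, a POST that asks an LDP container to create a new resource from a Turtle body. The container URL, the `build_create_request` helper, and the slug are assumptions for the example; check your own deployment for the real endpoint.

```python
import urllib.request

# Hypothetical container URL; adjust to your own deployment.
CONTAINER = "http://localhost:8080/rest/annotations"

# In Turtle, <> refers to the resource being created.
TURTLE = (
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
    '<> dcterms:title "Example annotation" .\n'
)


def build_create_request(container_url: str, turtle: str, slug: str):
    """Build (but do not send) a POST asking a container to create a child."""
    return urllib.request.Request(
        container_url,
        data=turtle.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "text/turtle", "Slug": slug},
    )


req = build_create_request(CONTAINER, TURTLE, "anno-1")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; under LDP, a server that creates the resource answers 201 Created with a Location header naming the new resource, and the Slug header is only a naming hint.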
35. Stanford Fedora 4 Beta Pilot
• Install, configure & deploy Fedora 4
• Exercise LDP API for storing annotations and associated text/binary objects
• Develop support for RDF references to external objects
• Test scale with millions of small objects
• Integrate with read/write apps and operations
• Annotation tools for write: e.g., Annotator
• Indexing & Visualization for read: solr & Blacklight, Mirador
• See https://wiki.duraspace.org/display/FF/Beta+Pilot+-+Stanford
36. Architecture and Data Flow
[Diagram components]
• Annotator
• Mirador (Annotation Maker and Viewer)
• Triannon (Rails engine for Open Annotations stored in Fedora 4)
• Fedora 4
[Diagram labels]
• json-ld
• LDP
• Accept & Return Annotation
• Validation and Serialization
• Transformation for LDP
• Interact with F4 via LDP
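The roles named in this diagram (accept an annotation, validate it, transform it for LDP) can be sketched as three small functions. This is a hypothetical Python illustration of the flow only; Triannon itself is a Rails engine, and none of the function names below come from its code. The sketch assumes a single target given as an object with an `@id`.

```python
import json

OA = "http://www.w3.org/ns/oa#"


def validate(anno: dict) -> list:
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    if anno.get("@type") != OA + "Annotation":
        problems.append("@type must be oa:Annotation")
    target = anno.get(OA + "hasTarget")
    if not isinstance(target, dict) or not target.get("@id"):
        problems.append("oa:hasTarget with an @id is required")
    return problems


def to_turtle(anno: dict) -> str:
    """Serialize type and target as Turtle, with <> as the new resource."""
    target = anno[OA + "hasTarget"]["@id"]
    return "<> a <%sAnnotation> ;\n   <%shasTarget> <%s> .\n" % (OA, OA, target)


def accept(json_ld_text: str) -> str:
    """Parse JSON-LD text, validate it, and return a Turtle body for LDP."""
    anno = json.loads(json_ld_text)
    problems = validate(anno)
    if problems:
        raise ValueError("; ".join(problems))
    return to_turtle(anno)


# Hypothetical incoming annotation from a client.
incoming = json.dumps({
    "@type": OA + "Annotation",
    OA + "hasTarget": {"@id": "http://example.org/page/12"},
})

print(accept(incoming))
```

Rejecting invalid input before anything is written keeps malformed annotations out of the repository, which is one reason to place an engine between the clients and Fedora 4.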
37. Architecture and Data Flow: Future
[Diagram components]
• Annotator
• Mirador (Annotation Maker and Viewer)
• Triannon (Rails engine for Open Annotations stored in Fedora 4)
• Fedora 4
• Future: Blacklight, Solr
[Diagram labels]
• json-ld
• LDP
• Accept & Return Annotation
• Validation and Serialization
• Transformation for LDP
• Interact with F4 via LDP
38. What We’ve Learned To Date
• Fedora 4 approaching 100% LDP 1.0 Compliant
• Triannon at alpha stage
• Can write, read & delete Open Annotations to/from Fedora 4
• Still to come
• Updates to annotations
• Storage of binary blobs (Annotation bodies) in Fedora 4
• Implement authn/z
• Deploy against real annotation clients
• Populate with data at scale
• Work will continue throughout 2015
• Triannon: https://github.com/sul-dlss/triannon
• Mirador: https://github.com/IIIF/m2
39. Futures
• A wealth of tools…
• Enriching digital
objects and records
through…
• Annotating
• Tagging
• Curating
• Stored as linked data
natively
• Using Fedora 4 as a
management platform
[Diagram labels]
Mirador/Annotator
Image anno. tools
Blacklight-based apps
Any OA-compatible annotations
Open Annotation tools
40. Webinar 2:
Fedora 4.0 in Action at
Penn State and Stanford
Questions