Hypatia for dlf 2011
Published in: Technology, Education
  • Hypatia
  • Transcript

    • 1. Hypatia: Hydra Platform for Access to Information in Archives. DLF Forum, Baltimore, October 31, 2011. Bradley Daigle, Julie Meloni, Tom Cramer, Michael Olson, Naomi Dushay (Stanford University, University of Virginia)
    • 2. Agenda
        • Introduction: Tom Cramer
        • AIMS: Bradley Daigle
        • Born Digital Materials (& Forensics): Michael Olson
        • Hypatia Functional Requirements: Michael Olson
        • Data Models & Loading: Naomi Dushay
        • Demonstration: Julie Meloni & Michael Olson
        • Q&A Discussion
        • In Sum & Looking Forward: Tom Cramer
    • 3. What is Hypatia?
        • Hydra Platform for Access to Information in Archives
        • Repository-powered solution for digital archival materials management, preservation and access
        • One component in a larger (eco)system for archivists
        • Open source software based on Hydra & Fedora
        • The potential nucleus of a larger, sustained, collaborative effort
    • 4. Origins
        • Outgrowth of AIMS project
        • Leveraging the Hydra project
        • Functional requirements and content from the AIMS partners (Virginia, Hull, Stanford, Yale)
        • Technical development by Stanford, Virginia & MediaShelf (contract)
    • 5. What is Hydra?
        • Partners
            • DuraSpace
            • Northwestern University
            • Notre Dame
            • Rock & Roll Hall of Fame
            • Stanford University
            • University of Hull
            • University of Virginia
            • + half dozen more in ramp-up mode
        • “Solution Bundles”
            • IR
            • ETDs
            • Research Data
            • Video
            • Images
            • Archives → Hypatia
            • Open Access Articles
            • Digitization Workflow
            • Digital Monograph Acquisitions
            • Exhibits
            • Digital Preservation
        • Technology: OSS stack featuring Fedora, Solr, Ruby on Rails, Blacklight
    • 6. The Workflow (diagram labels): Born Digital Materials; Forensic Extraction & Processing; Hypatia Repository; Object Management & Preservation; Arrangement & Description; EAD; Physical Materials; Discovery & Access
    • 7. Iterations & Enrichment (same diagram as slide 6, with two additions)
        • Two Phase Data Processing: reprocess for object-level access
        • EAD Enrichment: IDs and URLs for files/containers
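The "EAD Enrichment" step on slide 7 can be pictured with a small sketch. It is not Hypatia's own code: it assumes a simplified, namespace-free EAD 2002 fragment, assumes that links are recorded as `<dao href="...">` elements inside each component's `<did>`, and uses hypothetical component IDs and URLs (`ref10`, `repo.example.org`).

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified EAD fragment (no namespace, DTD-style attributes).
SAMPLE_EAD = """<ead><archdesc level="collection"><dsc>
  <c01 level="series"><did><unittitle>Series 1</unittitle></did>
    <c02 level="file" id="ref10"><did><unittitle>Disk 1</unittitle></did></c02>
  </c01>
</dsc></archdesc></ead>"""


def enrich_ead(root, urls_by_id):
    """Add a <dao href="..."> to the <did> of every component whose id is in urls_by_id.

    Returns the number of components that were enriched.
    """
    enriched = 0
    # Copy the element list first so that adding children does not disturb iteration.
    for component in list(root.iter()):
        component_id = component.get("id")
        if component_id not in urls_by_id:
            continue
        did = component.find("did")
        if did is None:
            continue
        dao = ET.SubElement(did, "dao")
        dao.set("href", urls_by_id[component_id])
        enriched += 1
    return enriched


ead_root = ET.fromstring(SAMPLE_EAD)
# Hypothetical repository URL for the object that represents "Disk 1".
enriched_count = enrich_ead(ead_root, {"ref10": "https://repo.example.org/objects/disk1"})
```

After the call, the finding aid component `ref10` carries a link back to the repository object, which is the kind of round trip between EAD and repository that the slide describes.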
    • 14. Functional Requirements Gathering
        • Created by AIMS Digital Archivists
        • January – March 2011
        • Initial focus on Arrangement and Description – a tool for archivists
        • Second focus on Discovery and Access
    • 17. How do we get to Hypatia 1
        • Archival Description – Encoded Archival Description (EAD)
            • Repository data
            • Collection data (title, extent, …)
            • Physical and intellectual arrangement
            • Challenges with EAD
            • Encoding standards are institution-specific
            • Doesn’t scale for born digital archives (100,000s of files)
    • 18. How do we get to Hypatia 2
        • Archival payload – disk images / files
            • Typically stored on obsolete media
            • Minimal descriptive metadata
    • 19. The POWER of Digital Forensics
        • Specialized software to help archivists preserve provenance by:
            • Migrating data off of legacy, at-risk media
            • Capturing create, modify, and last-accessed dates
            • Preserving original media file paths, OS and low-level formatting?
            • Preserving original applications, including fonts, that created the data
    • 20. Digital Forensic Processing
        • Archivists use commercial or open source software to tag large quantities of born digital archival materials
            • Keyword and pattern searches to find files that contain sensitive information (health records, credit card data, etc.)
        • Bulk edit tagging for restricted files, subject, source media (what disk did the file come from?)
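The pattern search on slide 20 can be illustrated with a minimal sketch. It is not how any particular forensic tool works internally; it assumes file contents are already available as text, uses a regular expression plus the Luhn checksum to spot likely credit card numbers, and uses a hypothetical `restricted` tag. All function names here are illustrative.

```python
import re

# Candidate card numbers: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return the digit strings in text that look like valid card numbers."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def tag_files(files: dict) -> dict:
    """Map each path to a list of tags; 'restricted' marks likely card data."""
    return {
        path: (["restricted"] if find_card_numbers(text) else [])
        for path, text in files.items()
    }
```

For example, `tag_files({"a.txt": "4111-1111-1111-1111", "b.txt": "call 555-1234"})` tags only `a.txt` as restricted, since 4111 1111 1111 1111 is a well-known Luhn-valid test number and the phone number is too short to match.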
    • 21. Active Fedora Solrizer =
    • 22. Diagram labels: Content (×3); Digital Content (×4)
    • 23. Atomistic Content Model (diagram labels): File (Asset) ×3; Exposed Object ×2; relationships: is_part_of ×3, is_member_of_collection, is_member_of
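One way to read the atomistic model on slide 23 is as a set of small objects joined by named relationships. The sketch below is plain Python, not ActiveFedora. The identifiers are hypothetical, and the direction of the relationships (file assets point to an exposed object, which points to a set and a collection) is an assumption drawn from the diagram labels.

```python
from dataclasses import dataclass, field


@dataclass
class RepoObject:
    """A repository object with RELS-EXT-style (predicate, target pid) relations."""
    pid: str
    model: str
    relations: list = field(default_factory=list)

    def relate(self, predicate: str, target: "RepoObject") -> None:
        self.relations.append((predicate, target.pid))


def related_to(target, objects, predicate):
    """Return pids of objects that assert `predicate` pointing at `target`."""
    return [o.pid for o in objects if (predicate, target.pid) in o.relations]


# Hypothetical identifiers, for illustration only.
collection = RepoObject("demo:collection", "Collection")
parent_set = RepoObject("demo:set1", "Set")
item = RepoObject("demo:item1", "ExposedObject")
item.relate("is_member_of_collection", collection)
item.relate("is_member_of", parent_set)

files = [RepoObject(f"demo:file{i}", "FileAsset") for i in (1, 2, 3)]
for file_asset in files:
    file_asset.relate("is_part_of", item)

all_objects = [collection, parent_set, item, *files]
file_pids = related_to(item, all_objects, "is_part_of")
```

Because each file is its own object, a file can be described, restricted, or reprocessed without touching its siblings, which is the main appeal of an atomistic model over one large compound object.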
    • 24. Diagram labels: Descriptive Metadata (MODS); DC (Fedora); RELS-EXT (model, parent, …) (Fedora); Content Metadata (Stanford); Rights Metadata (hydra)
    • 25. Diagram labels: EAD; Series (×3)
    • 26. Diagram labels: EAD; Digital Content (×6)
    • 27. Diagram labels: Digital Content; Disk Image; File (×8)
    • 28. Diagram labels: EAD; Collection; Set (×2); Disk Image (×3); jpg (×4); File (×9)
    • 29. Diagram labels: EAD; Collection; Set (×3); Disk Image (×2); FTK processing; File (×9)
    • 30. FTK output is not designed for this: //fo:page-sequence[2][fo:flow/fo:block[text()='Case Information']]/fo:flow/fo:table[1]/fo:table-body/fo:table-row[3]/fo:table-cell[2]/fo:block/text()
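To show what the XPath on slide 30 is doing, here is a hedged sketch that walks the same path with Python's standard library. The XSL-FO sample is invented for illustration (real FTK reports will differ), and because `xml.etree.ElementTree` does not support the nested `text()` predicate, that one filter is applied in Python instead.

```python
import xml.etree.ElementTree as ET

NS = {"fo": "http://www.w3.org/1999/XSL/Format"}

# Invented sample; only the shape matters: the second page-sequence holds a
# "Case Information" heading followed by a two-column table.
SAMPLE_FO = """<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:page-sequence><fo:flow><fo:block>Cover</fo:block></fo:flow></fo:page-sequence>
  <fo:page-sequence>
    <fo:flow>
      <fo:block>Case Information</fo:block>
      <fo:table>
        <fo:table-body>
          <fo:table-row>
            <fo:table-cell><fo:block>Label A</fo:block></fo:table-cell>
            <fo:table-cell><fo:block>Value A</fo:block></fo:table-cell>
          </fo:table-row>
          <fo:table-row>
            <fo:table-cell><fo:block>Label B</fo:block></fo:table-cell>
            <fo:table-cell><fo:block>Value B</fo:block></fo:table-cell>
          </fo:table-row>
          <fo:table-row>
            <fo:table-cell><fo:block>Label C</fo:block></fo:table-cell>
            <fo:table-cell><fo:block>Example case 001</fo:block></fo:table-cell>
          </fo:table-row>
        </fo:table-body>
      </fo:table>
    </fo:flow>
  </fo:page-sequence>
</fo:root>"""

CELL_PATH = "fo:flow/fo:table[1]/fo:table-body/fo:table-row[3]/fo:table-cell[2]/fo:block"


def case_field(root):
    """Return the text of row 3, cell 2 of the 'Case Information' table, or None."""
    for sequence in root.findall(".//fo:page-sequence[2]", NS):
        headings = [block.text for block in sequence.findall("fo:flow/fo:block", NS)]
        if "Case Information" not in headings:
            continue  # stands in for the [fo:flow/fo:block[text()=...]] predicate
        blocks = sequence.findall(CELL_PATH, NS)
        if blocks:
            return blocks[0].text
    return None


value = case_field(ET.fromstring(SAMPLE_FO))
```

The lookup depends entirely on page order, a heading string, and row and cell positions, so any change to the report layout silently breaks it. That fragility is the point the slide is making.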
    • 31. FTK practices vary
    • 32. Whole (digital) archival management
        • A Case Study: Feigenbaum Papers at Stanford, prior to SALT
        • Previously:
            • Files on a separate file store
                • Permissions management & preservation challenges
            • A distinct index with its own faceted browser (Flamenco)
            • A separate Drupal site for collection landing page
            • A separate MySQL db for tags
            • A separate Finding Aid, stored in Archivists' Toolkit
        • = a nightmare to synchronize, update and migrate
    • 33. Hypatia Benefits
        • Integrated solution for archival digital objects management & access
        • Granular permissions management for discover, read, edit, and administrate
        • Support for multiple arrangements: physical, logical, archival
        • Enables ongoing processing as resources become available
        • Integrated approach for digital preservation
    • 34. WIIFM?
        • Open source code base for digital archives management
        • Nucleus for further community development
        • Forensic toolkit patterns, best practices, Fedora loading scripts
        • Data models for EAD and digital archival objects in Fedora
        • Functional requirements for arrangement, description, discovery and access for digital archives
    • 35. Next Steps
        • Pilot usage in an archive processing digital materials
        • Another round of development
        • Development of bulk permissions management, arrangement & description
        • Experiment with UI and tools for archivists and for end users
    • 36. 80/20 – 8 Weeks of Development https://github.com/projecthydra/hypatia/graphs/impact
    • 37. Connect
        • Demo: http://hypatia-demo.stanford.edu
        • Wiki: https://wiki.duraspace.org/display/HYPAT/Home
        • List: http://hypatia-tech.googlegroups.com
        • Code: https://github.com/projecthydra/hypatia
        • Hydra: http://projecthydra.org
        • AIMS: http://www2.lib.virginia.edu/aims/
