Audio and Video Repositories at Scale - Indiana University’s Media Digitization and Preservation Initiative

Jon Dunn spoke about Indiana University's Media Digitization and Preservation Initiative (MDPI) at Open Repositories 2014 on June 12, 2014.

  1. Audio and Video Repositories at Scale: Indiana University’s Media Digitization and Preservation Initiative. Jon Dunn, Indiana University Libraries. Open Repositories 2014, Helsinki, Finland, June 12, 2014
  2. THE END IS NEAR!!! (for physical time-based media objects)
  3. Degradation: All analog and physical digital media objects are actively degrading, some catastrophically. Vinegar Syndrome, Sticky Shed Syndrome, Fungus, Delamination, Plasticizer Exudation, Color Fading, Curling, Windowing, Shedding, Undiagnosed Unplayability, Mechanical Issues, Shrinkage, Scratches, Breaking, Hydrolysis, Oxidation, Efflorescence, Binder Breakdown, Cupping, Crystalline Residue
  4. Degradation
  5. Obsolescence • Media formats • Equipment (playback machines, test devices) • Repair parts • Playback expertise • Repair expertise • Tools • Supplies
  6. How long do we have? • Short time window in which to migrate or lose forever • 10-15 years?
  7. Indiana University • State university • Founded in 1820 • 8 campuses • 110,000 students • 18,000 faculty and staff
  8. Indiana University Bloomington • Flagship IU campus • Major research university • Strong arts and humanities focus • 46,000 students • 8,500 faculty and staff
  9. IU Repository and Digital Library Environment • Started with audio, 1992 • Repository infrastructure: Fedora, Hydra, DSpace • Strong collaboration between Libraries and University IT Services • Strong enterprise and research storage infrastructure
  10. Context: IU Collections • School of Music • Archives of Traditional Music • Kinsey Institute • Moving Image Archive • Lilly Library • University Archives • Athletics • International collections
  11. Sound Directions, 2007 • IU, Harvard • Best practices for audio preservation digitization and storage
  12. Media Preservation Survey, 2008. Findings: • 569,000 items • Unique and valuable collections • 51 physical formats • 80 units at IUB. Recommendations: • Digitize within 15-20 years • Form task force to plan
  13. Meeting the Challenge Report, 2012. Recommendations: • Create center to digitize 300K rare or unique objects within 15 years • Audio, video for preservation • Film for access
  14. Media Digitization and Preservation Initiative (MDPI), 2013 • Announced by President McRobbie in October 2013 • Goal: Digitize, preserve, provide access to rare and unique audio and video by 2020, on all IU campuses • Expect to start in 2014
  15. Public-Private Partnership • Memnon Archiving Services • Brussels, Belgium
  16. Digitization and Preservation Phases • Pre-Digitization: Inventory, Catalog, Prioritize, Batch & Queue • Digitization (IU Operation and Memnon): Massive parallel digitization, Digitization of selected unique and highly vulnerable formats, Quality control • Discovery and Access: Metadata, Rights Issues, Technical aspects • Digital Preservation and Storage: Metadata, Technical infrastructure, Ongoing monitoring and migration
  17. Bins, Boxes, Barcodes, Batches
  18. Workflow for Picking and Packing
  19. Batches: Format Based
  20. Workflow Support: The POD
  21. High Volume • 1500+ objects per week • Peak of ~12 TB per day of digitization • Total storage requirement: 9 PB
  22. Post-Digitization Workflow
  23. Post-Digitization Workflow
  24. Metadata for Preservation and Access • Descriptive – From MARC when available – EAD finding aids, local inventories • Technical – File and original object characteristics – Checksums • Process History – Digitization and preservation process • Structural – Support relationships between and navigation within objects
  25. IU Scholarly Data Archive (SDA) • Central storage resource for IU since 1999 • Data replicated between Indianapolis and Bloomington • Implemented using IBM HPSS software • 40 PB of IBM TS1140-based capacity • Available from desktop, web, high speed transfer protocols • Currently holds ~6 PB of data in 36 million files
  26. Access Technology • Discovery: IUCAT, Archives Online (EAD Finding Aids), other environments? • Delivery: Key requirements are usability, reusability, access control, performance • Avalon Media System
  27. Long-Term Preservation Technology • 12 petabytes+ to be preserved • Local storage • UITS Scholarly Data Archive • Fedora 4 repository layer • Out-of-region storage • APTrust, DPN • Data swap agreements
  28. MDPI Assets and Opportunities • Collaborative planning process – Libraries, UITS, archives, faculty • Research storage and IT resources – 40 petabyte mirrored HSM system – Experience with data-intensive workflows – High-performance networks • Expertise in digitization, repositories, access systems – Variations, Avalon Media System • Potential for external services
  29. MDPI Challenges • Dealing with rights issues at scale • Descriptive metadata and discovery • Quality control strategies for mass digitization • Strategies for born-digital media • Out-of-region preservation storage • Approach for film
  30. Stay Tuned… mdpi.iu.edu avalonmediasystem.org
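
The volume figures on slides 13, 21, and 25 can be cross-checked with some quick arithmetic. The Python sketch below is illustrative only and not part of the talk. It assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes), treats the "1500+ objects per week" figure as a steady floor rate applied to the 300K-object target, and uses function names invented for this example.

```python
# Back-of-the-envelope checks on the figures quoted in slides 13, 21, and 25.
# Assumption: decimal units, which the slides do not state explicitly.
TB = 10**12
PB = 10**15
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_mb_per_s(bytes_per_day):
    """Average rate in MB/s needed to move bytes_per_day within 24 hours."""
    return bytes_per_day / SECONDS_PER_DAY / 10**6


def weeks_to_digitize(total_objects, objects_per_week):
    """Weeks needed at a steady weekly rate (a simplifying assumption)."""
    return total_objects / objects_per_week


def average_file_size_mb(total_bytes, file_count):
    """Mean file size in MB for a store of total_bytes spread over file_count files."""
    return total_bytes / file_count / 10**6


# Slide 21: peak of ~12 TB per day of digitization output.
peak_mb_per_s = sustained_rate_mb_per_s(12 * TB)

# Slides 13 and 21: 300K objects at a floor of 1500 objects per week.
weeks_needed = weeks_to_digitize(300_000, 1_500)

# Slide 25: ~6 PB held in 36 million files on the SDA.
sda_avg_file_mb = average_file_size_mb(6 * PB, 36_000_000)

# Slide 21: days of peak-rate output that would add up to the 9 PB total.
days_at_peak = (9 * PB) / (12 * TB)

print(f"Peak sustained rate: {peak_mb_per_s:.1f} MB/s")
print(f"Weeks for 300K objects at 1500/week: {weeks_needed:.0f}")
print(f"Average SDA file size: {sda_avg_file_mb:.1f} MB")
print(f"Days at peak rate to reach 9 PB: {days_at_peak:.0f}")
```

Under these assumptions the peak day works out to roughly 139 MB/s sustained, and 300K objects at the floor rate would take about 200 weeks, or a little under four years.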
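
Slide 24 lists checksums as part of the technical metadata. The slides do not say which algorithm or tooling MDPI uses, so the following is only a minimal sketch of the general idea: record a checksum when a file is created, then recompute it later to confirm the file is unchanged. SHA-256 and the function names `file_checksum` and `verify_fixity` are assumptions chosen for illustration.

```python
import hashlib
import os
import tempfile


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, read in chunks so large media files fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_hex, algorithm="sha256"):
    """Recompute the checksum and compare it with the value recorded earlier."""
    return file_checksum(path, algorithm) == expected_hex.lower()


def demo():
    """Record a checksum for a stand-in file, then check it before and after a change."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"stand-in for a preservation master file")
        path = tmp.name
    try:
        recorded = file_checksum(path)          # stored as technical metadata
        intact = verify_fixity(path, recorded)  # file unchanged since recording
        with open(path, "ab") as handle:
            handle.write(b"\x00")               # simulate silent corruption
        damaged = verify_fixity(path, recorded)
        return intact, damaged
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Reading in fixed-size chunks matters at this scale, since individual preservation files can be far larger than available memory.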
