Approaches to preserving digitized taxonomic data

Sherborn Symposium. Natural History Museum, London. 28 October 2011.

Published in: Technology, Education
1 Comment
  • Nice talk Chris (I'll ignore the fact that you have 3 logos on the front page), and I appreciate CouchDB being in the mix. From when we talked about it years ago, it's come a long way, and fits into worldwide replication - it's built for it. Here's to making BHL, and the cluster, better, more complete, more available...and just more!
Notes
  • Subject is almost irrelevant when talking about preservation. The way I preserve a scanned image of a specimen is fundamentally the same as how I'd preserve an image of a manuscript. But metadata standards are important to make sure context and descriptive data are properly described.
  • Scanning is people work.
  • PDF/A does not specify the management system or archiving strategy for the file itself, so it is not a total solution. It is a good format, but it needs the other pieces described earlier to be "archival".

    1. Approaches to preserving digitized taxonomic data: Prints, manuscripts & specimens
        Chris Freeland
        Director, Center for Biodiversity Informatics
        Technical Director, Biodiversity Heritage Library
        28 October 2011
        @chrisfreeland
    2. Prints / Manuscripts / Specimens: different objects, similar management
        • http://www.flickr.com/photos/biodivlibrary/6257859557
        • http://www.flickr.com/photos/chrisfreeland/6018724034
        • http://www.biodiversitylibrary.org/page/34045915
    3. Overview of Talk
        • Why worry about digital preservation?
        • Considerations for preservation
            • Collaboration
            • File formats
            • Metadata standards
        • Views to the future
        Preservation Panic!
    4. WHY WORRY?
        http://www.flickr.com/photos/biodivlibrary/6008902662
    5. Do it once, do it right
        • Costs more to get object to scanner than to scan
    6. Cautionary Tales
        • Conversion / Compost / Corruption
        • Longevity of digital objects
        • File changes
        • Media obsolescence
    7. CONSIDERATION: COLLABORATION
    8. LOCKSS
        • Lots Of Copies Keeps Stuff Safe
        • LOCKSS is both a software platform & a concept
            • Software: http://www.lockss.org
    9. Rule of 3 (Museum X, Library Y, Archive Z)
        1. Geographic Locations
        2. Administrations
        3. Technology Platforms
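The Rule of 3 and the "lots of copies" idea on slides 8 and 9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it reads the rule as "copies should span at least three distinct geographic locations, administrations, and technology platforms", and it flags a damaged copy with a simple majority vote over SHA-256 digests. It is not the LOCKSS software or its polling protocol, and the `StoredCopy` class, its fields, and the sample values are all hypothetical.

```python
import hashlib
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCopy:
    """One stored copy of a digital object (hypothetical structure)."""
    location: str        # geographic location of the copy
    administration: str  # organization responsible for the copy
    platform: str        # storage technology holding the copy
    content: bytes       # stored bytes (a real system would read a file)


def satisfies_rule_of_three(copies):
    """True if the copies span >= 3 distinct values on every dimension."""
    return all(
        len({getattr(c, field) for c in copies}) >= 3
        for field in ("location", "administration", "platform")
    )


def damaged_copies(copies):
    """Return the copies whose SHA-256 digest differs from the most common one."""
    digests = [hashlib.sha256(c.content).hexdigest() for c in copies]
    majority, _ = Counter(digests).most_common(1)[0]
    return [c for c, d in zip(copies, digests) if d != majority]


# Made-up example: three copies, one of which has a corrupted byte.
copies = [
    StoredCopy("St. Louis", "Museum X", "disk array", b"page scan"),
    StoredCopy("London", "Library Y", "tape", b"page scan"),
    StoredCopy("Sydney", "Archive Z", "object store", b"page sc\x00n"),
]
print(satisfies_rule_of_three(copies))
print([c.administration for c in damaged_copies(copies)])
```

With three independent copies, a single corrupted file can be identified and repaired from the other two; with only two copies, a mismatch shows that something is wrong but not which copy to trust.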
    10. CONSIDERATION: FILE FORMATS
    11. JPEG2000
        • Wavelet compression, lossless encoding
        • 12 Parts
        • Of particular interest to documents & specimens:
            • Part 1: Core Coding System, ISO/IEC 15444-1
            • Part 6: Compound image file format
            • Part 10: JP3D, Volumetric images
        http://www.jpeg.org/jpeg2000/
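One practical consequence of choosing JPEG2000 is that a collection needs a way to confirm that a file really is what its extension claims. The sketch below, which is an illustration and not part of the talk, checks the leading bytes of a file against the 12-byte signature box used by JP2-family files and against the marker pair that opens a raw codestream. The function name `jpeg2000_kind` is hypothetical, and a signature check only identifies the format; it does not prove the rest of the file is intact.

```python
# Signature box that opens JP2-family files (JP2, JPX, JPM).
JP2_SIGNATURE = bytes.fromhex("0000000C6A5020200D0A870A")
# SOC marker followed by SIZ marker: start of a raw JPEG 2000 codestream.
CODESTREAM_START = bytes.fromhex("FF4FFF51")


def jpeg2000_kind(data: bytes):
    """Classify a byte string by its leading bytes; None if not JPEG 2000."""
    if data.startswith(JP2_SIGNATURE):
        return "jp2-family file"
    if data.startswith(CODESTREAM_START):
        return "raw codestream"
    return None


# Made-up inputs: a JP2 header fragment and the start of a baseline JPEG.
print(jpeg2000_kind(JP2_SIGNATURE + b"\x00\x00\x00\x14ftypjp2 "))
print(jpeg2000_kind(b"\xff\xd8\xff\xe0"))
```

In practice the first few bytes would come from something like `open(path, "rb").read(12)`, run over every file as it enters the archive.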
    12. http://www.tropicos.org/ImageFullView.aspx?imageid=62182
    13. JPEG2000 (Hurrahs & Hisses)
        • Advantages
            • Store a single file for access & preservation
            • Standards-based
            • Saves drive space (important at museum scale)
        • Disadvantages
            • Doesn’t have wide native support in many apps
            • Requires an intermediary app to decode & serve
                • But, there’s an open source option: djatoka http://djatoka.sourceforge.net
            • Reports of data loss
    14. PDF/A
        • ISO-standardized version of PDF suitable for long-term preservation
        • Identifies a "profile" for electronic documents that ensures the documents can be reproduced exactly the same way in years to come.*
        • Makes the file self-contained (and therefore larger)
            • Embeds fonts
            • Graphics
        * http://en.wikipedia.org/wiki/PDF/A
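A PDF/A file declares its conformance in its embedded XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The following sketch, offered as an illustration and not as a validator, scans raw PDF bytes for those properties and reports the level the file claims. It assumes the XMP packet is stored uncompressed, the function name `claimed_pdfa_level` is hypothetical, and a claim is not proof: confirming that fonts and graphics are actually embedded requires a dedicated PDF/A validation tool.

```python
import re

# XMP may express a property as an element (<pdfaid:part>1</pdfaid:part>)
# or as an attribute (pdfaid:part="1"); accept either form.
_PART = re.compile(rb"pdfaid:part\s*(?:=\s*[\"']|>)\s*(\d)")
_CONFORMANCE = re.compile(rb"pdfaid:conformance\s*(?:=\s*[\"']|>)\s*([A-Za-z])")


def claimed_pdfa_level(pdf_bytes: bytes):
    """Return a label such as 'PDF/A-1b' if the file claims PDF/A, else None."""
    part = _PART.search(pdf_bytes)
    if part is None:
        return None
    level = part.group(1).decode("ascii")
    conformance = _CONFORMANCE.search(pdf_bytes)
    if conformance is not None:
        level += conformance.group(1).decode("ascii").lower()
    return "PDF/A-" + level


# Made-up XMP fragments standing in for the bytes of real files.
element_form = (
    b"<pdfaid:part>1</pdfaid:part>"
    b"<pdfaid:conformance>B</pdfaid:conformance>"
)
attribute_form = b'<rdf:Description pdfaid:part="2" pdfaid:conformance="U"/>'
print(claimed_pdfa_level(element_form))
print(claimed_pdfa_level(attribute_form))
print(claimed_pdfa_level(b"%PDF-1.4 plain file"))
```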
    15. CONSIDERATION: METADATA
    16. The Great Thing About STANDARDS Is That There Are SO MANY To Choose From
    17. Metadata Preservation
        • Descriptive information (metadata) provides content & context for indexing, reuse
        • Can bundle metadata within files
            • EXIF: images, common in digital cameras
            • Adobe XMP: docs, images
        • Should commit metadata to file system
            • Should not manage just in DB or other management system
        [Diagram labels: Filesystem; <DwC> XML; JP2]
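The advice on slide 17 to commit metadata to the file system can be illustrated by writing an XML "sidecar" next to each image, so the description survives even if the database does not. The sketch below assumes the `<DwC>` label in the diagram refers to Darwin Core and uses the Darwin Core terms namespace; the `record` wrapper element, the `write_sidecar` function, the file name, and the sample values are hypothetical and are not the layout used by any particular institution.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

DWC_NS = "http://rs.tdwg.org/dwc/terms/"  # Darwin Core terms namespace
ET.register_namespace("dwc", DWC_NS)


def write_sidecar(image_path: Path, terms: dict) -> Path:
    """Write Darwin Core terms to an .xml file beside the image; return its path."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in terms.items():
        ET.SubElement(root, f"{{{DWC_NS}}}{name}").text = value
    sidecar = image_path.with_suffix(".xml")
    ET.ElementTree(root).write(str(sidecar), encoding="utf-8", xml_declaration=True)
    return sidecar


# Made-up example: a specimen image and a few descriptive terms.
with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "specimen-0001.jp2"
    image.write_bytes(b"")  # stand-in for the real image file
    sidecar = write_sidecar(image, {
        "institutionCode": "EXAMPLE",
        "catalogNumber": "0001",
        "scientificName": "Quercus alba L.",
    })
    print(sidecar.name)
    print(sidecar.read_text(encoding="utf-8"))
```

Because the sidecar shares the image's base name, ordinary file copies and replication carry the metadata along with the image without any dependency on a management system.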
    18. THE FUTURE
    19. Electronic Publications
        • Happening now, has been for years
        • Should take same care in ensuring heterogeneity & diversity in digital management systems as with printed, bound books
            • Monolithic libraries have failed over time
            • Monolithic electronic archives will, too
    20. http://www.biodiversitylibrary.org/page/22681143
        Need a meadow…
    21. … not a monoculture.
    22. There is no silver bullet
        • Make best decision today
        • Stay up with technology changes & best practices
            • <insert library & archive professionals here>
        • Evaluate, experiment, document, lead
        • Move to stable new technologies when necessary
    23. Questions?
        Chris Freeland
        Director, Center for Biodiversity Informatics
        Technical Director, Biodiversity Heritage Library
        28 October 2011
        Email: [email_address]
        Twitter: @chrisfreeland
