Large-scale digitisation options at the Natural History Museum, London.

An invited presentation to the Science Information Committee of the Natural History Museum, London, UK. November 6, 2009.

Published in: Technology, Education
  • This sums up what we (the NHM) should be about for biodiversity. Yet when it comes to our thinking on digitisation of the “collection” we are way off base…
  • Using current approaches we cannot deliver it in a timely way. Most grants and benefactors expect a return in a much shorter period: 3, 5, perhaps 10 years (e.g. EOL expects returns in 10). 500 years is 2 orders of magnitude out, and that is just for creating the digital surrogate. (Incidentally, the name “digital surrogate” isn’t marketable; we need a better name, since no one understands what a digital surrogate is.) We need a fresh approach…
  • Emphasis has been on prioritising individual specimens. Traditionally, as taxonomists, we select types. But types are not that important to anyone but taxonomists. More importantly, the act of specimen selection becomes a rate-limiting step. Transaction costs are way too high. We have the dilemma of which specimens to select, and for whom, when we cannot be certain of our audience either now or in the future.
  • As part of the fresh approach we need some standards and we need to simplify. What are the standard objects? It is around these that we can simplify and standardise the handling process. These objects are going to differ between departments, but they are what we should digitise because they minimise handling, and handling is the rate-limiting step in creating digital surrogates. With well thought out workflows and processing, digital surrogates of these objects can be created quickly. Once they are done, we don’t have to go back to a collection and pick up the parts we haven’t digitised.
  • As above
  • Barcodes can be added as required. They don’t need to be recorded. They could be picked up on the pins after the draw has been digitised.
  • Addresses parallax issues and squared distortion issues.
  • Addresses parallax issues and squared distortion issues.
  • Collection draws on a conveyor, for digitising an entire collection very quickly. Only putting the barcodes on specimens and putting the draws on the conveyor is manual. Everything else is automated.
  • Slides 1
  • Preferred slides option. This is quicker and gives better quality images. The system is based on that used for automatically scanning and screening histology slides. Some research is needed on adapting it to whole specimens.
  • Captcha is a mechanical turk process. [Explain mechanical turk]. It is a great way of getting help from a wider audience with things we cannot automate. However, scientists and other interested parties can be involved too. We can design the system to work for us. A practical example is the Herbaria@home project.
  • [Explain Herbaria@home]
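
The notes above propose barcoding specimens without recording metadata, imaging whole draws, and capturing metadata later (for example through volunteers). A minimal Python sketch of how the two streams might be tied back together by barcode follows. The class names, field names, and barcode strings are hypothetical illustrations, not part of any NHM system.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Surrogate:
    """Image of a whole draw or slide tray (hypothetical structure)."""
    image_id: str
    barcodes: List[str]  # barcodes picked up from the image after scanning


@dataclass
class MetadataRecord:
    """Metadata captured separately, at a later time (hypothetical fields)."""
    barcode: str
    fields: Dict[str, str]


def link_by_barcode(surrogates: List[Surrogate],
                    records: List[MetadataRecord]) -> Dict[str, dict]:
    """Index every imaged barcode, then attach whatever metadata exists so far."""
    index: Dict[str, dict] = {}
    for surrogate in surrogates:
        for code in surrogate.barcodes:
            index[code] = {"image_id": surrogate.image_id, "metadata": {}}
    for record in records:
        # Metadata for a barcode that has not been imaged yet is skipped here.
        if record.barcode in index:
            index[record.barcode]["metadata"].update(record.fields)
    return index


# Illustrative data only: one imaged draw, metadata for one of its specimens.
surrogates = [Surrogate("draw-001", ["NHM0001", "NHM0002"])]
records = [MetadataRecord("NHM0002", {"topic": "pollinators"})]
linked = link_by_barcode(surrogates, records)
```

In this sketch every imaged specimen has an entry whether or not anyone has described it yet, so imaging never has to wait for metadata capture.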
  • Large-scale digitisation options at the Natural History Museum, London.

    1. “We are drowning in information, while starving for wisdom. The world henceforth will be run by synthesizers, people able to put together the right information at the right time, think critically about it, and make important choices” E. O. Wilson | Harvard University
    2. 500 Years to digitise - forget it!
    3. We need some fresh thinking. Stop thinking about individual specimens (especially types)!
    4. We need to standardise & simplify. Focus on whole collections that have a standard form, e.g. draws & slides
    5. Metadata capture is rate limiting
        • Surrogate and metadata don’t need to be captured together
        • Tie them back together with identifiers as required
        • Engaging the public (mechanical turk) with metadata capture
        • It’s metadata capture we prioritise
        • We should digitise everything at low resolution
        • Low resolution images tell us 75% of what we want to know
       Separate metadata capture from digitisation. Metadata is what you prioritise
    6. Barcode Everything. No metadata recorded. We just add a standard number to specimens
    7. SatScan Collections by SmartDrive. High Resolution - Low Distortion
        • LOW COST
        • Fast to use
        • UK company
       Low tech reliable equipment
    8. 1 mm fossil zoomed in from an imaged 500x500 mm samples draw. Single moth from a 500x500 mm display case imaged in full at 1000 DPI
    9. Images are “good enough”
        • High resolution
        • No distortion
        • (No parallax or edge effects)
    10. Conveyor approach for draws. Digitise whole collections quickly. [Concept illustration: SmartDrive TrayScan System with Infeed / Outfeed Conveyor. Labelled parts: indexing conveyor, inspection module, camera, museum trays, positioning guides (flites), two loading / unloading areas. Dimensions shown: 1.2 m, 1.2 m, 1 m, 700 mm; all dimensions are approximate, do not scale.]
    11. Conveyor approach 1 for slides. [Concept illustration: SmartDrive MicroScan System with Microscope Slide Tray Loader on Gliding Framework. Labelled parts: slide tray glide mechanism, inspection module, slide trays with 'locations', fibre optic illuminator. Dimensions shown: 500 mm, 1200 mm; all dimensions are approximate, do not scale.]
    12. Conveyor approach 2 for slides
    13. Metadata capture
        • This is what we prioritise
        • Focus on topics - not taxa
        • We need to be innovative
        • Engage the public
    14. Innovative metadata capture
    15. Summary
       Guiding Principles
        1. “No specimen left behind” (for collections being digitised)
        2. Everything has a barcode (unique identifier)
        3. Universally agreed across NHM digitisation projects
       Further issues
        • This approach is not appropriate for everything, but works for most!
        • Digital storage - NHM is not thinking clearly about this
        • Separating images of specimens from the main image
        • Moving specimens (see above)
        • Adding new specimens (temporary draws digitised)
        • Private or EU funding (it’s not science, however…)
            • Research on issues (handling, processing, storage)
            • Public / wider engagement - mechanical turk aspects
       Much less than 500 Years! We need to build up the WHY case - what are the benefits
    16. “Why Case” - Examples
        • Supporting the monitoring of environmental change…
        • Supporting biodiversity conservation…
        • Supporting the sustainable economic use of biodiversity…
        • Supporting the preservation and use of biological collections…
        • Supporting biodiversity research communities and networks…
        • Supporting education and training activities…
       We should prioritise metadata capture around these issues. Where possible we should digitise (create surrogates of) everything. Controversially, I think digitisation should EVENTUALLY be part of our core business. Metadata capture is NOT part of our core business (it is externally fundable). Also, metadata capture is hard to predict - see E. O. Wilson’s quote
    17. “We are drowning in information, while starving for wisdom. The world henceforth will be run by synthesizers, people able to put together the right information at the right time, think critically about it, and make important choices” E. O. Wilson | Harvard University
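
Slide 8 quotes a 500x500 mm draw imaged in full at 1000 DPI, and slide 15 flags digital storage as an open issue. As a rough worked example, the Python sketch below converts those figures into pixel counts. The function names are hypothetical, and the storage figure assumes uncompressed 24-bit RGB (3 bytes per pixel), which is an assumption made here rather than anything stated in the slides.

```python
MM_PER_INCH = 25.4


def scan_pixels(width_mm: float, height_mm: float, dpi: int) -> tuple:
    """Pixel dimensions of a scan of the given physical size at the given DPI."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


def pixels_across(feature_mm: float, dpi: int) -> float:
    """How many pixels span a feature of the given size."""
    return feature_mm / MM_PER_INCH * dpi


width_px, height_px = scan_pixels(500, 500, 1000)
total_pixels = width_px * height_px

# Assumption: uncompressed 24-bit RGB, i.e. 3 bytes per pixel.
raw_bytes = total_pixels * 3

print(f"{width_px} x {height_px} px, {total_pixels / 1e6:.0f} megapixels")
print(f"1 mm feature spans about {pixels_across(1, 1000):.0f} px")
print(f"about {raw_bytes / 1e9:.2f} GB uncompressed per draw")
```

Under these assumptions a single draw comes to roughly 19685 x 19685 pixels (about 387 megapixels), a 1 mm specimen spans about 39 pixels, and the raw image is a little over 1 GB before compression.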
