Price2 ecn2013

Published in: Technology, Business

  1. 1. Rapid, industrial scale digitization of the NHM microscope slide collection Ben Price & Vladimir Blagoderov
  2. 2. Outline • The NHM slide collection • What is Digitization? • The NHM workflow • Psyllid collection • Future prospects
  3. 3. The NHM slide collection • ~ 2 million slides (60 : 40 vertical : horizontal storage)
  4. 4. The NHM slide collection • Mix of slide sizes, mounts, storage cabinets
  5. 5. What is Digitization? ?
  6. 6. What is Digitization? Label data: – Quick to image • 5000 per day – Slow to transcribe (crowdsourcing) – Slow to georeference (crowdsourcing)
  7. 7. What is Digitization? Specimen: – Slow to image • 100,000 per year – Data storage • GB images – Image delivery • Proprietary software
  8. 8. The NHM workflow* Preparation Handling Imaging Post Processing Data Capture * Work in progress
  9. 9. Preparation Handling Imaging Post Processing Data Capture • Datamatrix Labels (4.5mm) • Processing Scripts (GIMP, Barcodefiler) • Computing Facilities (64bit, 16GB RAM) • Storage & Retrieval (Ke-EMu) – What is a slide? • Delivery (NHM data portal)
  10. 10. Preparation Handling Imaging Post Processing Data Capture • Horizontal vs Vertical storage • Card Slide covers! • Labelling & Handling = up to 90% of the time
  11. 11. Preparation Handling Imaging Post Processing Data Capture • Scanner – SLR – Mamiya Leaf – SatScanner • Balance slides per image vs label resolution (PPI) • Single slide imaging?
  12. 12. Preparation Handling Imaging Post Processing Data Capture Horizontal Storage: • Less handling • Tray fits A3 scanner / SLR – Can be autocropped
  13. 13. Preparation Handling Imaging Post Processing Data Capture Horizontal Storage: • Less handling • Tray fits A3 scanner / SLR – Manual cropping – Crowd cropping?
  14. 14. Preparation Handling Imaging Post Processing Data Capture Vertical storage: • Single type of template (post processing) • High contrast (scripts) • Cheap (foam, card) • More Handling • Autocropping
  15. 15. Preparation Handling Imaging Post Processing Data Capture • Resolution tests (PPI) – Canon 650D (18MP sensor) + 50mm Macro • PPI: 250, 300, 450, 600 • Slides: 72, 45, 18, 10
  16. 16. Preparation Handling Imaging Post Processing Data Capture • Resolution tests (PPI) – Mamiya Leaf (80MP sensor) + 80mm lens • PPI: 300, 450, 600 • Slides: 180, 72, 50
  17. 17. Preparation Handling Imaging Post Processing Data Capture • Resolution tests (PPI) – HerbScanner (EPSON A3 size) • PPI: 300, 450, 600 • Slides: 50, 50, 50
  18. 18. Preparation Handling Imaging Post Processing Data Capture • Resolution tests (PPI) – SatScanner (0.16x lens, low resolution ~1000 PPI) • Slides: 72 - 100
  23. 23. Progress to date • Psyllidae slide collection (4000 slides) • Two digitizers + SatScanner = 4 days • Handling (not Imaging) is the bottleneck • Solutions: – More digitizers
  24. 24. Progress to date • Theoretical maximum – 1 SatScan: 7000 slides per day (5-8 people) [Diagram: staggered label, load, image, unload cycles running in parallel] – Other: 700 - 1000 slides per person per day
  25. 25. Future Plans • Specimen Imaging – Type material
  26. 26. Acknowledgments Peter Johanna Lyndsey Sara Flavia Elisa
  27. 27. Questions?
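
The resolution tests on slides 15 to 18 trade the number of slides per image against label resolution (PPI). A rough sketch of that arithmetic follows. It is not the authors' method: the function name is hypothetical, the 5184 x 3456 pixel frame is an assumed size for an 18MP sensor, and the 3 x 1 inch slide is an assumed standard size (slide 4 notes that the collection mixes sizes). It ignores gaps between slides and tray edges, so it gives an upper bound only.

```python
# Hypothetical helper: upper bound on slides per frame at a target label PPI.
# Sensor and slide dimensions are assumptions, not values from the slides.

def slides_per_frame(width_px, height_px, ppi, slide_in=(3, 1)):
    """Most whole slides that tile one frame, ignoring gaps between slides."""
    long_px = slide_in[0] * ppi   # slide long edge in pixels at this PPI
    short_px = slide_in[1] * ppi  # slide short edge in pixels at this PPI
    # Try slides laid along the frame width, then rotated by 90 degrees.
    along = (width_px // long_px) * (height_px // short_px)
    rotated = (width_px // short_px) * (height_px // long_px)
    return max(along, rotated)


if __name__ == "__main__":
    # Assumed 18MP frame of 5184 x 3456 pixels.
    for ppi in (250, 300, 450, 600):
        print(ppi, slides_per_frame(5184, 3456, ppi))
```

Because tray margins and spacing are left out, the result is best read as a ceiling on slides per image rather than a prediction.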

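The throughput figures on slides 3, 6, 23 and 24 can be combined into a back-of-envelope estimate of how long the whole collection would take. This is a sketch under strong assumptions: a constant daily rate, the full ~2 million slides, and no downtime. The function and constant names are hypothetical; only the rates come from the slides.

```python
# Back-of-envelope estimate; names are hypothetical, rates are from the slides.

def working_days(total_slides, slides_per_day):
    """Whole working days needed at a constant daily rate, rounded up."""
    return -(-total_slides // slides_per_day)  # ceiling division on integers


COLLECTION = 2_000_000    # "~ 2 million slides" (slide 3)
PILOT_RATE = 4000 // 4    # Psyllidae pilot: 4000 slides in 4 days (slide 23)
LABEL_RATE = 5000         # label imaging, per day (slide 6)
SATSCAN_MAX = 7000        # theoretical maximum per day (slide 24)

if __name__ == "__main__":
    for name, rate in [("pilot", PILOT_RATE),
                       ("label imaging", LABEL_RATE),
                       ("SatScan maximum", SATSCAN_MAX)]:
        print(name, rate, "slides/day:", working_days(COLLECTION, rate), "days")
```

The pilot rate covers two digitizers together, so the per-person rate implied by slide 23 is half of `PILOT_RATE`.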