Rapid, industrial scale digitization of the NHM microscope slide collection
Ben Price & Vladimir Blagoderov
Transcript of "Price2 ecn2013"

  1. Rapid, industrial scale digitization of the NHM microscope slide collection
     Ben Price & Vladimir Blagoderov
  2. Outline
     • The NHM slide collection
     • What is Digitization?
     • The NHM workflow
     • Psyllid collection
     • Future prospects
  3. The NHM slide collection
     • ~2 million slides (60:40 vertical:horizontal storage)
  4. The NHM slide collection
     • Mix of slide sizes, mounts, storage cabinets
  5. What is Digitization?
  6. What is Digitization?
     Label data:
     – Quick to image
       • 5000 per day
     – Slow to transcribe (crowdsourcing)
     – Slow to georeference (crowdsourcing)
  7. What is Digitization?
     Specimen:
     – Slow to image
       • 100,000 per year
     – Data storage
       • GB images
     – Image delivery
       • Proprietary software
  8. The NHM workflow*
     Preparation, Handling, Imaging, Post Processing, Data Capture
     * Work in progress
  9. Preparation, Handling, Imaging, Post Processing, Data Capture
     • Datamatrix Labels (4.5 mm)
     • Processing Scripts (GIMP, Barcodefiler)
     • Computing Facilities (64-bit, 16 GB RAM)
     • Storage & Retrieval (Ke-EMu)
       – What is a slide?
     • Delivery (NHM data portal)
  10. Preparation, Handling, Imaging, Post Processing, Data Capture
      • Horizontal vs Vertical storage
      • Card Slide covers!
      • Labelling & Handling = up to 90% of the time
  11. Preparation, Handling, Imaging, Post Processing, Data Capture
      • Scanner – SLR – Mamiya Leaf – SatScanner
      • Balance slides per image vs label resolution (PPI)
      • Single slide imaging?
  12. Preparation, Handling, Imaging, Post Processing, Data Capture
      Horizontal Storage:
      • Less handling
        – Tray fits A3 scanner / SLR
      • Can be autocropped
  13. Preparation, Handling, Imaging, Post Processing, Data Capture
      Horizontal Storage:
      • Less handling
        – Tray fits A3 scanner / SLR
      • Manual cropping
        – Crowd cropping?
  14. Preparation, Handling, Imaging, Post Processing, Data Capture
      Vertical storage:
      • Single type of template (post processing)
      • High contrast (scripts)
      • Cheap (foam, card)
      • More Handling
      • Autocropping
  15. Preparation, Handling, Imaging, Post Processing, Data Capture
      • Resolution tests (PPI)
        – Canon 650D (18MP sensor) + 50mm Macro
          PPI:    250  300  450  600
          Slides:  72   45   18   10
  16. Preparation, Handling, Imaging, Post Processing, Data Capture
      • Resolution tests (PPI)
        – Mamiya Leaf (80MP sensor) + 80mm lens
          PPI:    300  450  600
          Slides: 180   72   50
  17. Preparation, Handling, Imaging, Post Processing, Data Capture
      • Resolution tests (PPI)
        – HerbScanner (EPSON A3 size)
          PPI:    300  450  600
          Slides:  50   50   50
  18. Preparation, Handling, Imaging, Post Processing, Data Capture
      • Resolution tests (PPI)
        – SatScanner (0.16x lens, low resolution ~1000 PPI)
          Slides: 72 - 100
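
The trade-off in slides 15 to 18 (more slides per image leaves fewer pixels per label) can be sketched with simple geometry. This is a rough illustration, not the authors' method. It assumes a standard 3 × 1 inch (about 75 × 25 mm) slide, a Canon 650D frame of 5184 × 3456 pixels, and slides packed edge to edge with no gaps; the function name `slides_per_frame` is hypothetical. Because it ignores tray spacing and the mixed slide sizes in the collection, it gives only an upper bound.

```python
import math


def slides_per_frame(px_w, px_h, ppi, slide_w_in=3.0, slide_h_in=1.0):
    """Upper bound on slides that fit in one image at a given PPI.

    Assumes a rectangular grid of identical slides with no gaps
    (an idealisation; real trays need spacing).
    """
    # Field of view in inches: pixels divided by pixels per inch.
    fov_w = px_w / ppi
    fov_h = px_h / ppi

    def grid(w, h):
        # Whole slides along each axis, multiplied together.
        return math.floor(fov_w / w) * math.floor(fov_h / h)

    # Try both slide orientations and keep the better packing.
    return max(grid(slide_w_in, slide_h_in), grid(slide_h_in, slide_w_in))


# Assumed frame size for the 18MP Canon 650D.
CANON_650D = (5184, 3456)

for ppi in (250, 300, 450, 600):
    print(ppi, slides_per_frame(*CANON_650D, ppi))
```

Under these assumptions the bound for the 18MP frame is 80, 55, 22 and 10 slides at 250, 300, 450 and 600 PPI, against the 72, 45, 18 and 10 slides reported in the test.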
  19. Preparation, Handling, Imaging, Post Processing, Data Capture
  20. Preparation, Handling, Imaging, Post Processing, Data Capture
  21. Preparation, Handling, Imaging, Post Processing, Data Capture
  22. Preparation, Handling, Imaging, Post Processing, Data Capture
  23. Progress to date
      • Psyllidae slide collection (4000 slides)
      • Two digitizers + SatScanner = 4 days
      • Handling (not Imaging) is the bottleneck
      • Solutions:
        – More digitizers
  24. Progress to date
      • Theoretical maximum
        – SatScan: 7000 slides per day (5-8 people)
          [Diagram: staggered label, load, image, unload cycles across stations 1 to 4]
        – Other: 700 - 1000 slides per person per day
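
The figures on the two "Progress to date" slides imply some simple throughput arithmetic, sketched below. All inputs come from the slides: 4000 slides done by two digitizers in 4 days, a theoretical 7000 slides per day, and ~2 million slides in the collection. Treating "~2 million" as exactly 2,000,000, assuming a constant daily rate, and the helper name `working_days` are assumptions made for illustration.

```python
import math

# "~2 million slides", treated as exact for this estimate (assumption).
COLLECTION_SIZE = 2_000_000


def working_days(total_slides, slides_per_day):
    """Whole working days needed at a constant daily rate."""
    return math.ceil(total_slides / slides_per_day)


# Psyllidae pilot: 4000 slides, two digitizers, 4 days.
pilot_rate_per_day = 4000 / 4                    # slides per day, whole team
pilot_rate_per_person = pilot_rate_per_day / 2   # slides per person per day

# Time to digitize the whole collection at each rate.
days_at_pilot_rate = working_days(COLLECTION_SIZE, pilot_rate_per_day)
days_at_satscan_max = working_days(COLLECTION_SIZE, 7000)

print(pilot_rate_per_person, days_at_pilot_rate, days_at_satscan_max)
```

The pilot works out to 500 slides per person per day. At that team rate the whole collection would take about 2000 working days, and at the theoretical SatScan maximum about 286.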
  25. Future Plans
      • Specimen Imaging
        – Type material
  26. Acknowledgments
      Peter, Johanna, Lyndsey, Sara, Flavia, Elisa
  27. Questions?