Presentation DETA@SPNHC2019: Software for fast high quality transcription of digitized (herbarium) specimens

  1. Software for Fast High Quality Transcription of Digitized (Herbarium) Specimens Frank Veldhuizen The Netherlands and Suriname “Making the Case for Natural History Collections” As more effort and resources are spent on digitizing collections and making them available to an ever-expanding audience, we feel it becomes more and more important to explain what museum collections are, how we preserve them, and, most importantly, why we have these collections and why they matter.
  2. Challenges in Transcribing Large Collections - Collections usually contain millions of herbarium sheets - Digitizing the collections is a big effort - Transcribing the digitized collection is an enormous effort - In duration - In human effort - In consistent quality - Costs are significant
  3. Data Entry and Transcription Application - SaaS application built in Angular: browser-based input, extremely light on computer resources, and user-friendly - High-speed transcribing: > 60 sheets per hour - Multi-level Quality Control - Utilizes existing look-up tables - Results in high-quality input: > 99% correct - Output in all types of modern formats - CSV - XLS - DBA
  4. DETA: Data Entry and Transcription Application
  5. Executed Projects Naturalis, The Netherlands - 3.000.000 sheets transcribed - Start in September 2013 - Finish in May 2015 - Transcription of: - Full Taxon - Collector info: collector, number, date - Location info: location, country, coordinates - 60 transcriber staff at Alembo - Quality Control and project management by Picturae and Naturalis Oslo/Trondheim - 450.000 sheets transcribed - Start in 2016 - Finish in 2017 - Transcription of: - Full Taxon Genus and Species - Collector info: collector, date - Location info: location, country, coordinates - 30 transcriber staff at Alembo
  6. Executed Projects / some examples ● Plantentuin Meise, Belgium: 600.000 sheets ● Genève, Switzerland: 98.000 sheets ● Lyon, France: 175.000 sheets ● Montpellier, France: 700.000 sheets ● Luxembourg: 45.000 sheets ● Oslo: 122.000 sheets ● Denmark: 29.000 sheets ● Kew Gardens: 120.000 sheets ● KPZ: 1.510.000 sheets
  7. Current Projects The Smithsonian Institution – 1.000.000 scans (two projects) – 700.000 covers to be transcribed – Transcription of: • Full Taxon • Collector info: collector, number, date • Location info: location, country, coordinates – Duration 2-3 years – 15 transcriber staff at Alembo Royal Botanic Garden Sydney, Australia – 700.000 sheets – Start May 2019 – Transcription of: • Full Taxon Genus and Species • Collector info: collector, date • Location info: location, country, coordinates – Duration 2 years – 15 transcriber staff at Alembo
  8. Workflow with multi-level, two-step Quality Control [Diagram; labels: Transcribing · Quality Control · Internal Workflow and Control · Transcribers · Quality Control · First Independent Control · Accepted Batches · Rejected Batches · Feedback · Quality Control · Database · Approved Batches]
  9. When a high level of input quality is of the essence, DETA provides awesome quality monitoring tools: - Multiple levels of control can be implemented - This allows for control by independent parties and at multiple organisation levels - A specific (and random) sample size can be taken - This allows the control percentage to be increased or decreased based on delivered quality - In practice, a higher percentage of the transcribed herbarium sheets is controlled at the start, and the level of control can be reduced during the transcription process - Control per input field is possible - This allows more focus on important fields - Quality can be monitored per person - This allows for specific training in case of consistent errors
  10. Live Transcription Link to DETA demo
  11. When to utilize DETA and/or Alembo Data Entry and Transcribing Application - Transcribing large collections - In a predictable time period - When high quality is required - Easy to use, low cost Alembo - When advice is welcome - Within a defined period - Professional transcribers - High quality DETA Licence - Commercial application - Continuous development - Implementation fee - Licence fee per user per month
  12. Thank you. Questions?
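
The sampled quality control described on slide 9 (random sample, adjustable control percentage, control per input field) can be sketched in Python. This is only an illustration under stated assumptions, not DETA's implementation: the function names `draw_sample`, `field_error_rates` and `batch_accepted`, the record layout, the example values, and the 1% error threshold (chosen to mirror the "> 99% correct" figure on slide 3) are all hypothetical.

```python
import random
from collections import Counter

# Hypothetical record layout: each transcribed sheet is a dict of field -> value.
FIELDS = ("taxon", "collector", "country")


def draw_sample(sheets, fraction, seed=None):
    """Return a random sample holding `fraction` of the batch (at least one sheet)."""
    rng = random.Random(seed)
    size = max(1, round(len(sheets) * fraction))
    return rng.sample(sheets, size)


def field_error_rates(sample, reference):
    """Compare sampled sheets with a controller's reference values, per field.

    `reference` maps sheet id -> dict of the values the controller considers correct.
    """
    errors = Counter()
    for sheet in sample:
        expected = reference[sheet["id"]]
        for name in FIELDS:
            if sheet[name] != expected[name]:
                errors[name] += 1
    return {name: errors[name] / len(sample) for name in FIELDS}


def batch_accepted(rates, max_error_rate=0.01):
    """Accept the batch only if every field stays at or below the allowed error rate."""
    return all(rate <= max_error_rate for rate in rates.values())


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    batch = [
        {"id": i, "taxon": "Quercus robur", "collector": "A. Smith", "country": "NL"}
        for i in range(200)
    ]
    reference = {s["id"]: {name: s[name] for name in FIELDS} for s in batch}
    sample = draw_sample(batch, fraction=0.10, seed=1)  # control 10% of the batch
    rates = field_error_rates(sample, reference)
    print(rates, batch_accepted(rates))
```

Lowering `fraction` once delivered quality is stable corresponds to reducing the level of control during the project. Per-person monitoring could be built the same way by counting errors per transcriber instead of per field.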

Editor's Notes

  1. Preferably a herbarium screenshot
  2. Screenshot of the sidebar
  3. --- Processing --- Assign batches (batches can also be assigned to users automatically if this has been configured in the user settings) Process a batch as a normal Operator Check a batch as a Controller Export a batch --- Reports --- Show production report Show control report
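
The demo steps in note 3 (assign, process, check, export), together with the accept/reject/feedback loop on slide 8, suggest a simple batch lifecycle. The sketch below is hypothetical: the `State` names, the `Batch` class and its methods are invented for illustration and are not taken from DETA, and the several control levels are collapsed into a single `check` step.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class State(Enum):
    ASSIGNED = "assigned"        # batch handed to an operator
    TRANSCRIBED = "transcribed"  # operator has finished data entry
    REJECTED = "rejected"        # controller sent it back with feedback
    ACCEPTED = "accepted"        # controller approved it
    EXPORTED = "exported"        # delivered, e.g. as CSV


@dataclass
class Batch:
    batch_id: int
    operator: str
    state: State = State.ASSIGNED
    feedback: List[str] = field(default_factory=list)

    def process(self) -> None:
        """Operator transcribes a new batch or reworks a rejected one."""
        if self.state not in (State.ASSIGNED, State.REJECTED):
            raise ValueError(f"cannot process a batch in state {self.state.name}")
        self.state = State.TRANSCRIBED

    def check(self, ok: bool, note: Optional[str] = None) -> None:
        """Controller accepts the batch or rejects it with feedback."""
        if self.state is not State.TRANSCRIBED:
            raise ValueError(f"cannot check a batch in state {self.state.name}")
        if ok:
            self.state = State.ACCEPTED
        else:
            self.state = State.REJECTED
            if note:
                self.feedback.append(note)

    def export(self) -> None:
        """Only accepted batches may be exported."""
        if self.state is not State.ACCEPTED:
            raise ValueError(f"cannot export a batch in state {self.state.name}")
        self.state = State.EXPORTED
```

A rejected batch goes back through `process` before it can be checked again, which mirrors the "Rejected Batches / Feedback" arrows in the workflow diagram.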