Software for Fast High Quality Transcription of
Digitized (Herbarium) Specimens
Frank Veldhuizen
www.alembo.nl
The Netherlands and Suriname
“Making the Case for Natural History Collections”
As more effort and resources are spent on digitizing
collections and making them available to an ever
expanding audience we feel it becomes more and more
important to explain what museum collections are, how
we preserve them, and most importantly why we have
these collections and why they matter.
Challenges in Transcribing
Large Collections
- Collections usually contain millions of
herbarium sheets
- Digitizing the collections is a big effort
- Transcribing the digitized collection is
an enormous effort
- In duration
- In human effort
- In consistent quality
- Costs are significant
Data Entry and Transcription Application
- SaaS application built in Angular: browser-based input,
extremely light on computer resources, and user-friendly
- High speed transcribing > 60 sheets per hour
- Multi Level Quality Control
- Utilizing existing look-up tables
- Resulting in high quality input >99% correct
- Output in all types of modern formats
- CSV
- XLS
- DBA
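The quoted speed lends itself to a quick capacity estimate. The sketch below is rough arithmetic only: it takes the 60 sheets per hour from the slide as the rate, while the 40-hour working week, the function names, and the example collection size are assumptions made for illustration. It ignores quality control, rework, and part-time staffing.

```python
import math


def transcriber_hours(sheets: int, sheets_per_hour: float = 60.0) -> float:
    """Hours of pure transcription time at a given rate (60/hour is the slide's figure)."""
    if sheets_per_hour <= 0:
        raise ValueError("sheets_per_hour must be positive")
    return sheets / sheets_per_hour


def calendar_weeks(sheets: int, staff: int,
                   sheets_per_hour: float = 60.0,
                   hours_per_week: float = 40.0) -> int:
    """Whole weeks needed when `staff` transcribers work in parallel.

    hours_per_week is an assumed schedule, not a number from the presentation.
    """
    if staff <= 0 or hours_per_week <= 0:
        raise ValueError("staff and hours_per_week must be positive")
    total_hours = transcriber_hours(sheets, sheets_per_hour)
    return math.ceil(total_hours / (staff * hours_per_week))


if __name__ == "__main__":
    # Hypothetical collection of 600,000 sheets handled by 15 transcribers.
    print(transcriber_hours(600_000))   # 10000.0 hours
    print(calendar_weeks(600_000, 15))  # 17 weeks
```

Such an estimate is best treated as a floor on elapsed time, since every control step and every rejected batch adds work on top of it.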
DETA:
Data Entry and Transcription Application
Executed Projects
Naturalis The Netherlands
- 3.000.000 sheets transcribed
- Start in September 2013
- Finish in May 2015
- Transcription of:
- Full Taxon
- collector info: collector, number, date
- Location info: location, country,
coordinates
- 60 transcriber staff at Alembo
- Quality control and project management by
Picturae and Naturalis
Oslo/ Trondheim
- 450.000 sheets transcribed
- Start in 2016
- Finish in 2017
- Transcription of:
- Full Taxon Genus and Species
- collector info: collector, date
- Location info: location, country,
coordinates
- 30 transcriber staff at Alembo
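The fields listed for these projects (taxon, collector information, location information) suggest a simple record per sheet. The following is a minimal sketch of such a record and a CSV export, CSV being one of the output formats named earlier. The class name, field names, and sample values are hypothetical and are not taken from DETA.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Iterable, Optional


@dataclass
class SheetRecord:
    """One transcribed herbarium sheet (field names are illustrative assumptions)."""
    sheet_id: str
    taxon: str
    collector: str
    collector_number: Optional[str]
    collection_date: Optional[str]
    location: str
    country: str
    latitude: Optional[float]
    longitude: Optional[float]


def to_csv(records: Iterable[SheetRecord]) -> str:
    """Serialize records to CSV text with a header row; None becomes an empty cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(SheetRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    example = SheetRecord("S-0001", "Genus species", "A. Collector", "123",
                          None, "Example town", "Exampleland", None, None)
    print(to_csv([example]))
```

Keeping optional fields as `None` rather than empty strings makes it easier to tell "not on the label" apart from "not yet transcribed" before export.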
Executed Projects / some examples
● Plantentuin Meise, Belgium: 600.000 sheets
● Genève, Switzerland: 98.000 sheets
● Lyon, France: 175.000 sheets
● Montpellier, France: 700.000 sheets
● Luxembourg: 45.000 sheets
● Oslo: 122.000 sheets
● Denmark: 29.000 sheets
● Kew Gardens: 120.000 sheets
● KPZ: 1.510.000 sheets
Current Projects
The Smithsonian Institution
– 1.000.000 scans (two projects)
– 700.000 covers to be transcribed
– Transcription of:
• Full Taxon
• collector info: collector, number, date
• Location info: location, country,
coordinates
– Duration 2-3 years
– 15 transcriber staff at Alembo
Royal Botanic Garden Sydney, Australia
– 700.000 sheets
– Start May 2019
– Transcription of:
• Full Taxon Genus and Species
• collector info: collector, date
• Location info: location, country, coordinates
– Duration 2 years
– 15 transcriber staff at Alembo
Workflow with Multi-Level, Two-Step Quality
Control
[Workflow diagram: Transcribing, followed by Quality Control steps (internal workflow and control by transcribers; first independent control) leading to the Database. Labels at each control step: Accepted Batches, Rejected Batches (with Feedback). Final label: Approved Batches.]
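One way to read this workflow is as a small state machine in which a batch passes an internal control and then an independent control, and any rejection sends it back to transcription with feedback. The sketch below follows that reading. The stage names, the class, and the exact transitions are assumptions drawn from the diagram labels, not DETA's actual implementation.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List


class Stage(Enum):
    TRANSCRIBING = auto()
    INTERNAL_QC = auto()      # control inside the transcribing team
    INDEPENDENT_QC = auto()   # first independent control
    APPROVED = auto()         # ready for the database


@dataclass
class Batch:
    batch_id: str
    stage: Stage = Stage.TRANSCRIBING
    feedback: List[str] = field(default_factory=list)

    def submit(self) -> None:
        """Hand a transcribed batch over to the internal control step."""
        if self.stage is not Stage.TRANSCRIBING:
            raise ValueError(f"cannot submit a batch in stage {self.stage.name}")
        self.stage = Stage.INTERNAL_QC

    def review(self, accepted: bool, note: str = "") -> None:
        """Record a control decision; a rejection returns the batch with feedback."""
        if self.stage not in (Stage.INTERNAL_QC, Stage.INDEPENDENT_QC):
            raise ValueError(f"no review pending in stage {self.stage.name}")
        if not accepted:
            self.feedback.append(note)
            self.stage = Stage.TRANSCRIBING
        elif self.stage is Stage.INTERNAL_QC:
            self.stage = Stage.INDEPENDENT_QC
        else:
            self.stage = Stage.APPROVED


if __name__ == "__main__":
    batch = Batch("B-001")
    batch.submit()
    batch.review(False, "check collector names")  # rejected, back to transcribing
    batch.submit()
    batch.review(True)   # internal control passed
    batch.review(True)   # independent control passed
    print(batch.stage.name, batch.feedback)
```

Keeping the feedback on the batch itself means a transcriber who reopens a rejected batch sees why it came back.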
When a high level of quality input is of
the essence
DETA provides awesome quality monitoring tools:
- Multiple levels of control can be implemented
- This allows for control by independent parties and at multiple organisational levels
- A specific (and random) sample size can be taken
- This allows the control percentage to be increased or decreased based on
delivered quality
- In practice, a higher percentage of the transcribed herbarium
sheets is controlled at the start, and the level of control can be
reduced as the transcription progresses.
- Control per input field is possible
- This allows more focus on important fields
- Quality can be monitored per person
- This allows for specific training in case of consistent errors
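The sampling ideas above can be sketched in a few lines: draw a random sample from a batch, compute an error rate per input field, and adjust the control percentage according to delivered quality. The 1% error target mirrors the ">99% correct" figure quoted earlier. The halving and doubling rule, the 5% floor, and all function names are assumptions for illustration, not DETA's actual logic.

```python
import random
from collections import Counter
from typing import Dict, Iterable, List, Optional


def draw_sample(sheet_ids: Iterable[str], control_fraction: float,
                seed: Optional[int] = None) -> List[str]:
    """Pick a random subset of sheets to control (at least one if any exist)."""
    if not 0 < control_fraction <= 1:
        raise ValueError("control_fraction must be in (0, 1]")
    ids = list(sheet_ids)
    if not ids:
        return []
    size = max(1, round(len(ids) * control_fraction))
    return random.Random(seed).sample(ids, size)


def field_error_rates(checked: Iterable[Dict[str, bool]]) -> Dict[str, float]:
    """Error rate per input field; each record maps field name -> True if correct."""
    totals: Counter = Counter()
    errors: Counter = Counter()
    for record in checked:
        for name, is_correct in record.items():
            totals[name] += 1
            if not is_correct:
                errors[name] += 1
    return {name: errors[name] / totals[name] for name in totals}


def next_control_fraction(current: float, error_rate: float,
                          target: float = 0.01,
                          floor: float = 0.05, ceiling: float = 1.0) -> float:
    """Assumed rule: halve the control level when quality meets the target, else double it."""
    proposed = current / 2 if error_rate <= target else current * 2
    return min(ceiling, max(floor, proposed))


if __name__ == "__main__":
    sample = draw_sample([f"sheet-{i}" for i in range(200)], 0.10, seed=1)
    print(len(sample))                         # 20 sheets to control
    print(next_control_fraction(0.20, 0.005))  # quality on target: 0.1
    print(next_control_fraction(0.20, 0.03))   # too many errors: 0.4
```

Grouping the checked records by transcriber before calling `field_error_rates` gives the per-person view, which can then point to the fields where specific training is needed.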
Live Transcription
Link to DETA
Demo
When to utilize DETA and/or Alembo
Data Entry and Transcription Application (DETA)
- Transcribing large collections
- In a predictable time period
- When high quality is required
- Easy to use, low cost
Alembo
- When advice is welcome
- Within a defined period
- Professional transcribers
- High quality
DETA Licence
- Commercial application
- Continuous developments
- Implementation fee
- Licence fee per user per month
Thank you
Questions
?
www.alembo.nl

Presentation DETA@SPNHC2019: Software for fast high quality transcription of digitized (herbarium) specimens


Editor's Notes

  1. Preferably a herbarium screenshot
  2. Screenshot of the sidebar
  3. --- Processing --- Assign batches (batches can also be assigned to users automatically if this has been configured in the user settings). Process a batch as a normal Operator. Check a batch as a Controller. Export a batch. --- Reports --- Show the production report. Show the control report.