1) The Biodiversity Heritage Library (BHL) contains over 100,000 books and 39 million digitized pages of biodiversity literature from the 18th-21st centuries.
2) BHL provides structured metadata and unstructured OCR text for searching and data mining, available through APIs, a user interface, and data exports.
3) A key challenge for BHL has been accurately extracting the scientific names of organisms from historical literature; BHL has addressed this with algorithms such as TaxonFinder and through new collaborations.
4) BHL is now exploring the gamification of tasks such as OCR correction and image identification to engage users in improving data quality and extracting additional information from