All creatures great and small:  metadata for biodiversity illustrations

This is a talk proposed by Gaurav Vaidya (University of Colorado Boulder), William Ulate (Missouri Botanical Garden), Robert Guralnick (University of Colorado Boulder), and Trish Rose-Sandler (Missouri Botanical Garden)

The Art of Life project is developing a metadata schema for describing and improving access to the natural history images contained in the 38 million pages digitized by the Biodiversity Heritage Library. These images paint a vibrant picture of Europeans’ first encounters with exotic plants and animals in the 17th and 18th centuries, drawn by some of the finest illustrators in the world. They also provide valuable documentation of when, where, and who first observed a species. We present our preliminary schema, which accommodates a variety of delivery systems, including Flickr and Wikimedia Commons, and varying levels of information detail.

  • This project aims to develop software tools and a metadata schema for visual resources contained within the scanned literature made available through BHL digitization activities.
  • The Biodiversity Heritage Library (BHL) is a consortium of natural history and botanical libraries that cooperate to digitize the legacy literature of biodiversity held in their collections and to make that literature available for open access and responsible use as part of a global “biodiversity commons.” One of our primary audiences is taxonomists, who use BHL to find the first occurrence of a species name in the historic literature and to track how that name has changed over time. What began as a consortium in the US and UK is now an increasingly global effort: there are now BHL nodes in China, Australia, Egypt, Europe, and Brazil, and soon in Africa. The BHL US/UK operation is made up of 6 full-time staff located at the Smithsonian Libraries and Missouri Botanical Garden, plus contributions from staff at the member institutions, which allow a certain percentage of staff time for work on BHL (this comes out to a little over 16 FTEs from the 14 member institutions). Missouri Botanical Garden has been involved in the BHL both as a data contributor (our library has digitized books and journals for the project) and as the home for the technical development of the project.
  • At this point we have a critical mass of content: over 57 thousand titles, 108 thousand volumes, and almost 40 million pages. We have a portal where we serve all BHL content at biodiversitylibrary.org. All BHL data is open access, and we provide data in a variety of exports. We encourage digital library aggregators to incorporate our records into their portals and library catalogs. We provide data in ways that allow it to be mined and recontextualized.
  • Problem statement: Art of Life evolved out of a need expressed by BHL users. We had a critical mass of content online, and BHL users knew there were amazing images within the BHL pages, but there was no easy way to find them other than opening a BHL book or volume and scrolling page by page to find illustrations. There is no descriptive metadata attached to an illustration that would tell you the content of the image, the date it was created, or who was involved in its creation.
  • We have created a BHL account on Flickr and pushed over 48,000 images so far, but this is a very manual process that takes considerable staff time.
  • This is the Art of Life workflow diagram, which identifies the 4 stages the illustrations will move through. The Extract stage is where BHL pages will be run through an algorithm to identify which pages contain illustrations, whether they are full plates or only a section of the page. At the Classify stage, the pages with illustrations will be tagged by Art of Life staff as being one or several broad types such as drawing/painting, photograph, diagram, and even map. For the Describe stage, the illustrations will be pushed into platforms such as Flickr and Wikimedia Commons, where both the general public and specialists can describe them in much greater detail, such as adding a title, creator, date (if different from the date of publication), and subjects. In the Share stage, the metadata for the illustrations will be reingested into the BHL portal for searching there. And of course we want to be able to preserve any contributed metadata from external platforms. We also want to broaden the audience for these illustrations because we believe they have a wide appeal to artists, biologists, humanities scholars (particularly historians of science), librarians, and those in education and outreach. Many of these audiences don’t know about BHL and won’t go to the BHL platform looking for the content, so we want to push the illustrations out to environments where they already are: Encyclopedia of Life, ARTstor, and even iTunes.
  • A challenge for this project will be to identify the schema, or perhaps schemas, that can serve the metadata needs of a mix of audiences, as shown. For example, an art historian reviewing an illustration may be interested in knowing the artist and the geographic location where the work was created in order to understand how the artist was influenced by his or her locality. A scientist considering the same illustration may be interested in knowing the species name and geographic distribution of the organism depicted in order to compare the development of the species with related species from that area. Both have a need for the geographic metadata contained within the text, but from different perspectives. Since we wanted to push these illustrations out into other platforms for crowdsourcing the descriptions and then bring that metadata back into the BHL platform, we needed a schema that would guide users in what information to contribute and how to record it, and that would create some consistency in those descriptions so they are easier to bring back to BHL. Rather than inventing a new schema from scratch, we wanted to adopt an existing schema or schemas so that when we shared the described illustrations beyond the BHL, the metadata could easily interoperate with data in other systems.
  • We ended up choosing most of the schema elements from the VRA Core because its elements and attributes were most closely aligned with the types of information we felt were important to record, and also because its model of works linking to one or more images fit nicely with BHL pages, which often contain one or more illustrations on a single page. The only thing the VRA Core lacked was a way to record an acceptedName and a CommonName for a species. VRA Core has a subject attribute type of scientificName, but taxonomists would be interested in knowing the multiple names by which a species is known. Darwin Core was able to fulfill this need, so we borrowed 2 elements from that schema.
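The four workflow stages described in the notes above (Extract, Classify, Describe, Share) can be sketched as a small state model. This is only an illustrative sketch, not the project's software: the names `Stage`, `Illustration`, `classify`, `describe`, and `share` are hypothetical, and the rule that each stage must follow the previous one is an assumption drawn from the order in which the notes present them.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Stage(Enum):
    """The four workflow stages named in the notes, in order."""
    EXTRACT = 1   # an algorithm finds pages that contain illustrations
    CLASSIFY = 2  # staff tag broad types (drawing/painting, photograph, ...)
    DESCRIBE = 3  # the public and specialists add title, creator, date, subjects
    SHARE = 4     # contributed metadata is reingested into the BHL portal


@dataclass
class Illustration:
    """Hypothetical record for one illustration found on a BHL page."""
    page_id: int
    stage: Stage = Stage.EXTRACT
    broad_types: List[str] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


def classify(item: Illustration, broad_types: List[str]) -> None:
    """Tag an extracted illustration with one or several broad types."""
    if item.stage is not Stage.EXTRACT:
        raise ValueError("only extracted illustrations can be classified")
    item.broad_types = list(broad_types)
    item.stage = Stage.CLASSIFY


def describe(item: Illustration, **contributed: str) -> None:
    """Merge crowdsourced descriptive metadata; may be called repeatedly."""
    if item.stage not in (Stage.CLASSIFY, Stage.DESCRIBE):
        raise ValueError("illustration must be classified before description")
    item.metadata.update(contributed)
    item.stage = Stage.DESCRIBE


def share(item: Illustration) -> Dict[str, object]:
    """Return the record to be brought back into the portal (assumed shape)."""
    if item.stage is not Stage.DESCRIBE:
        raise ValueError("illustration must be described before sharing")
    item.stage = Stage.SHARE
    return {
        "page_id": item.page_id,
        "broad_types": list(item.broad_types),
        "metadata": dict(item.metadata),
    }
```

For example, `item = Illustration(page_id=17195895)` followed by `classify(item, ["drawing/painting"])`, `describe(item, title="Stictospiza formosa")`, and `share(item)` walks one illustration through all four stages. Keeping the stage on the record makes it easy to see which illustrations are still waiting for description.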
  • Transcript

    • 1. All creatures great and small: metadata for biodiversity illustrations. DLF Forum 2012, Denver CO. Trish Rose-Sandler, Missouri Botanical Garden, Art of Life project
    • 2. What is Art of Life?
      • Full title: The Art of Life: Data Mining and Crowdsourcing the Identification and Description of Natural History Illustrations from the Biodiversity Heritage Library (BHL)
      • Grant given to Missouri Botanical Garden in St Louis
      • Funded by National Endowment for the Humanities
      • Runs May 2012-April 2014
    • 3. What is BHL?
      • A consortium of 14 natural history, botanical libraries and research institutions
      • An open access digital library for legacy biodiversity literature
      • An open data repository of taxonomic names and bibliographic information
      • An increasingly global effort
    • 5. Why the need for Art of Life? Problem statement: users want access to images; access to images is limited to page-by-page scroll or viewing a selection of images in Flickr; images are not searchable by content (e.g. corn, Zea mays)
    • 6. Table of contents; page by page scroll
    • 8. 5 Primary Objectives of Art of Life
      • Objective 1: Define an appropriate metadata schema for natural history illustrations
      • Objective 2: Build software tools to automatically identify illustrations in the BHL corpus
      • Objective 3: Enhance existing tools to enable the initial sorting, viewing, and editing of these identified visual resources
      • Objective 4: Integrate tagging applications to enable a community of users to edit descriptive metadata for the illustrations
      • Objective 5: Integrate the descriptive metadata generated by users back into BHL portal both for access and preservation
    • 10. Current status of Art of Life
      • Development of the algorithm is about 70% complete and will be done by Jan 2013
      • Draft schema available for public review: http://tinyurl.com/9hm7nsb
      • Classifier tool: reusing an existing BHL tool developed by Joel Richard called Macaw, http://code.google.com/p/macaw-book-metadata-tool/
    • 11. Art of Life Schema. Needs to support three objectives: (1) to enable the discovery, description and use of the identified images by artists, biologists, humanities scholars, librarians, and educators; (2) to make BHL’s metadata and images available to other platforms; and (3) to import crowdsourced metadata generated in other platforms back into BHL.
    • 12. Schema landscape review
      • VRA Core 4.0 (borrowed 9 elements)
      • LIDO
      • Darwin Core (borrowed 2 elements)
      • Dublin Core
    • 13. ART OF LIFE SCHEMA ELEMENTS (red = required): Title, Type, Date, Copyright, Source, Agent, Subjects, Description, Inscription
    • 14. Example of illustration described using Art of Life schema
      • Title: Stictospiza formosa
      • Type: Paintings
      • Date: Publication: 1898
      • Agent: Author: Arthur G. Butler (1844-1925); Illustrator: F.W. Frohawk (1861-1946)
      • Description: A pair of finches with green and yellow bodies resting on reeds
      • Subjects: Scientific name: Amandava formosa (Latham, 1790); Vernacular Name: Green Avadavat or Green Munia; Accepted Name: Amandava formosa (Latham, 1790); Birds, finches
      • Inscriptions: bottom center: Green Amaduvade Waxbill (Stictospiza formosa)
      • Source: Butler, Arthur Gardiner. Foreign finches in captivity. Hull and London: Brumby and Clarke, limited, 1889 (2nd edition). This image comes from the Biodiversity Heritage Library, and is available online at biodiversitylibrary.org/page/17195895
      • Rights: Public domain
    • 15. Thanks to Art of Life team!
      • Co-PIs: Doug Holland and Trish Rose-Sandler, from Missouri Botanical Garden
      • Algorithm development: Ed Bachta and Charlie Moad, from Indianapolis Museum of Art
      • Schema development: Gaurav Vaidya and Robert Guralnick, from University of Colorado, Boulder; William Ulate, from Missouri Botanical Garden
      • Programming: Mike Lichtenberg, Missouri Botanical Garden
      • Former PI for Art of Life: Chris Freeland, Washington University
    • 16. Interested? Here’s how you can help
      • We welcome your feedback on the schema! http://tinyurl.com/9hm7nsb
      • If you know of scholars and users who would be interested in these types of images and in participating in either our survey or a brief focus group about the schema, please have them contact me: trish.rose-sandler@mobot.org
      • Would love to talk with other folks about their experiences with crowdsourcing of metadata, particularly if you’ve used Flickr or Wikimedia Commons
    • 17. Thanks! For more info: http://biodivlib.wikispaces.com/Art+of+Life. Contact: tweet @trosesandler, trish.rose-sandler@mobot.org
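The example record on slide 14 can be written down as a small data structure, which may help when thinking about how crowdsourced descriptions could be checked before being brought back into BHL. This is a sketch under stated assumptions, not the project's schema implementation: the class and field names (`IllustrationRecord`, `Subjects`, `work_type`, `blank_elements`, and so on) are hypothetical, the values are copied from slide 14, and because the transcript does not preserve which elements are required, the helper only reports which elements are blank rather than enforcing any of them.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Subjects:
    scientific_name: str = ""
    vernacular_name: str = ""  # one of the two names borrowed from Darwin Core
    accepted_name: str = ""    # the other name borrowed from Darwin Core
    keywords: List[str] = field(default_factory=list)


@dataclass
class IllustrationRecord:
    """Hypothetical container for the elements shown on slide 14."""
    title: str = ""
    work_type: str = ""
    date: str = ""
    agents: Dict[str, str] = field(default_factory=dict)  # role -> name
    description: str = ""
    subjects: Subjects = field(default_factory=Subjects)
    inscriptions: str = ""
    source: str = ""
    rights: str = ""


def _is_blank(value) -> bool:
    """True for empty strings and lists, and for dicts whose values are all blank."""
    if isinstance(value, dict):
        return all(_is_blank(v) for v in value.values())
    return not value


def blank_elements(record: IllustrationRecord) -> List[str]:
    """List the top-level elements that have not been filled in yet."""
    return [name for name, value in asdict(record).items() if _is_blank(value)]


# The slide 14 example, transcribed as given in the slide text.
example = IllustrationRecord(
    title="Stictospiza formosa",
    work_type="Paintings",
    date="Publication: 1898",
    agents={
        "Author": "Arthur G. Butler (1844-1925)",
        "Illustrator": "F.W. Frohawk (1861-1946)",
    },
    description="A pair of finches with green and yellow bodies resting on reeds",
    subjects=Subjects(
        scientific_name="Amandava formosa (Latham, 1790)",
        vernacular_name="Green Avadavat or Green Munia",
        accepted_name="Amandava formosa (Latham, 1790)",
        keywords=["Birds", "finches"],
    ),
    inscriptions="bottom center: Green Amaduvade Waxbill (Stictospiza formosa)",
    source="Butler, Arthur Gardiner. Foreign finches in captivity. "
           "Hull and London: Brumby and Clarke, limited, 1889 (2nd edition).",
    rights="Public domain",
)
```

Calling `blank_elements(example)` returns an empty list because every element is filled in, while a record with only a title would list the remaining elements. That supports the "varying levels of information detail" mentioned in the abstract without rejecting partial descriptions.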