Let the Public and the Computer do the Metadata Work!


Presentation by Karen Cariani, WGBH Media Library and Archives Senior Director and Project Director for the American Archive of Public Broadcasting at the 2017 Association of Moving Image Archivists Conference in New Orleans.

Published in: Technology
  1. Karen Cariani, AAPB Project Director; WGBH Senior Director, WGBH Media Library & Archives. Let the Computer, and the Public, do the Metadata Work!
  2. The Library of Congress Packard Campus for Audio Visual Conservation; American Archive of Public Broadcasting
  3. WGBH Educational Foundation; American Archive of Public Broadcasting
  4. the situation
     • 72,000 digitized television and radio programs
     • incomplete, inaccurate metadata records
     • limited staff resources
     • we need to know what we have in the collection
     • we have a responsibility to users to provide access to the collection
     • continued growth of the collection (content and sparse metadata)
  5. the potential: transforming content into data
     • Computational Tools
     • Speech-to-text
     • Audio analysis
     • Image Analysis
     • Visualization of Data
     How can we use them?
  6. a crowdsourcing game. Casey Davis Kaufman, Associate Director, WGBH Media Library and Archives; Project Manager, AAPB
  7. AV crowdsourcing precedents
     • TiltFactor @ Dartmouth: “Metadata Games”
     • New York Public Library’s Together We Listen project & Transcript Editing Tool
     • Netherlands Institute for Sound and Vision
  8. user population
     • General public
     • Public media fans
     • K-12 students
     • Senior Citizens
     • People seeking to develop editing skills
     • People seeking volunteer opportunities
  9. game pipeline: (1) Identify errors, (2) Suggest corrections, (3) Validate corrections
  10. game improvement targets
     • Change algorithm and game pipeline to get transcripts through the game quicker
     • Update Rules page to allow more leniency in corrections. Communicate that we’re looking for acceptable corrections, not perfection.
     • Add ability for AAPB staff to prioritize transcripts in the game
     • Remove the preferences feature
     • Update API to help AAPB staff determine more easily which transcripts are ready to come out of the game.
  11. lessons learned
     • Ensure that all team members understand the overall goals of the project from the beginning
     • Ensure that all relevant team members are involved in developing the game flow concepts and API
     • Stay involved in all decision-making – don’t trust that the developers/contractors will make all the right decisions
     • Test, test, test!!
  12. once corrected…
     • JSON transcripts will be stored on AAPB’s Amazon S3 account
     • Transcripts will be indexed for keyword searching on the AAPB website
     • Transcripts will be made available alongside the media on the record page
     • Transcripts can play as captions within the player
     • Transcripts can be harvested via an API and used as a dataset for research such as a digital humanities project
  13. usability & ux research questions
     • Do users understand the workflow of the game?
     • Do users understand the iconography?
     • How do users feel about interacting with random transcripts rather than choosing a specific transcript to work on?
     • How do users feel about interacting with small bits of transcripts rather than a full transcript at once?
     • What is the overall user experience when playing the game?
     • What is the overall satisfaction level in playing the game?
  14. future plans
  15. @amarchivepub #FixItAAPB. Come to our editathon! Friday, 5:45 – 6:45 pm, Room: Arcadian I. Treats and prizes!
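
The three-step game pipeline from slide 9 (identify errors, suggest corrections, validate corrections) can be pictured as a small state machine over transcript phrases. The sketch below is illustrative only and is not the AAPB game's actual code: the `Phrase` class, the `Stage` names, the two thresholds, and the `transcript_ready` check are all hypothetical, since the slides do not say how many players must agree or how staff decide that a transcript can leave the game.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    IDENTIFY = 1  # players flag a phrase as containing an error
    SUGGEST = 2   # a player proposes corrected text
    VALIDATE = 3  # players vote on the proposed correction
    DONE = 4      # correction accepted; no further play needed


# Hypothetical thresholds; the slides do not give the real values.
FLAGS_NEEDED = 2
VOTES_NEEDED = 2


@dataclass
class Phrase:
    text: str
    stage: Stage = Stage.IDENTIFY
    flags: int = 0
    suggestion: Optional[str] = None
    votes: int = 0

    def flag_error(self) -> None:
        """Step 1: count an error report; enough reports open the phrase for correction."""
        if self.stage is not Stage.IDENTIFY:
            return
        self.flags += 1
        if self.flags >= FLAGS_NEEDED:
            self.stage = Stage.SUGGEST

    def suggest(self, corrected: str) -> None:
        """Step 2: record a proposed correction and send it to validation."""
        if self.stage is not Stage.SUGGEST:
            return
        self.suggestion = corrected
        self.stage = Stage.VALIDATE

    def validate(self, accept: bool) -> None:
        """Step 3: count an approving vote, or return the phrase for a new suggestion."""
        if self.stage is not Stage.VALIDATE:
            return
        if not accept:
            self.suggestion = None
            self.votes = 0
            self.stage = Stage.SUGGEST
            return
        self.votes += 1
        if self.votes >= VOTES_NEEDED:
            self.text = self.suggestion
            self.stage = Stage.DONE


def transcript_ready(phrases: List[Phrase]) -> bool:
    """Assumed readiness rule: no phrase is waiting on a suggestion or a vote."""
    return all(p.stage in (Stage.IDENTIFY, Stage.DONE) for p in phrases)
```

A phrase that nobody flags stays in the first stage and does not block the transcript, which is one possible reading of the slide 10 goal of accepting reasonable corrections rather than perfection.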