American Archive of Public Broadcasting. Karen Cariani, Casey E. Davis, WGBH. Alan Gevinson, Library of Congress


FIAT/IFTA World Conference 2015 Vienna


  1. 1. American Archive of Public Broadcasting • Karen Cariani, WGBH • Casey E. Davis, WGBH • Alan Gevinson, Library of Congress
  2. 2. @amarchivepub
  3. 3. Karen Cariani, AAPB Project Director, WGBH @kcariani
  4. 4. Joint collaboration between WGBH and the Library of Congress
  5. 5. What is it? • An unprecedented and historic collection of American public radio and television content • Dates back through the 1950s • Preserved and made available to the public
  6. 6. Who are we? WGBH Media Library and Archives
  7. 7. Library of Congress
  8. 8. Vision • The American Archive of Public Broadcasting seeks to preserve and make accessible significant historical content created by public media, and to coordinate a national effort to save at-risk public media before its content is lost to posterity.
  9. 9. Initial Collection • 40,000 hours of digital material initially from over 100 stations – 5,000 hours from born digital files • 2.5 million inventory records from 120 stations • Identified over 3 million items kept at stations, archives, producers, university collections across the country
  10. 10. Challenges of born digital media • Varying file formats • File failure when moving the files from one location to another – file corruption • File naming • Wrappers, codecs, bits, and more
  11. 11. Current Status • Accession the 40,000 hours of digitized files into the LOC systems • Launched a website for public access to the 2.5 million records from the inventory project • Public access to all proxy files on location at WGBH and LOC
  12. 12. Launched New Projects • NET Catalog – Build a national inventory of titles created for National Educational Television (pre-PBS) • National Digital Stewardship Residency – Expand the NDSR program to include geographically diverse residencies at organizations with public media collections
  13. 13. Newly Funded • Collaboration with Pop-Up Archive – Create transcripts of all 40,000 hours using speech to text tools – Create metadata games to improve transcripts via crowdsourcing – Create audio fingerprint database
  14. 14. Long term goals • Grow the collection by adding new inventory records and digitized materials • Help public media organizations with archiving, digitizing, and access to their collections • Build a consortium for preservation and access of public media archive content
  15. 15. And PBCore! • Responsible for further development of PBCore • AMIA PBCore Advisory Subcommittee – Schema – Documentation – Education – Website – Communications
  16. 16. PBCore • Continue the development of PBCore as a standard for media materials • Re-engage the PBCore community for input in the continued development • Outreach to new adopters of PBCore • Collaborate with EBUCore to develop an RDF ontology for PBCore
  17. 17. Next steps • Allow public on-line access to as much as possible, rights permitting, to the proxy files – Check out website Nov 1!!!! • Develop Sustainability plan – Fundraise! • Enhance website
  18. 18. Casey Davis, AAPB Project Manager, WGBH @caseyedavis1
  19. 19. Website Development
  20. 20. Website Development • Initial launch in April 2015, with access to 2.5 million inventory records • Responsive design, accessible on desktop, mobile and tablet • Further developing site to allow for streaming video and audio in our Online Reading Room • Now, all video and audio is accessible at WGBH & Library of Congress through implementation of IP restrictions
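The IP restriction mentioned on this slide can be pictured with a small check like the one below. This is only a sketch of the general technique, not the AAPB site's actual code: the network ranges are reserved documentation addresses, and the names `ON_SITE_NETWORKS` and `can_stream` are hypothetical.

```python
import ipaddress

# Hypothetical on-site networks. These are reserved documentation ranges,
# not real WGBH or Library of Congress addresses.
ON_SITE_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def can_stream(client_ip: str) -> bool:
    """Return True when the client address falls inside an on-site network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that is not a valid IP address is refused.
        return False
    return any(address in network for network in ON_SITE_NETWORKS)


print(can_stream("192.0.2.17"))   # inside the first hypothetical network
print(can_stream("203.0.113.5"))  # outside both networks
```

A web application would typically run a check of this kind before serving a proxy file, and fall back to showing the catalog record alone for off-site visitors.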
  21. 21. Data Audio and Video
  22. 22.
  23. 23. Metadata
  24. 24. • Public Broadcasting Metadata Dictionary – Descriptive – Administrative – Technical/Instantiation – Extensions
  25. 25. Archival Management System (AMS) • Developed by AVPreserve • LAMP stack application • Licensed under GPLv3 • Create, store and manage descriptive, technical and preservation metadata • Stations can log in to access their metadata and stream content • Digitization vendor sends technical metadata to server using BagIt specification; preservation metadata sent to server through Google Spreadsheets API • Provides PBCore and CSV import and export • PBCore and PREMIS REST API
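BagIt packages carry a checksum manifest alongside the payload, which is what makes them useful against the file corruption described earlier in the deck. The sketch below shows how a receiving server might check a bag against its `manifest-<algorithm>.txt` file. It is an illustration under assumptions, not the AMS implementation: the function names are hypothetical, and it ignores tag manifests and the percent-encoding rules of the full specification.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path, algorithm: str) -> str:
    """Hash a file in 1 MiB chunks so large media files are not loaded whole."""
    hasher = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_bag(bag_dir: str, algorithm: str = "md5") -> List[str]:
    """Return payload paths that are missing or whose checksum does not match."""
    bag = Path(bag_dir)
    manifest = bag / f"manifest-{algorithm}.txt"
    failures = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum> <path relative to the bag root>".
        expected, relative_path = line.split(maxsplit=1)
        payload = bag / relative_path
        if not payload.is_file() or file_digest(payload, algorithm) != expected.lower():
            failures.append(relative_path)
    return failures
```

An empty list means every payload file listed in the manifest arrived intact; any returned path is a candidate for re-transfer.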
  26. 26. To fully catalog the AAPB collection, it would take one person 32 years working full time. So now we’re focusing on normalization & MVC. • AAPB Phases of Cataloging – Inventory (item-level) – Normalization: formatting dates, titles, splitting out types of titles; formatting existing data at a high level in CSV – Minimum Viable Cataloging (MVC): one-by-one cataloging, 15 minutes per item, spending most of the time reviewing opening and closing credits; adding dates, titles, creators, contributors, publishers, copyright information, topic, and genre (format). This will take approx. 6 years. – Full Cataloging: I’ll come back as a volunteer and do this after I retire
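The normalization phase described above, formatting dates across existing CSV data, might look something like the following sketch. The input formats in `KNOWN_FORMATS` are assumptions for illustration; the slides do not say which date formats actually appear in the station inventories, and `normalize_date` is a hypothetical helper name.

```python
from datetime import datetime
from typing import Optional

# Assumed input formats. The formats really present in the inventory data
# are not listed in the slides.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y"]


def normalize_date(raw: str) -> Optional[str]:
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    # Leave unrecognized values for a person to review rather than guessing.
    return None


for value in ["3/7/1968", "March 7, 1968", "circa 1968"]:
    print(repr(value), "->", normalize_date(value))
```

Returning `None` for values such as "circa 1968" keeps uncertain dates out of the normalized column, so they can be handled during one-by-one cataloging instead.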
  27. 27. Metadata • Titles • Contributing Organization • Identifiers • Description • Date • Asset Type • Genres • Creators • Contributors • Publishers • Media Type • Copyright • Duration This is the metadata we expose to users on the public-facing website at americanarchiv
  28. 28. Our advanced search, following the Google model.
  29. 29. A way for users to explore the collection by geographic location.
  30. 30. A way for users who don’t know what they are looking for to dive into the collection.
  31. 31. There’s a lot of stuff on our site. Keeping the access facet always expanded w/ help text may help users
  32. 32. Online Reading Room
  33. 33. Steps to Determining ORR Access • October 2014 Rights Meeting: reviewed many of the challenges facing access to moving image and sound archives. Read about it here: eting/ • Reviewed existing agreements with stations • Sent quit claims to donors – 75% signed • Series-level “bucket analysis” – identified buckets, or categories, of content, i.e., news magazine, live music performance, interviews with public figures, event coverage, etc. Decision: ORR or On Location
  34. 34. Alan Gevinson, AAPB Project Director, Library of Congress
  35. 35. The American Archive of Public Broadcasting Access and Use Place and Time
  36. 36. Access and Use: Place • Massachusetts: 27% • Maryland: 11% • Minnesota: 8% • New York: 7% • California: 6% • New Jersey: 5% • Montana: 4% • South Carolina: 4% • Illinois: 4% • Wisconsin: 3% • North Carolina: 2% • Connecticut: 2% • 38 states and 2 non-states (DC and Guam) submitted materials for digitization • Average percentage of total per state (and non-states): 1.5% • 12 states (on right) contributed above average amounts • 12 states did not participate
  37. 37. Access and Use: Place, By Region (assets, % of total) • Northeast: 19,202 (31.1%) • Mid-Atlantic: 14,974 (24.2%) • Midwest: 11,262 (18.2%) • West: 8,187 (13.3%) • South: 7,545 (12.2%) • Pacific (AK, HI, GU): 905 (1.5%)
  38. 38. Access and Use: Time Dates of creation or broadcast in 45% of 62,000 records • nearly 1,000 files from the 1950s • around 3,400 from the 1960s • 2,900 from the 1970s • 6,300 from the 1980s • 7,800 from the 1990s • and 6,800 from the 21st century
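As a quick arithmetic check, the decade counts on this slide can be summed and compared with the stated 45% of 62,000 records. The figures below are the slide's own rounded values ("nearly 1,000", "around 3,400"), so the result is approximate.

```python
# Approximate counts as given on the slide; several are rounded there.
dated_by_decade = {
    "1950s": 1_000,
    "1960s": 3_400,
    "1970s": 2_900,
    "1980s": 6_300,
    "1990s": 7_800,
    "21st century": 6_800,
}
TOTAL_RECORDS = 62_000

dated_total = sum(dated_by_decade.values())
dated_share = dated_total / TOTAL_RECORDS

# prints "28,200 dated records, 45.5% of 62,000"
print(f"{dated_total:,} dated records, {dated_share:.1%} of {TOTAL_RECORDS:,}")
```

The rounded decade counts add up to about 28,200 records, which agrees with the 45% figure on the slide.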
  39. 39. AAPB documents • national history • regional history • local history • news • public affairs • civic affairs • religion • education • environmental issues • music • art • literature • filmmaking • dance • poetry
  40. 40. AAPB can be of value for scholarship because of ... Geographical breadth • to uncover ways that national and global processes played out on the local scene Chronological reach • to document change (or stasis) over time
  41. 41. Scholars’ complaints • “I have long been frustrated ... gaining access to the vast audiovisual record of my period” • “content [is] held in relative obscurity by the TV and radio networks and the public TV stations” • “programs remain almost impossible to locate and access ... locked in the collections of its many member stations” • “Working to document recent American history without access to the pictures has been a real challenge” • “key historical moments and events are lost to us forever”
  42. 42. AAPB can be of value for scholarship because ... • scholarship pertaining to the period of 1973 onwards is “limited, fragmentary, and politically conflicted” • for the 1980s, “the archival and monographic work … has not yet been done” • accounts about the 1990s and later have “not really been history” Kim Phillips-Fein, “1973 to the Present,” in American History Now (2011)
  43. 43. The Importance of Local History ... • “emphasis on diversity” • “the history of the nation is many different stories, no one of which can be considered the ‘main’ story” • a “skepticism about finding common definitions of American nationalism or discovering common values” among many historians of the 1960s and 1970s History from the bottom up (quotes from Alan Brinkley)
  44. 44. The Importance of Local History for ... • relating “national experiences to larger processes and local resolutions.” Thomas Bender Rethinking American History in a Global Age (2002)
  45. 45. The Importance of Local Civil Rights History ... • “publication of local and state studies ... marked a major shift in the field” • “called into question many of the top-down generalizations” • “studying the importance of the movement’s local, indigenous base fundamentally alters our picture of the movement and its significance” • a “bottom-up perspective” can expose “students to a world beyond their immediate experience” Emilye Crosby, ed. Civil Rights History from the Ground Up (2011)
  46. 46. Civil Rights Movement radio stories in AAPB from ... • Charleston, SC 1960 • Tallahassee, FL 1959-61 • McComb, MS 1961 • Baton Rouge, LA 1961 • Monroe, NC 1961 • Birmingham, AL 1963 • Yellow Springs, OH 1963 • Savannah, GA 1963 • St. Augustine, FL 1964 • MS Freedom Summer 1964 • Natchez, MS 1965
  47. 47. Improving Scholarly Access to AAPB • Create transcripts • Enlist subject specialists to determine searchable topics, key words, and phrases • Enlist IT specialists to query transcripts using subject specialist vocabulary • Enhance display to relate programs from a variety of localities and time periods
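Querying transcripts with a subject specialist's vocabulary could start from something as simple as the phrase matcher sketched below. The vocabulary, the sample transcript, and the function name `find_topics` are all invented for illustration; a production system would more likely sit on a full-text search index.

```python
import re
from typing import Dict, List


def find_topics(transcript: str, vocabulary: Dict[str, List[str]]) -> Dict[str, int]:
    """Count case-insensitive whole-phrase matches for each topic."""
    hits = {}
    for topic, phrases in vocabulary.items():
        count = 0
        for phrase in phrases:
            # \b keeps "sit-in" from matching inside a longer word.
            pattern = r"\b" + re.escape(phrase) + r"\b"
            count += len(re.findall(pattern, transcript, flags=re.IGNORECASE))
        if count:
            hits[topic] = count
    return hits


# Hypothetical vocabulary and transcript text.
vocabulary = {
    "civil rights": ["sit-in", "voter registration"],
    "environment": ["pollution"],
}
transcript = (
    "Students organized a sit-in downtown; voter registration drives "
    "followed. Another Sit-In came later."
)
print(find_topics(transcript, vocabulary))  # prints {'civil rights': 3}
```

Running the same vocabulary over transcripts from different stations and decades is one way to relate programs across localities and time periods, as the last bullet suggests.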