The Benefits of Linking Metadata for Internal and External users of an Audiovisual Archive

Slides for the MTSR2018 presentation of the paper “The Benefits of Linking Metadata for Internal and External users of an Audiovisual Archive” by Victor de Boer, Tim de Bruyn, John Brooks and Jesse de Vos.

Like other heritage institutions, audiovisual archives adopt structured vocabularies for their metadata management. With Semantic Web and Linked Data becoming increasingly stable and commonplace technologies, organizations are now looking at linking these vocabularies to external sources, for example Wikidata, DBpedia or GeoNames. However, the benefits of such endeavors to the organizations are generally underexplored. In this paper, we present an in-depth case study into the benefits of linking the “Common Thesaurus for Audiovisual Archives” (or GTAA) to the general-purpose dataset Wikidata. We do this by identifying various use cases for user groups that are both internal and external to the organization. We describe the use cases and various proof-of-concept prototypes that address them.

Published in: Education

  1. The Benefits of Linking Metadata for Internal and External users of an Audiovisual Archive • Victor de Boer, Tim de Bruyn, John Brooks, Jesse de Vos • With content from: M. Brinkerink, J. Oomen
  2. Some examples • Netherlands Institute for Sound and Vision (Beeld en Geluid)
  3. Individuals • Media professionals • Heritage/museum professionals • Teachers and pupils • Researchers
  4. Research & Development at NISV • R&D initiates, stimulates and facilitates research and development. At the same time it collects knowledge and practical examples from inside and outside the institute and offers these to knowledge organisations, colleagues in the sector and other interested parties.
  5. 11 petabytes of data • 65,000 hours of yearly ingest • Digitization of old material • Digital-born new content
  6. Moving away from silos • Image: CC-by-nc-nd https://www.flickr.com/photos/joinash/
  7. Linked Data • “Linking Open Data cloud diagram 2017”, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net
  8. ...TO CONTEXT: MUTUALLY CONNECTED COLLECTIONS... • 24-10-2018 • Connecting collections: topics, people, genres, etc. • Catalogue • Photos • B&GWiki • Programme guides • Internal: Video hyperlinking
  9. External: Networked heritage
  10. Why Linked Open Data • Machine-readable format • Standardized • Flexibility to connect heterogeneous data • Link what can be linked • Re-use and re-usability • OBJECT, EVENT, PLACE, TIME, PERSON, CONCEPT, PROVENANCE
  11. The Benefits of Linking Metadata for Internal and External users of an Audiovisual Archive • Victor de Boer, Tim de Bruyn, John Brooks, Jesse de Vos
  12. Gemeenschappelijke Thesaurus Audiovisuele Archieven (GTAA) • RDF SKOS thesaurus • ~160,000 terms used in object metadata fields: – ~3,800 Subjects – ~97,000 Persons – ~27,000 Names – ~14,000 Locations – 113 Genres – ~18,000 Makers • http://gtaa.beeldengeluid.nl/
  13. Example: Linking Dutch and Flemish collections • Alignment between the VRT thesaurus and GTAA • de Boer et al. Exploring Audiovisual Archives through Aligned Thesauri. Proceedings of MTSR2016
  14. Wikidata: General-purpose Knowledge Graph • “Collaboratively edited knowledge base” • Drives ‘facts’ in Wikipedia • 50 million items, 500 million statements (= triples) • Wikidata query service https://query.wikidata.org/
  15. Alignments: Mix ‘n’ Match tool • https://tools.wmflabs.org/mix-n-match/
  16. Mix ‘n’ Match results • 10,350 GTAA persons matched at time of writing; currently 45,000 • Based on labels and contextual information – skos:scopeNote, other • https://www.wikidata.org/wiki/Q37079
  17. Identifying value: Method • Analysis of the data • Interviews • Describing use cases • Partial implementation of cases • Validate with interviewees • Improve
  18. Properties of Linked Wikidata persons
  19. Internal and External use cases • Internal (Tim de Bruyn): interviews with NISV employees in 1) intake, 2) information management, 3) research and development • External (John Brooks): interviews with media scholars
  20. UC-I-1: Receiving an alert when the copyright on a person's work expires • Dutch copyright expiration laws • Using GTAA and Wikidata “Date of death” • Wikidata occupations (~800) • Manual selection of relevant occupations (fields of television, movies, radio, theater and music) • Google Calendar alert
  21. UC-I-2: Provide more information on a person appearing in an online story • NISV story platform • More information on a person in a story • Automatically generated description
  22. UC-I-3: Using Wikidata for story recommendation • Stories you might also like • Manually selected properties for semantic recommendation • Using properties as metadata • Threshold on matching properties
  23. UC-E-1: Exploratory extension of the CLARIAH Media Suite • http://mediasuite.clariah.nl/ • Media scholars • Combines datasets and analysis tools in an integrated workspace
  24. UC-E-1: Wikidata retrieval service • Based on interviews with five users of the Media Suite, focusing on 1) Drugs, 2) Sports, 3) Occupations, 4) History and 5) Disruptive media events • Exploratory search by properties • Send SPARQL query to Wikidata Query Service • Retrieve list of persons based on properties • View additional information (Wikidata/GTAA)
  25. UC-E-1: Exploratory extension of the CLARIAH Media Suite
  26. UC-E-1: Exploratory extension of the CLARIAH Media Suite
  27. UC-E-1: Validation • Interviewees • Four tasks – Sports, politics, disruptive media events • Share feedback – Discuss limitations – Propose improvements • Added value for exploratory search • Provides insight into background knowledge • Participants report a feeling of grasping the context • Data (in)completeness is a major issue
  28. Take home • Connecting archives to background information using Linked Data brings new possibilities for access to and analysis of content • Wikidata is becoming the de facto standard for generic background knowledge • Shift from tech push to user needs • We show added value for a variety of users • Data completeness and quality are (and will remain) key
  29. Thank you • v.de.boer@vu.nl • http://victordeboer.com • @victordeboer
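
Slide 20 (UC-I-1) combines a GTAA person, the Wikidata "date of death" and a manually selected set of occupations to raise an alert when copyright expires. The arithmetic can be sketched as below. This is a minimal illustration, not the prototype from the paper: it assumes the general Dutch/EU rule that protection runs for 70 years counted from 1 January of the year following the maker's death, and the `Person` records, GTAA identifiers and occupation labels are made up. Neighbouring rights and works with several makers follow other rules and are ignored here.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: 70 years of protection, counted from 1 January of the year
# after the maker's death (general Dutch/EU rule; not stated on the slides).
PROTECTION_YEARS = 70

# Hypothetical stand-in for the manually selected relevant occupations.
RELEVANT_OCCUPATIONS = {"actor", "television presenter", "composer", "film director"}


@dataclass(frozen=True)
class Person:
    gtaa_id: str            # hypothetical identifier format
    name: str
    date_of_death: date     # taken from Wikidata in the use case
    occupations: frozenset  # occupation labels taken from Wikidata


def public_domain_date(date_of_death: date, term_years: int = PROTECTION_YEARS) -> date:
    """First day on which the person's work is no longer protected."""
    return date(date_of_death.year + term_years + 1, 1, 1)


def expirations_in_year(persons, year):
    """Return (person, expiry date) pairs for relevant persons expiring in `year`."""
    hits = []
    for person in persons:
        # Skip persons without any of the selected occupations.
        if not (person.occupations & RELEVANT_OCCUPATIONS):
            continue
        expiry = public_domain_date(person.date_of_death)
        if expiry.year == year:
            hits.append((person, expiry))
    return sorted(hits, key=lambda pair: pair[0].name)


# Made-up example records, for illustration only.
PERSONS = [
    Person("gtaa:0001", "Example Actor", date(1955, 3, 2), frozenset({"actor"})),
    Person("gtaa:0002", "Example Politician", date(1955, 7, 9), frozenset({"politician"})),
    Person("gtaa:0003", "Example Composer", date(1960, 1, 1), frozenset({"composer"})),
]

if __name__ == "__main__":
    for person, expiry in expirations_in_year(PERSONS, 2026):
        print(f"{person.name} ({person.gtaa_id}): public domain from {expiry.isoformat()}")
```

Each resulting pair could then be turned into a calendar entry, which is the role Google Calendar plays on the slide.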

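Slide 24 (UC-E-1) describes sending a SPARQL query to the Wikidata Query Service to retrieve persons by their properties. The sketch below shows one way such a query could be assembled and sent; it is not the Media Suite implementation. The function names are hypothetical, and the use of `P1741` as the Wikidata property holding the GTAA identifier is an assumption that should be checked against Wikidata before use. `P31`/`Q5` (instance of human) and `P106` (occupation) are standard Wikidata identifiers.

```python
import json
import re
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
GTAA_ID_PROPERTY = "P1741"  # assumption: Wikidata property for the GTAA ID

_PROPERTY_ID = re.compile(r"^P[1-9]\d*$")
_ITEM_ID = re.compile(r"^Q[1-9]\d*$")


def build_person_query(property_values, limit=50):
    """Build a SPARQL query for GTAA-linked humans matching all given property/item pairs."""
    lines = [
        "SELECT ?person ?personLabel ?gtaa WHERE {",
        "  ?person wdt:P31 wd:Q5 .",                     # instance of: human
        f"  ?person wdt:{GTAA_ID_PROPERTY} ?gtaa .",     # only persons linked to GTAA
    ]
    for prop, item in sorted(property_values.items()):
        # Reject anything that is not a plain Wikidata identifier.
        if not _PROPERTY_ID.match(prop) or not _ITEM_ID.match(item):
            raise ValueError(f"not a Wikidata property/item pair: {prop!r}, {item!r}")
        lines.append(f"  ?person wdt:{prop} wd:{item} .")
    lines.append('  SERVICE wikibase:label { bd:serviceParam wikibase:language "nl,en" . }')
    lines.append("}")
    lines.append(f"LIMIT {int(limit)}")
    return "\n".join(lines)


def run_query(query, user_agent="gtaa-wikidata-sketch/0.1 (illustrative example)"):
    """Send the query to the Wikidata Query Service and return the result bindings."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url,
        headers={"Accept": "application/sparql-results+json", "User-Agent": user_agent},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    # Example: GTAA-linked persons with occupation (P106) actor (Q33999).
    print(build_person_query({"P106": "Q33999"}, limit=10))
```

Keeping query construction separate from the network call makes the generated SPARQL easy to inspect, or to paste into https://query.wikidata.org/, before anything is sent.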