The Other Side of Linked Open Data: Managing Metadata Aggregation


Slides prepared for the ALCTS Metadata Interest Group at ALA Midwinter 2014.

Published in: Technology

  • If LOD exists in multiple versions, and nobody uses it, does it make noise?
  • Evaluation using a statistical analysis tool, from ‘Analyzing Metadata for Effective Use and Re-Use’ by Naomi Dushay and Diane I. Hillmann.
  • Revised diagram from ‘Orchestrating Metadata Enhancement Services: Introducing Lenny’ by Jon Phipps, Diane I. Hillmann, and Gordon Paynter. Note that ‘XForms’ in this context means ‘transforms’; the diagram was made well before the XForms standard gave the term a specific meaning.

    1. The Other Side of Linked Data: Managing Metadata Aggregation
       ALCTS Metadata Interest Group
       ALA Midwinter 2014
    2. Where Are We Now?
       • Major projects so far focused on exposing selected portions of their data for ‘experimentation’
         – Who’s using this data?
         – Can LOD for libraries succeed on that basis?
       • LOD is not just outputs, needs actual use to inform practice
         – A more complete view of the environment and workflow should help
    3. Outline
       • Limitations of the traditional database strategy
         – Including records, normalization, de-duplication, etc.
       • Components of a fuller view
         – Workflow
         – Inputs, outputs
         – Data cache and services
         – Need for automated orchestration
         – The maintenance conundrum
    4. Substituting a Cache for a Database
       • Supports multiple streams of data
       • Allows detailed provenance to be carried over time
       • Separates services from data storage
       • Allows more extensive automation (and orchestration of services)
       • Focuses valuable human effort where it’s needed: analysis, design and implementation of improvement services
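One way to picture the cache described on this slide is as an append-only list of statements, each carrying its own provenance. The sketch below is only an illustration under that assumption; the `Statement` and `StatementCache` names, the field names, and the sample data are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Statement:
    """One assertion about a resource, with its provenance."""
    subject: str        # identifier of the described resource
    predicate: str      # property name
    value: str          # asserted value
    source: str         # data stream the statement came from
    recorded: datetime  # when the statement entered the cache


class StatementCache:
    """Append-only store; services read from it and add to it."""

    def __init__(self):
        self._statements = []

    def add(self, subject, predicate, value, source):
        stmt = Statement(subject, predicate, value, source,
                         datetime.now(timezone.utc))
        self._statements.append(stmt)
        return stmt

    def about(self, subject):
        """Everything known about one resource, from every stream."""
        return [s for s in self._statements if s.subject == subject]


# Two hypothetical streams describe the same resource; both are kept.
cache = StatementCache()
cache.add("book:1", "title", "Moby Dick", source="library-a")
cache.add("book:1", "title", "Moby-Dick; or, The Whale", source="library-b")
sources = sorted({s.source for s in cache.about("book:1")})
```

Because the cache only stores and retrieves statements, any evaluation or improvement service can live outside it and be changed without touching the stored data.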
    5. Workflow
       • Obtain data (possibly as ‘records’)
       • Store data as statements in cache
       • Evaluate data by source or collection
       • Improve data using specific services, as determined by evaluation
       • Publish improved data
       • [Rinse, repeat]
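The first three workflow steps can be sketched in a few lines: take records in, break them into statements, and evaluate one property per source. This is a minimal sketch, not the presenters' tooling; the record layout, the `to_statements` and `missing_rate_by_source` helpers, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical incoming records from two sources.
records = [
    {"id": "r1", "source": "library-a", "title": "Moby Dick", "date": "1851"},
    {"id": "r2", "source": "library-a", "title": "Typee", "date": ""},
    {"id": "r3", "source": "library-b", "title": "Omoo", "date": ""},
]


def to_statements(record):
    """Explode one record into (subject, property, value, source) statements."""
    return [
        (record["id"], prop, value, record["source"])
        for prop, value in record.items()
        if prop not in ("id", "source")
    ]


def missing_rate_by_source(statements, prop):
    """Evaluate: share of statements for `prop` that are empty, per source."""
    total = defaultdict(int)
    empty = defaultdict(int)
    for _subject, p, value, source in statements:
        if p == prop:
            total[source] += 1
            if not value.strip():
                empty[source] += 1
    return {source: empty[source] / total[source] for source in total}


# Store every record as statements, then evaluate the 'date' property.
cache = [stmt for record in records for stmt in to_statements(record)]
report = missing_rate_by_source(cache, "date")
```

A report like this is what would decide which improvement services to run against which source before publishing.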
    6. Yellow = Data we use now
       Green = Data we’re adding
    7. Yellow = Data we share now
       Orange = Data we propose to share
       Green = Data categories we can share
    8. Developing and Defining Services
       • Small single purpose services are easier to develop and maintain
         – What services you need are determined by goals, evaluation results, etc.
         – ‘Orchestration’ of services applies them to specific kinds of data, in order
         – Services can be described, and linked, to expose who, what, when and how to downstream users
    9. Developing Automated Interaction
       • Rule: Use humans for things requiring human understanding and decision making
         – Use machines for everything else
         – A manual process for something a machine can do as well or better is a failure
       • Improvement services can be granular, invoked in prescribed order, and report results for later use
         – Continuous improvement necessary to respond to continuous change
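Small single-purpose services, applied to specific kinds of data in a prescribed order and reporting what they did, might look roughly like the sketch below. The three services, the `PLAN` table, and the `orchestrate` function are hypothetical examples chosen for brevity, not services named in the presentation.

```python
def trim_whitespace(value):
    return value.strip()


def collapse_spaces(value):
    return " ".join(value.split())


def strip_trailing_period(value):
    return value[:-1] if value.endswith(".") else value


# Hypothetical orchestration plan: which services apply to which
# property, and in what order.
PLAN = {
    "title": [trim_whitespace, collapse_spaces, strip_trailing_period],
    "date": [trim_whitespace],
}


def orchestrate(prop, value, plan=PLAN):
    """Run each service for `prop` in order; log the ones that changed the value."""
    log = []
    for service in plan.get(prop, []):
        improved = service(value)
        if improved != value:
            log.append((service.__name__, value, improved))
        value = improved
    return value, log


improved, log = orchestrate("title", "  Moby   Dick. ")
changed_by = [name for name, _before, _after in log]
```

The log is the part that supports later use: it records which service changed which value, so the result can be stored with provenance rather than silently replacing the input.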
    10. Data Maintenance
       • Improved data returns as statements to the data cache, with provenance attached
       • Statement strategy keeps newly harvested data from overwriting ‘improved’ data
       • Each new statement adds to what is known about a described resource
       • Statements can be cherry picked and exposed to others as statements or records, in ‘flavors’ or as ‘everything we have’
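To make the maintenance idea concrete, the sketch below appends a re-harvested statement next to an improved one and then cherry-picks statements by provenance to build different ‘flavors’ of a record. It is an illustration under assumed conventions: the tuple layout, the provenance labels, and the `pick` helper are hypothetical.

```python
# Hypothetical statements: (subject, property, value, provenance)
cache = [
    ("book:1", "title", "Moby Dick.", "harvest:library-a"),
    ("book:1", "date", "1851", "harvest:library-a"),
    ("book:1", "title", "Moby Dick", "service:strip_trailing_period"),
]

# A fresh harvest appends; it does not overwrite the improved title.
cache.append(("book:1", "title", "Moby Dick.", "harvest:library-a"))


def pick(statements, subject, preferred):
    """Build a record for `subject`, preferring provenance earlier in `preferred`.

    Statements whose provenance is not listed are left out of this flavor.
    """
    rank = {prov: i for i, prov in enumerate(preferred)}
    best = {}
    for subj, prop, value, prov in statements:
        if subj != subject or prov not in rank:
            continue
        if prop not in best or rank[prov] < rank[best[prop][1]]:
            best[prop] = (value, prov)
    return {prop: value for prop, (value, _prov) in best.items()}


improved_flavor = pick(cache, "book:1",
                       ["service:strip_trailing_period", "harvest:library-a"])
raw_flavor = pick(cache, "book:1", ["harvest:library-a"])
everything = [s for s in cache if s[0] == "book:1"]
```

Here the same cache yields an improved record, a record as harvested, or every statement held, depending only on which provenance the consumer chooses to trust.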
    11. Contact Information
       Diane Hillmann
       Gordon Dunsire
       Jon Phipps
       The First MetadataMobile