Update and Thoughts on Directions for Metadata Work

    Presentation Transcript

    • Update and Thoughts on Directions for Metadata Work (Carol Hert, March 17, 2003)
    • Our Metadata Activities
      • User study to understand metadata necessary for integration tasks (we’re finding needs for metadata not available in agencies)
      • Ongoing efforts to understand DDI and ISO11179 for deploying in end-user tools
      • Identification of a host of other relevant standards (open archives, business XML, Z39.50, …)
      • Marked-up tables using DDI
      • Attempting to acquire particular metadata
    • Metadata Aspects for GovStat
      • Conceptual Tasks
        • Determining elements and attributes to be used in wrapping data and contextual info (an XML DTD presumably)
          • User study et al. to determine appropriate content
          • “Thought” experiments with implementations related to elements, attributes, and their values
        • Developing conceptual metadata model for SKN
      • Practical Tasks
        • Finding the actual metadata content to be “wrapped” via the elements
        • Finding data with metadata to port into tools
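
As a rough illustration of what “wrapping” data and contextual info in elements and attributes might look like, the sketch below builds a small XML fragment in Python. The element and attribute names (`wrappedTable`, `context`, `source`, `definition`, `cell`), the agency label, and the sample value are all hypothetical placeholders; none of them come from DDI, ISO11179, or any agreed DTD.

```python
import xml.etree.ElementTree as ET


def wrap_value(value, variable, agency, definition):
    """Wrap one data value together with contextual metadata.

    Every element and attribute name used here is a hypothetical
    placeholder, not part of any existing standard.
    """
    root = ET.Element("wrappedTable")

    # Contextual info travels alongside the data it describes.
    context = ET.SubElement(root, "context")
    ET.SubElement(context, "source", {"agency": agency})
    ET.SubElement(context, "definition", {"term": variable}).text = definition

    # The data value itself, tagged with the variable it measures.
    cell = ET.SubElement(root, "cell", {"variable": variable})
    cell.text = str(value)
    return root


doc = wrap_value(
    value=101.5,  # made-up number, for illustration only
    variable="gasoline price index",
    agency="EXAMPLE-AGENCY",
    definition="placeholder definition text",
)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

The point of the sketch is only that each choice of element versus attribute is a design decision the user study would need to inform; a real DTD would likely look quite different.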
    • Today’s Presentation
      • Focus on the Conceptual Tasks
        • Status report on potentially relevant standards and projects
        • Considering the user tools and the public intermediary
      • Start strategizing on directions to pursue further
    • Concept. Task 1: Identifying Elements, Attributes, and Values
      • Current Contenders for Elements, Attributes (and some values)
        • DDI (and its implementations)
        • ISO11179 (and its implementations)
        • Hybrids
          • Corporate Metadata Repository (CMR) from Oracle
          • Data cubes for Tables from NESSTAR, DDI
    • DDI
      • Data set is the basic element
      • Data archives perspective: designed primarily for people who archive data sets and those who will retrieve and reuse those data sets
      • Does capture information on variables, values, etc.
      • Still actively working on specifications for tables (see Ryssevik memo 3/6/2003)
    • DDI Issues
        • Doesn’t have a good mechanism for relating surveys and instances of those surveys; each data set is treated as stand-alone
        • Hard to compare across variables and time-series
        • Elements for tables are still in development, and other data presentations (such as news releases and graphics) are not well developed
        • Currently working backwards to a conceptual model for the metadata
    • DDI Implementations of Note
        • Counting California
        • Virtual Data Center (Harvard/MIT)
          • Developed CRISTAL datacubes and FasterCubes
        • Minnesota Population Center
          • Developed WendyCubes for data cubes
          • WendyCubes and FasterCubes being merged
        • DataFerrett (Census)
    • ISO11179
      • Takes the data producers’ perspective (Dan argues that it doesn’t take any perspective)
      • Able to relate survey instances, etc.
      • Isn’t capable of handling the full range of metadata we might need, nor can it handle data representations such as news releases, webpages, etc. (same problem with DDI)
    • ISO11179 Implementations
      • StatCanada
        • Dan G. has reservations about this implementation and feels it doesn’t meet the standard (more to come as I understand the problem better)
    • Is CMR the answer?
      • CMR as a registry that describes data, data processes, and data quality, and that links to datasets and data
      • CMR incorporates all of ISO11179 and DDI, and in addition can support a variety of other metadata types (such as those news releases)
      • CMR not open source, cost unknown (software cost and Oracle consultants)
      • Two good contacts for us
        • Dan has acquired it for BLS
        • Sarah Nusser is acquiring it for Iowa State
    • Segue to Conceptual Task 2
      • My original goal was to determine what metadata elements would be necessary for a given end-user tool (e.g. the SIG) and determine which standard(s) could provide necessary functionality (enabling metadata to get from agencies to the user tools)
      • I started by looking at the SIG and also at DDI implementations to see what functionalities we could acquire
    • The Plot Thickens
      • Two new questions emerged from these activities
        • What functions/information (data & metadata) would be necessary in the SKN?
        • What other standards efforts should be considered in creating the SKN?
    • The SKN Architecture
      • Standards, projects, and their functions, by component
        • Internal to agencies, agency data production: CMR; proprietary metadata repositories; presentation formats (HTML, XML, PDF, etc.); database formats (ACCESS, ALMIS); DDI datacubes; NESSTAR/Faster CRISTAL; XML for Analysis; Common Warehouse Metadata Model; statistical disclosure (SDC in Nesstar); StatCan ISO implementation
        • Internal to agencies, data archives: DDI (and DDI for datacubes); NESSTAR/Faster CRISTAL
        • Public intermediary: middleware (whatever that includes); NEOOM from Nesstar/Faster; from Virtual Data Center (VDC): federated metadata harvesting, repository exchange and caching, federated authentication and authorization, naming; searching: Z39.50
        • Possible SKN user tools/functions: data analysis, bookmarking, downloading datasets (Nesstar); cataloging, archiving functions (VDC); online search, data conversion, exploration, data analysis (VDC); glossary (The Neuchatel Group); Statistical Interactive Glossary (SIG, our project); ontologies (ISI/Columbia for gas); relation browsers; online help
        • Transfers: Z39.50 (used by VDC); Open Archives (VDC); DC, MARC, DDI metadata import and export (VDC); SOAP; HTTP; RDF (Nesstar); ASN.1
    • New Strategic Direction for Us?
      • Specification of metadata necessary throughout SKN?
        • Will require specification of interactions among components of SKN
        • And perhaps the specification of specific standards
    • An example of a possible interaction
      • User, via the interface: “I want data on gasoline price indices in the state of MD”
      • Query transferred to intermediary.
      • Intermediary query agent has a business rule requiring a check of terms, so it forwards the term “indices” to the SIG
    • Example continued
      • SIG responds with 3 definitions of index (differing in specificity) and multiple display options
      • Intermediary business rule says to take the most general definition and to use the term “index” in queries sent to agency data sources
      • Etc.
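
The interaction in this example can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a proposed design: the `Definition` class, the `sig_lookup` stand-in for the SIG, the numeric `specificity` field, and the placeholder definition texts are all hypothetical, and the query is "rewritten" by simple string substitution.

```python
from dataclasses import dataclass


@dataclass
class Definition:
    term: str         # normalized form of the term
    text: str         # definition text (placeholders here)
    specificity: int  # 0 = most general; larger = more specific


def sig_lookup(term):
    """Hypothetical stand-in for the SIG: candidate definitions for a term."""
    glossary = {
        "indices": [
            Definition("index", "placeholder: general definition", 0),
            Definition("index", "placeholder: intermediate definition", 1),
            Definition("index", "placeholder: most specific definition", 2),
        ]
    }
    return glossary.get(term, [])


def choose_most_general(definitions):
    """Business rule from the example: take the most general definition."""
    return min(definitions, key=lambda d: d.specificity)


def intermediary_rewrite(query, terms_to_check):
    """Check flagged terms against the SIG and normalize them in the query."""
    rewritten = query
    for term in terms_to_check:
        definitions = sig_lookup(term)
        if definitions:
            chosen = choose_most_general(definitions)
            rewritten = rewritten.replace(term, chosen.term)
    return rewritten


user_query = "I want data on gasoline price indices in the state of MD"
agency_query = intermediary_rewrite(user_query, ["indices"])
print(agency_query)
```

Even this toy version shows what would have to be specified: what the intermediary may ask the SIG, what the SIG may return, and which rule decides among several returned definitions.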
    • New Strategic Direction for Us?
      • Specification of functions (and related information) necessary throughout SKN?
        • Will require specification of interactions among components of SKN (possible queries, acceptable responses, bindings among agents, etc.)
        • And perhaps the specification of specific standards
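
One way to picture a specification of “possible queries, acceptable responses, bindings among agents” is as a small declarative table that components can be checked against. The sketch below is only a thought experiment: the component names (`user_interface`, `intermediary`, `SIG`, `agency_source`) and the query and response type names are hypothetical labels loosely modeled on the gasoline example, not part of any standard.

```python
# Hypothetical interaction specification. Each key is a binding
# (sender, receiver); each value maps an allowed query type to the
# set of response types that are acceptable for it.
INTERACTIONS = {
    ("user_interface", "intermediary"): {
        "data_request": {"data_response", "clarification_request"},
    },
    ("intermediary", "SIG"): {
        "term_check": {"definitions"},
    },
    ("intermediary", "agency_source"): {
        "data_request": {"data_response", "no_data"},
    },
}


def is_allowed_query(sender, receiver, query_type):
    """True if the spec lets `sender` send `query_type` to `receiver`."""
    return query_type in INTERACTIONS.get((sender, receiver), {})


def is_acceptable_response(sender, receiver, query_type, response_type):
    """True if `response_type` is a valid reply to that query on that binding."""
    allowed = INTERACTIONS.get((sender, receiver), {}).get(query_type, set())
    return response_type in allowed


print(is_allowed_query("intermediary", "SIG", "term_check"))
print(is_acceptable_response("intermediary", "SIG", "term_check", "data_response"))
```

Writing the table down first, before choosing specific standards, would make it easier to see which bindings an existing standard (for example Z39.50 for searching) could cover and which would need something new.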