Weibel tsukuba-colloquium-6-up-2011-05-13



  1. Twenty Years of Metadata: Lessons from the First Two Decades of the Web
     Stuart Weibel, University of Tsukuba Visiting Scholar, May 13, 2011
     Image: Carved figures (Morikawa Toen), Tokyo National Museum

     Outline
       • The Context
       • Dublin Core in the Metadata Matrix
       • What we did right
       • The major impediments
       • A few words about models
       • What about the future?

     The Context
     When I started working at OCLC in 1985:
       • I was 4 years away from my first email address
       • A PC hard drive wasn’t large enough to store a single high resolution digital image (which was ok, because…)
       • Cameras still used film
       • Cell phones were suitcase-sized
       • MARC Cataloging stood alone as the discovery tool for intellectual assets of libraries, but…
       • No end-user access to the global library catalogs
     Image: me… circa 1994

     And now?
       • A cell phone has more computing power than the Space Shuttle
       • An iPod will hold WorldCat
       • Bandwidth is more important than computing power
       • The library is still mostly mired in MARC
       • There are many metadata standards (mostly struggling for traction)
       • People (mostly) find things with Google

     Metadata is more than just search
     Metadata-dependent actions:
       • Describe
       • Access
       • Encode/Render
       • Preserve
       • Rights Management
       • Administer
       • “Bind” digital pages in digital books

     50 years of Metadata
     Timeline (axis: 1960s, 1970s, 1980s, 1990s, 2000s):
       • MARC standards (library metadata)
       • OCLC founded (shared library cataloging)
       • ARPANET Operational - forerunner of the Internet
       • Networking diffuses throughout academia
       • The Web begins...
       • FRBR work begins
       • First Dublin Core Workshop
       • DCMI established
       • Google is founded
       • First Dublin Core Conference (Tokyo)
       • my first email address
       • WorldCat introduced
       • RDA introduced
  2. The confusion: Jenn Riley’s Metadata Map
     “This visual map of the metadata landscape is intended to assist planners with the selection and implementation of metadata standards.”
     http://www.dlib.indiana.edu/~jenlrile/metadatamap/

     How bad is it?
       • 105 standards
       • 30 most common across the top (3 predate the Web)
       • some share common models… most do not
       • much overlap
       • many work together
     Who among us can choose rationally from the array of standards, platforms, technologies?
     Will the results have any reasonable expectation of interoperability?

     The map is standards-centric
     “This visual map of the metadata landscape is intended to assist planners with the selection and implementation of metadata standards.”
     “selection and implementation of metadata standards requires a clear understanding of the information entities, the standards, and the functional requirements of the system under design”

     The real world is not: much more complicated
     Metadata-dependent actions and their standards:
       • Describe: MARC, DC, MODS, RDA, LCSH, MeSH….
       • Access: HTTP, FTP….
       • Encode/render: RDF, media-type dependent (many)
       • Preserve: PREMIS
       • Rights Management: CC licenses, eCommerce systems
       • Administer: METS, MARC….
       • “Bind” digital pages in digital books: METS, eBook standards
     Information Entities (ex.):
       • Agents (persons, corporate entities, devices)
       • Events
       • Time intervals or eras
       • Concepts
       • Collections
       • Media-types
       • Structured data type
     Image: Kyoto horizon from above the Tenryu-ji Temple

     Dublin Core in the metadata matrix
       • The first metadata standard for the Web
       • General and cross-disciplinary
       • Simple starting place, but extensible
       • International and multilingual
       • Consensus-driven (bottom-up, rather than top-down)
     Image: Jomon Pottery, Tokyo National Museum

     Things we did right
       • We didn’t call it ‘cataloging’ (Web, not libraries)
       • A hybrid of technical engineering and social engineering
       • International - Major events on 5 continents, element definitions in 20+ languages (maintained in Tsukuba)
       • Separated syntax and semantics
       • Built a community of practice
       • About the right level of complexity for a core element set
     Image: Harajuku train station platform, Tokyo
  3. Impediments that tripped us up
       • Too many syntaxes to support (HTML, XML, RDF-XML)
       • No common data model, but we tried hard: data model group, architecture group, abstract model, Singapore Framework...
       • Without a data model, the story we told was not consistent: confusion resulted
       • Without a data model, details of implementation become arbitrary (and less interoperable)
     Image: Netsuke, Tokyo National Museum

     Data Modeling: what is it?
       • Entity-relationship model defines the important concepts or things (entities), and the relationships among them
       • A model is a model, not reality
       • Designed to solve a problem, not to emulate the real world
       • The complexity of the model should be mapped to the problem, not to reality
       • Identifying the right level of abstraction is an art
     Image: Edo Museum

     Data Modeling: why is it necessary?
       • Without a shared understanding of the important entities, and the relationships among them, systems will not interoperate easily
       • Cross-walks become necessary: clumsy, inaccurate, inefficient
     Image: Changing rail car ‘bogies’ on the China/Mongolia border

     An example of modeling mismatch
     Citation information:
       • Date
       • Title
       • Author
       • Affiliation
       • Email address
     - Which of the attributes are Dublin Core?
     - Is “email address” an attribute of the resource, or the person?
     - Should there be a distinction between Title and Subtitle?

     Is Dublin Core well-matched to the problem of bibliographic description?
       • It is too simple to capture the precision of detailed bibliographic description
       • BUT… It is good enough for many purposes, including the description of most simple internet resources
       • The trade-off between perfect matching of model and problem, and simplicity of use is always a compromise
       • DC was intended for general resource description, not to replace MARC

     The problem with models
       • Matching the complexity of models to a diverse and evolving problem is challenging, and full of compromises
       • too much complexity leads to failure (creeping elegance)
       • too little complexity leads to failure (insufficient richness to solve the problem)
     HOW DO YOU KNOW WHEN IT IS RIGHT?
     Image: figures from a model in the Kyushu National Museum
  4. Conceptual Models in the Library World
       • FRBR and FRAD: The dominant models for bibliographic and authority data
       • OAIS: Reference model for Open Archival Information Systems
       • CIDOC CRM: Conceptual Reference Model for cultural heritage documentation
       • Dublin Core Abstract Model: Largely unintelligible data model for Dublin Core instance data
       • Singapore Framework: A vague framework describing levels of metadata interoperability
     Image: Stone Monk in the Nezu Museum Garden

     The Next Chapters in the Web Metadata story...
     ...are being written in the W3C Incubator Group on Library Linked Data (http://www.w3.org/2005/Incubator/lld/)
     Many questions:
       • Will the data be open?
       • Who will maintain it?
       • Is semantic web infrastructure stable?
       • Can existing metadata be integrated seamlessly into the web?
       • Can a model be agreed upon?
       • Will we ever have interoperability across domain silos?

     stuart.weibel@gmail.com
     http://weibel-lines.typepad.com
     @stuartweibel on twitter
     stuartweibel on Facebook
     all photographs by the author
     Image: Lantern overlooking the Irises in the Nezu Museum Garden
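
The modeling-mismatch question raised in the slides (is “email address” an attribute of the resource, or the person?) can be made concrete with a small sketch. This is an illustration only, not anything shown in the talk: the class names `Person` and `Resource`, the helper functions, and all sample values are hypothetical. It assumes an entity-relationship reading in which affiliation and email belong to the person, and a flat target record that uses only the Dublin Core element names title, creator and date.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Person:
    """Hypothetical agent entity: contact details describe the person."""
    name: str
    affiliation: str = ""
    email: str = ""


@dataclass
class Resource:
    """Hypothetical resource entity: creators are related entities, not strings."""
    title: str
    date: str
    creators: List[Person] = field(default_factory=list)


def from_flat(record: Dict[str, str]) -> Resource:
    """Move the author-related fields of a flat citation onto a Person entity."""
    author = Person(
        name=record["author"],
        affiliation=record.get("affiliation", ""),
        email=record.get("email", ""),
    )
    return Resource(title=record["title"], date=record["date"], creators=[author])


def to_flat_dc(resource: Resource) -> Dict[str, List[str]]:
    """Cross-walk to a flat title/creator/date record; person details are dropped."""
    return {
        "title": [resource.title],
        "creator": [person.name for person in resource.creators],
        "date": [resource.date],
    }


# Placeholder citation, invented for illustration only.
flat_citation = {
    "title": "Example Report",
    "date": "2011",
    "author": "A. Author",
    "affiliation": "Example University",
    "email": "author@example.org",
}

resource = from_flat(flat_citation)
flat_dc = to_flat_dc(resource)
```

In the entity model the email address hangs off the person, so it survives however many resources that person created. The cross-walk back to a flat record keeps only the creator's name, which is one way to see why the slides call cross-walks inaccurate.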