Principle Violations: Revisiting the Dublin Core 1:1 Principle


Presented at the 73rd Annual Conference, American Society for Information Science & Technology, Pittsburgh, PA

Richard J. Urban

The Problem

Although Dublin Core (DC) metadata emerged from the need to describe "document-like objects" on the World Wide Web in the mid-1990s, libraries, archives and museums soon adopted it to share information about hidden cultural heritage collections. In response to concerns from this community about distinguishing between records describing "originals" and records describing "reproductions," DCMI introduced the 1:1 Principle: "each resource should have a discrete metadata description and each description should include elements describing a single resource" (Weibel and Hakala, 1998).

However, metadata creators indicate that the 1:1 Principle causes "a great deal of confusion" in practice (Park & Childress, 2009). Even when the Principle is understood, software for metadata creation lacks affordances for creating compliant records (Miller, 2010). Studies find that records frequently describe both physical and digital resources, and that such records are "particularly problematic" in large-scale metadata aggregations (Shreeves et al., 2005; Han et al., 2009; Hutt & Riley, 2005).

The 1:1 Principle

In general, Dublin Core metadata describes one manifestation or version of a resource, rather than assuming that manifestations stand in for one another. For instance, a jpeg image of the Mona Lisa has much in common with the original painting, but it is not the same as the painting. As such, the digital image should be described as itself, most likely with the creator of the digital image included as a Creator or Contributor rather than just the painter of the original Mona Lisa. The relationship between the metadata for the original and the reproduction is part of the metadata description, and assists the user in determining whether his/her need can be met by a reproduction (Hillmann, 2003).

Pilot Study
Multiple accounts of the Principle, such as the description provided by Using Dublin Core (above), contribute to confusion about what the Principle means. While these accounts of the 1:1 Principle may provide guidance for metadata creators, additional rules are needed to understand how particular records "violate" the Principle. This pilot study explores techniques to identify records that describe different classes of resources.

Data Collection

IMLS Digital Collections & Content Project
25 collections
55,000 item-level OAI-PMH XML records

Data Analysis

Using the SIMILE Gadget XML data explorer, overviews of Dublin Core properties and the frequency of unique values were generated for each collection. Each statement was assigned to a class of resources:

PhysicalResources: resources described by format values for physical media and extents.
DigitalResources: resources described by format values for file formats and extents.

Using the statement classifications, each collection was classified according to three categories:

Non-violating collections: records conformed to the 1:1 Principle.
Violating collections: records included statements about both physical and digital resources.
Non-violating violations: records described physical resources, but identified digital resources.

Results (n=25)

[Figure: results chart with the labels "Digital Resource" and "Physical Resource"; values not recoverable]

Acknowledgments

Portions of this research were supported by a 2007 IMLS National Leadership Research and Demonstration Grant (LG-06-07-0020-07) hosted by the GSLIS Center for Informatics Research in Science and Scholarship (CIRSS), Dr. Carole L. Palmer, Principal Investigator.

What is the 1:1 Principle, really?
Ongoing Research

n:1 Principle, DCAM & OAI-PMH XML

Although the Dublin Core Abstract Model (DCAM) embodies the 1:1 Principle and may help prevent errors, it does not directly help identify violations in legacy OAI-PMH XML that may include implicit description sets. Nor do DCAM's generalized resources ("anything that can be identified") help systematically recognize records that describe more than one resource.

1:1 Principle & Bibliographic Relationships

If the concern of cultural heritage institutions is about "originals," "reproductions" or "surrogates," different kinds of bibliographic relationships need to be considered. For example, the museum community would not classify the relationship between a jpeg and the Mona Lisa as an Equivalence Relationship that involves related FRBR Manifestations. Rather, surrogate resources may stand in Derivative or Descriptive relationships involving FRBR Expressions or FRBR Works. Unfortunately, "the problem of defining reproductions in relationship to originals has proven elusive through all of the cataloging codes of the 20th Century" (Knowlton, 2009).

Ongoing work will provide a conceptual definition of a 1:1 Principle that reflects the concerns of cultural heritage repositories and is grounded in contemporary theories of the bibliographic universe.

Identifying 1:1 Principle Violations

A conceptual definition will inform the development of rules and techniques that identify records that violate the 1:1 Principle. Ongoing work will adapt the Getty Art & Architecture Thesaurus to identify distinct manifestation classes. Additional violation categories based on other relationships or FRBR Group 1 Entities will also be explored (e.g., is it possible to identify DC records that describe more than one FRBR Expression or FRBR Work?). Violation identification techniques will be applied to 148,000 item-level OAI-PMH records from the IMLS DCC Opening History aggregation in order to identify patterns of 1:1 Principle violations.
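The format-based classification described under Data Analysis could be approximated with a few lines of Python. What follows is a minimal sketch under stated assumptions, not the procedure used in the pilot study: the token lists, the function names, the example record, and the rule that an http identifier signals a digital resource are all hypothetical, and the sketch classifies individual records rather than whole collections.

```python
import re
import xml.etree.ElementTree as ET

DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical cue tokens; a real study would draw on a controlled vocabulary.
DIGITAL_TOKENS = {"image/jpeg", "image/tiff", "jpeg", "jpg", "tiff", "gif",
                  "pdf", "bytes", "kb", "mb", "pixels"}
PHYSICAL_TOKENS = {"canvas", "photograph", "negative", "print", "paper",
                   "glass", "cm", "inches"}


def classify_statement(value):
    """Assign one dc:format value to a resource class, or None if unclear."""
    tokens = set(re.findall(r"[a-z0-9/]+", value.lower()))
    is_digital = bool(tokens & DIGITAL_TOKENS)
    is_physical = bool(tokens & PHYSICAL_TOKENS)
    if is_digital and not is_physical:
        return "DigitalResource"
    if is_physical and not is_digital:
        return "PhysicalResource"
    return None  # no cue found, or cues for both classes in one value


def classify_record(record_xml):
    """Place one oai_dc record into one of the three categories."""
    root = ET.fromstring(record_xml)
    classes = {classify_statement(el.text or "")
               for el in root.iter(DC_NS + "format")}
    classes.discard(None)
    # Assumption: an identifier that is a web address identifies a digital resource.
    has_url_identifier = any(
        (el.text or "").strip().lower().startswith(("http://", "https://"))
        for el in root.iter(DC_NS + "identifier")
    )
    if classes == {"PhysicalResource", "DigitalResource"}:
        return "violating"
    if classes == {"PhysicalResource"} and has_url_identifier:
        return "non-violating violation"
    return "non-violating"


# Hypothetical record that mixes statements about a painting and its jpeg.
MIXED_RECORD = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example painting</dc:title>
  <dc:format>oil on canvas, 50 cm x 40 cm</dc:format>
  <dc:format>image/jpeg</dc:format>
  <dc:identifier>http://example.org/items/1.jpg</dc:identifier>
</oai_dc:dc>"""

print(classify_record(MIXED_RECORD))  # prints: violating
```

Matching on whole tokens rather than substrings avoids false hits such as "kb" inside an unrelated word. A thesaurus-backed lookup of the kind proposed above could replace the two token sets without changing the rest of the sketch.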
Bibliography

Hillmann, D. (2003, August 26). Using Dublin Core. Dublin Core Metadata Initiative. Retrieved from

Hutt, A., & Riley, J. (2005). Semantics and syntax of Dublin Core usage in Open Archives Initiative data providers of cultural heritage materials. In Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (p. 270).

Knowlton, S. A. (2009). How the current draft of RDA addresses the cataloging of reproductions, facsimiles, and microforms. Library Resources and Technical Services, 53(3), 159–165.

Miller, S. (2010). The One-To-One Principle: Challenges in Current Practice. International Conference on Dublin Core and Metadata Applications. Retrieved October 23, 2010, from

Park, J., & Childress, E. (2009). Dublin Core metadata semantics: An analysis of the perspectives of information professionals. Journal of Information Science, XX(X), 1-13.

Powell, A., Nilsson, M., Naeve, A., Johnston, P., & Baker, T. (2007). DCMI Abstract Model. Dublin Core Metadata Initiative. Retrieved from documents/abstract-model/

Shreeves, S. L., Knutson, E. M., Stvilia, B., Palmer, C. L., Twidale, M. B., & Cole, T. W. (2005). Is "Quality" Metadata "Shareable" Metadata? The Implications of Local Metadata Practices for Federated Collections. In Currents and convergence: Navigating the rivers of change: Proceedings of the Twelfth National Conference of the Association of College and Research Libraries, April 7-10, 2005, Minneapolis, Minnesota (p. 223).

Tillett, B. (2001). Bibliographic Relationships. In C. Bean & R. Green (Eds.), Relationships in the organization of knowledge. Boston: Kluwer Academic Publishers.

Weibel, S., & Hakala, J. (1998, February). DC-5: The Helsinki Metadata Workshop; A Report on the Workshop and Subsequent Developments. D-Lib Magazine. Retrieved from