Trends in Cataloging & Metadata


Free Webinar for UW-Madison School of Library & Information Studies, Continuing Education, March 14, 2013

Published in: Education, Technology
  • Just want to start off by saying that I’m as confused as all of you. Only reason why I’m here and you’re there is because I’ve had a pretty good spot to survey all this stuff from – library school instructor, teaching cataloging, metadata – go to professional conferences – was an internet cataloger in the 1990s, OCLC CORC user group prez; taught integrating resource cataloging for LoC; currently co-chair ALA LITA ALCTS linked data IG, and ALCTS FRBR IG
  • The way I’m going to structure this is to talk about [BASH] some of what I think are significant things that have been happening in cataloging for the last 12 - 15 years. First, late 1990s – push to get Internet resources into library catalogs. Next – early 2000s (Atlanta, GA, June 13-19, 2002) – FRBR became the buzz in cataloging – at about the same time – RDA – although, until I think about 2003 or later, RDA was AACR3 – and it was in 2007 when the decision was made by the JSC to organize RDA according to FRBR – mistake IMHO – more on that in a min. Finally, kind of as a result of RDA – in May of 2011 the big news in the library world was the Library of Congress' (LoC) announcement that it was transitioning away from the MARC format for bibliographic data. In the summer of 2012, LoC hired Zepheira to investigate the possibilities of linked data as the carrier for library descriptive information. In November of 2012, Zepheira and LoC published a report, Bibliographic Framework as a Web of Data: Linked Data Model and Supporting Services.
  • OK so – #1 Getting internet resources into the catalog – we thought it would help, right? If someone was searching the catalog using subject headings like motion pictures, why shouldn’t they get IMDB along with other materials related to film? And, I don’t mean to be too superficial about this – another big reason for adapting our rules and trying to get Internet resources into the catalog was NOT for free resources like IMDB – as sort of alluded to by the subject headings here – a lot of reference resources that used to be books have evolved into online dbs – and libraries pay a lot of $$ for these things – so of course they should be in the catalog!
  • But going back to the IMDB example, in reality, this is more like what happens now – even if there’s a record for IMDB in the catalog, a user isn’t going to get to the entry in IMDB for the specific movie they want. And the missed opportunity here is the link to the copy of the movie at the local library that can be checked out – Amazon’s taking advantage of it – but no links to libraries show up in my first page of hits – and I bet a lot of you have started seeing that right-hand box on the side in Google results – that’s HTML5 microdata – a way to make HTML smarter – in OCLC WorldCat now
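To make the "smarter HTML" idea a bit more concrete, here is a minimal sketch in Python of how a machine might pull microdata out of a page. The HTML snippet and the `MicrodataParser` class are made up for illustration; only the `itemscope`, `itemtype`, and `itemprop` attributes are real HTML5 microdata. It handles flat, non-nested markup only, so treat it as a sketch rather than a real microdata parser.

```python
from html.parser import HTMLParser

# Hypothetical snippet: a catalog page for a DVD marked up with microdata.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">The Town</h1>
  <span itemprop="datePublished">2010</span>
  <span>Available at your local library</span>
</div>
"""


class MicrodataParser(HTMLParser):
    """Collects itemprop name/value pairs from flat (non-nested) microdata."""

    def __init__(self):
        super().__init__()
        self.item_type = None      # value of itemtype on the itemscope element
        self.properties = {}       # itemprop name -> text content
        self._current_prop = None  # itemprop whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        if "itemprop" in attrs:
            self._current_prop = attrs["itemprop"]

    def handle_data(self, data):
        # Only keep text that sits directly inside an itemprop element.
        if self._current_prop and data.strip():
            self.properties[self._current_prop] = data.strip()
            self._current_prop = None

    def handle_endtag(self, tag):
        self._current_prop = None


def extract_microdata(html):
    """Return (item type, {property: value}) for the given HTML string."""
    parser = MicrodataParser()
    parser.feed(html)
    return parser.item_type, parser.properties
```

The point of the sketch: the plain `<span>` about the local library is invisible to the machine, while the two labeled strings come out as a title and a date.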
  • #2 – FRBR – FRBR is an information model that divides the bibliographic universe into entities that have attributes, or characteristics, and that have relationships between them – the big 4 are the Group 1 entities – work, expression, manifestation, item (WEMI) – but there are also Group 2 – responsible entities – persons, corporate bodies, families – and Group 3 – subjects – concept, object, event, place – and Groups 1 & 2 can also be subjects
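For anyone who thinks better in data structures, the entity-relationship idea can be sketched in Python like this. The class and field names are my own shorthand, not official FRBR attribute names, and the sample titles, publishers, and barcodes are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:            # Group 2: a responsible entity
    name: str


@dataclass
class Concept:           # Group 3: a subject
    label: str


@dataclass
class Item:              # Group 1: a single copy
    barcode: str


@dataclass
class Manifestation:     # Group 1: a particular publication of an expression
    publisher: str
    carrier: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:        # Group 1: a particular realization, e.g. a text
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:              # Group 1: the abstract intellectual creation
    title: str
    creators: List[Person] = field(default_factory=list)
    # Any entity can be a subject, so this list is deliberately untyped.
    subjects: List[object] = field(default_factory=list)
    expressions: List[Expression] = field(default_factory=list)


def count_items(work):
    """Walk Work -> Expression -> Manifestation -> Item and count copies."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


# Invented example: one work, one expression, two packages, three copies.
paperback = Manifestation("Macmillan", "paperback", [Item("0001"), Item("0002")])
hardcover = Manifestation("Example Subsidiary Press", "hardcover", [Item("0003")])
english_text = Expression("eng", [paperback, hardcover])
novel = Work("An Example Novel", [Person("A. Author")],
             [Concept("Bank robberies")], [english_text])
```

Note how much nesting it takes just to get from the work down to a copy on the shelf; that nesting is exactly what a cataloger is asked to keep straight.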
  • No surprise to you, but – library cataloging has traditionally focused on describing the carrier of intellectual content – the package – is it the paperback that’s this tall, with this many pages, published by Macmillan, or is it the hardcover – with the exact same text, but a foreword by someone famous, different pagination, different height, different publisher – that’s probably a subsidiary of Macmillan. Promise of FRBR is that it can kind of unlock the data in bibliographic records – and show the relationships between all these different packages of similar content – bringing out the differences when they’re important, when someone needs the large print, for example, but not when they’re not, when people just want stuff. What we’re looking at here is OCLC’s FRBRized WorldCat – nicely brings together 858 editions & formats of the novel – 190 of the movie. Question is – in FRBR terms, is the movie an expression or a related work? And – even bigger question – does it matter? OCLC admits that they played a little fast & loose with pure FRBR – expression just didn’t work with the bibliographic data in WorldCat, they couldn’t reliably divide a work into expressions – so they left it out – that’s why we have this display with the novel & movie separate – close together because they have the same title
  • But before we start criticizing OCLC for sloppy research, we have to ask ourselves – does it matter, when you are creating a description of an information resource, to determine if that resource is a work, expression, manifestation, or item, and at which of these levels the attributes, or characteristics, are attached? Here’s an example from Karen Coyle – she says when you make a cake, you have your ingredients
  • When you mix them, you don’t end up with a hierarchical structure like this -
  • You get this – she says “My point here, in case it isn't clear, is that the purpose of creating a bibliographic description using a number of different entities is to... well, to create a bibliographic description; something that as a whole has meaning. You can create it from individual "ingredients," like information about a Work and an Expression, but those do not need to remain separate entities in your final product; instead, that information can become part of your whole.”
  • Third – RDA – the process to develop RDA has been extremely long and tortuous – started in about 2003 – tho some point to 1997 and the “fundamental problem” of AACR – rules organized into chapters by format, and modern materials have characteristics covered in more than one chapter – e.g. ebooks, websites.
    At the October 2007 meeting, the JSC agreed on a new organization for RDA: FRBR. A full draft of RDA was issued in November 2008. JSC discussed the responses to the full draft at its meeting in April 2009 and the revised text was delivered to the publishers in June 2009. RDA was published in the RDA Toolkit in June 2010, followed by testing by the big three – LoC, NLM, NAL – plus other partners – SLIS was a partner, part of a group of library schools.
    Rather than chapters by format – 37 chapters in 10 sections –
    Recording attributes
      Section 1. Recording attributes of manifestation and item
      Section 2. Recording attributes of work and expression
      Section 3. Recording attributes of person, family, and corporate body
      Section 4. Recording attributes of concept, object, event, and place
    Recording relationships
      Section 5. Recording primary relationships between work, expression, manifestation, and item
      Section 6. Recording relationships to persons, families, and corporate bodies
      Section 7. Recording relationships to concepts, objects, events, and places associated with a work
      Section 8. Recording relationships between works, expressions, manifestations, and items
      Section 9. Recording relationships between persons, families, and corporate bodies
      Section 10. Recording relationships between concepts, objects, events, and places
    Some of us who looked at 1st edition RDA – same AACR rules, munged up according to FRBR – or perhaps more politely, same rules stated over and over to apply to different FRBR entities.
    Quote is from the 8 pg. exec. summary of the US RDA testing committee report – overall reaction can best be described as “meh” – read highlights –
  • And, as the report says – not clear if RDA fulfills simple FRBR user tasks – IMHO, like OCLC, RDA has to kludge up pure FRBR – and spends too much time forcing catalogers to decide if an attribute (called RDA core elements) – like title – applies to W, E, M, or I – and also introduces some elements that do not exist in pure FRBR – work identifier, form of work, other distinguishing characteristics of work – OCLC used author/title pairs to arrive at works
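The author/title idea is simple enough to sketch in a few lines of Python. This is only my rough illustration of grouping records by a normalized author/title key; it is not OCLC's actual algorithm, and the records, field names, and the `normalize` rules are all invented for the example.

```python
import re
import unicodedata
from collections import defaultdict

# Invented records: two editions of a novel and a DVD with the same title.
RECORDS = [
    {"id": "r1", "author": "Author, Ann.", "title": "An Example Novel"},
    {"id": "r2", "author": "AUTHOR, ANN", "title": "An example novel."},
    {"id": "r3", "author": "", "title": "An Example Novel"},
]


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def work_key(record):
    """The author/title pair that stands in for a FRBR work."""
    return (normalize(record["author"]), normalize(record["title"]))


def cluster_into_works(records):
    """Group record ids under their shared author/title key."""
    works = defaultdict(list)
    for record in records:
        works[work_key(record)].append(record["id"])
    return dict(works)
```

With these sample records the two editions fall into one cluster, while the DVD, which has no author entry, lands in its own cluster under the same title. That mirrors the novel and movie showing up separately but side by side in the FRBRized display.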
  • This is from the FRBR FAQ on the IFLA website – Int’l Federation of Library Associations, publisher of FRBR – insert cataloger whine here – but it also points to my essential gripe – figuring out the whole W, E, M, I thing does not make find, select, identify, obtain more possible in many cases –
  • The other thing that came out in the testing is that RDA & MARC are not a good fit – I’m arguing that the brain-breaking mental anguish of applying the FRBR model to bibliographic data is not worth it in many cases – but even more so, if the library world is to realize any benefits of FRBR & RDA – cramming our data back into MARC is not the way to get there. I mean – really – do we think what we’re seeing here – a new book cataloged according to RDA with the extra 3XX fields for RDA carrier – is really any better than an AACR MARC record?? OCLC is not forcing anyone to implement – OCLC RDA policy page – effective 3/31/2013 – same day as Easter & Game of Thrones – “OCLC member libraries may contribute new unique records to WorldCat formulated according to any cataloging code they are currently using. OCLC will not require libraries to use RDA. Libraries may switch to RDA for original cataloging on their own timetable, if they choose to switch at all. Libraries may continue to contribute records to WorldCat formulated according to AACR2 if they wish to do so.”
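For those who haven't seen the extra 3XX fields yet, here is a rough Python sketch of what gets bolted onto a record. The list-of-tuples record layout and the `with_rda_fields` helper are my own simplification, not any real MARC library, and the sample record is invented. The 336/337/338 values shown are the usual content, media, and carrier terms for a printed book.

```python
# A record here is just a list of (tag, {subfield code: value}) pairs.
AACR_STYLE_RECORD = [
    ("245", {"a": "An example novel"}),
    ("300", {"a": "321 p."}),
    ("650", {"a": "Bank robberies"}),
]

# Content, media, and carrier type for a printed book.
RDA_33X_PRINT_BOOK = [
    ("336", {"a": "text", "b": "txt", "2": "rdacontent"}),
    ("337", {"a": "unmediated", "b": "n", "2": "rdamedia"}),
    ("338", {"a": "volume", "b": "nc", "2": "rdacarrier"}),
]


def with_rda_fields(fields):
    """Return a new field list with the 33X fields added, in tag order."""
    merged = [f for f in fields if f[0] not in {"336", "337", "338"}]
    merged.extend(RDA_33X_PRINT_BOOK)
    return sorted(merged, key=lambda f: f[0])


def tags(fields):
    """List just the tags, handy for eyeballing a record."""
    return [tag for tag, _ in fields]
```

Three new fields to say "it's a book": that is the sort of thing behind the question of whether the RDA record in MARC is really any better.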
  • So I think that, similar to FRBR, rather than being important as a set of cataloging rules, RDA will be important for the new directions that it’s taking our thinking about cataloging & resource description. Started hearing about this in 2009 – at an ALA linked data program with Diane Hillmann, Eric Miller, Rebecca Guenther from LoC – who retired in 2011 – RDA joke – retirement data announced
  • #4 linked data – everything has to have a link – so this has led to linked data versions of all kinds of library knowledge – authority files, subject headings, the RDA elements themselves are posted in linked data consumable form on the metadata registry
  • Alright, remember that sidebar in my Google search for The Town – this is linked data in the record for the DVD – and what I’ve circled and inserted is the links to that linked data. Machine consumable definitions – LCSH, VIAF – not going to show these – it’s RDF in XML – not pretty to look at – but machines like it – makes it possible for a search engine to tell the difference between a string of text or numbers that is a date, and one that is an address – or in this case a string that is a record number and another that’s the number of libraries that own the DVD, or one that is a creator name, Chuck Hogan, and one that is a subject – bank robberies – with a place name in it, where the story takes place – Boston – that’s different from the place where the DVD was published – Burbank, CA
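Here is a small Python sketch of why the labels matter. It uses plain subject/predicate/object statements rather than real RDF; the `ex:` identifiers and the two numeric values are placeholders I made up, where real data would point at URIs from LCSH, VIAF, and so on.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate object")

# Placeholder identifiers; the numbers are dummies, not real WorldCat data.
DVD = "ex:dvd/1"
TRIPLES = [
    Triple(DVD, "ex:recordNumber", "000000001"),
    Triple(DVD, "ex:holdingLibraryCount", "100"),
    Triple(DVD, "ex:creator", "ex:person/chuck-hogan"),
    Triple(DVD, "ex:about", "ex:subject/bank-robberies"),
    Triple(DVD, "ex:storySetIn", "ex:place/boston"),
    Triple(DVD, "ex:placeOfPublication", "ex:place/burbank-ca"),
]


def values_for(triples, subject, predicate):
    """Return every object asserted for this subject and predicate."""
    return [t.object for t in triples
            if t.subject == subject and t.predicate == predicate]


def places(triples, subject):
    """Both place roles, kept apart because the predicates differ."""
    return {
        "setting": values_for(triples, subject, "ex:storySetIn"),
        "published": values_for(triples, subject, "ex:placeOfPublication"),
    }
```

To a keyword index, Boston and Burbank are both just place names in the record. With the predicate attached, a machine can tell the setting of the story from the place of publication.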
  • Now we are finally to BIBFRAME – Work, instance, links to authority, annotation – links to locally relevant info – no expression here either – ha. This is from the BIBFRAME website; Eric Miller’s ALA MW presentation – his annotation example is the copy of Great Gatsby, where they also have an F. Scott Fitzgerald archive with the original cover painting by Francis Cugat. Because Chas. Scribner III is an alum.
    So instead of records in a database as the bibliographic descriptions, we have XML encoded bibliographic data that is on the web and can be linked to and from.
    BIBFRAME is like FRBR in that it is an information model.
    Unlike MARC in that it is not a record format.
  • Here’s the BIBFRAME main page – source for current info about BIBFRAME experiments.
    Stuff on the last slide is under the vocabulary tab here.
    Not a record format – but an encoding scheme – current BIBFRAME experiments are getting data OUT of MARC records, into an XML format, and out onto the web, where it can be linked to and from (last slide – “These Information Resources can then be re-assembled into a coherent architecture that allows for cooperative cataloging at a far more granular level than before. Then, as we leverage the Web as an architecture for data, whenever updates to these Resources are performed (e.g. someone adds new information about a Person, new mappings related to a Subject, etc.) notification events can occur to automatically update systems that reference these Resources. Further, these information assets can now be more effectively utilized at a granular level and provide a richer substrate in which local collections, special collections and third party data can easily annotate and contextualize cooperative library content.”)
    See Code for a look at the underlying code.
    The MARC standard is responsible for the creation of millions of bibliographic records from all parts of the globe. We recognize the need to continue supporting MARC during the transition, and, most likely, for years to come as libraries determine their timetable for making a change. The amount of legacy data, though, does not deter us from taking responsible actions for the next generation of libraries and librarians. The problem has been well defined by our partners. We now turn to partners of many types to help us find a durable solution.
  • The 64 million $$ question – my kneejerk answer is “we don’t”. Karen Coyle – There is now a visible speedup of all forms of information resources, even those that are ostensibly in traditional off-line formats, and doubts are growing about the ability of libraries to afford the costs of hand-hewn bibliographic control today and in the future.
    Contentious discussion on the BIBFRAME list – what’s the entry screen for catalogers to type in data to create records going to look like? Soundly dissed by the more tech savvy – also a thread about why can’t we just keep creating in MARC and then convert to BIBFRAME later, automatically, because that’s what the current experiments are already doing? And, of course, the big problem with that, as anyone who has done a data conversion knows, is that it’s really messy, and you lose stuff – so why should we waste all the time and energy to keep on creating MARC, only to convert it to BIBFRAME in a lossy process – especially when for many published resources there is available metadata out there – like publisher ONIX – so why not convert that to BIBFRAME and skip the MARC step
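The "you lose stuff" problem is easy to see in a toy conversion. This is a Python sketch under heavy assumptions: the tag-to-bucket mapping, the bucket names, and the sample record are all invented, and none of it is the real BIBFRAME vocabulary or the conversion code LoC is experimenting with.

```python
# Invented mapping from a handful of MARC tags to work/instance buckets.
WORK_TAGS = {"100": "creator", "650": "subject"}
INSTANCE_TAGS = {"245": "title", "260": "publication", "300": "extent"}

# Invented record: (tag, value) pairs, subfields flattened for brevity.
SAMPLE_FIELDS = [
    ("100", "Hogan, Chuck"),
    ("245", "An example title"),
    ("300", "1 videodisc"),
    ("590", "Local note: donated copy"),
    ("650", "Bank robberies"),
]


def marc_to_work_instance(fields):
    """Split fields into work and instance data; report what got dropped."""
    work, instance, dropped = {}, {}, []
    for tag, value in fields:
        if tag in WORK_TAGS:
            work.setdefault(WORK_TAGS[tag], []).append(value)
        elif tag in INSTANCE_TAGS:
            instance.setdefault(INSTANCE_TAGS[tag], []).append(value)
        else:
            # No mapping for this tag, so its data silently disappears
            # unless someone goes looking at this list.
            dropped.append(tag)
    return work, instance, dropped
```

Anything the mapping doesn't know about, here the local note in 590, falls on the floor. A real conversion covers far more tags, but the shape of the problem is the same.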
  • Another Eric Miller slide with his recommendations – keep learning about linked data, both library and non-library, like schema.org.
    Personally, I don’t really recommend the BIBFRAME list – contentious – but I was also on the DC list when they were debating whether to have a list of standard types for DC type, or let people fill it in as needed … so déjà vu all over again.
    And Eric’s answer to the entry screen problem – ask your vendor what they’re going to give you – the robust-as-MARC cataloging module for linked data
  • In the 1990s we tried to get the Internet into the library catalog.
    In the early 2000s, we tried to fit our bibliographic data into fancy information models like FRBR, and began a long process to rewrite cataloging rules to fit FRBR as well, resulting in RDA.
    FRBRization, even though it’s an adaptation of true FRBR, can create a pretty good catalog display, even if it’s not everything promised by FRBR.
    But no matter how good they look, our library catalogs are still based on MARC, and walled off from the Web – standard Google searches for information resources rarely reveal the library connections.
    So it must be time to try something new – get the library data out there on the web, where it will mean even more
    3. Some of the things…
       • We thought would save us:
       • Cataloging Internet resources
       • & making huge adaptations of MARC
       • FRBR
       • RDA
       • Transitioning away from MARC to …
       • Linked data?
       University of Wisconsin–Madison
    4. Poll
       1. How many of you are implementing RDA?
          • right now, as we speak
          • soon, but still waiting to see what others do
          • no implementation plan
       2. How many of you currently create a significant amount of non-MARC metadata – e.g. Dublin Core, EAD, MODS?
          • yes, we do that a lot
          • no, none or almost none
    8. FRBR – Entity Relationship Model, but also FRBR user tasks:
       • to find;
       • to identify;
       • to select;
       • to acquire or obtain access
    10. Coyle, FRBR as cake, 1
    11. Coyle, FRBR as cake, 2
    12. Coyle, FRBR as cake, 3
    13. “In the final analysis, the RDA Test Coordinating Committee recommended that the national libraries adopt RDA with certain conditions and that implementation will not occur before January 1, 2013.” Meh
    14. What happened to find, identify, select, obtain?
    17. Implications of RDA
        • Dis-aggregated cataloging
        • Distinguish between RDA the cataloging rules; tool for catalogers; and
        • RDA – the new model for library data (linked data)
        • Broad agreement after testing –
        • MARC & RDA = not a happy match!
    22. How do we create this brave new metadata??
        • No “hand-hewn” records
        • No “entry screen” in BIBFRAME
        • BIBFRAME experiments to date take existing MARC data and dump it in
    24. To recap
        • Tried to get the Internet into the Library Catalog
        • Had our bibliographic data try on fancy information models & let it flirt with other metadata schemes
        • Got our bibliographic data all gussied up and looking pretty good –
        • BUT – still wearing 50-year-old underwear!!
        • It’s time to release the bibliographic data from the library catalog, and let it out on the web to play with new friends
    26. Resources
        • Coyle, Karen. 2012. Linked data tools: connecting on the Web. Chicago, IL: ALA TechSource. Library technology reports, v. 48, no. 4.
        • IFLA FRBR FAQ
        • OCLC FRBR algorithm
        • Executive summary: Report & Recommendations of the US RDA Test Coordinating Committee, June 2011
        • RDA chapter 0, 0.6.0 – 0.6.9
        • Karen Coyle, FRBR as cake
    27. Resources
        • OCLC RDA policy
        • Metadata registry
        • VIAF
        • Celestial Eyes
        • BIBFRAME
        • Roy Tennant, Cataloging Unchained