
Catalog Enrichment for RDA - Adding relationship designators (in Koha) [text]


Relationship designators are used to specify the relationship between a resource and a person, family, or corporate body associated with that resource. This presentation shows how they were added to the catalog of the library of the Pontificia Università della Santa Croce, both in new records and, mostly automatically, in legacy records. The Name Cloud, a way to navigate the catalog through related authors, is also shown.

Published in: Science


  1. Catalog Enrichment for RDA: adding relationship designators (in Koha). 35th ADLUG Meeting, Sep 22, 2016. Stefano Bargioni
     Slide 1 I'm very happy to discuss with you part of the project started at the Pontificia Università della Santa Croce that aims to introduce RDA rules in our library. We are sharing this goal with the URBE network, i.e. with 17 other ecclesiastical libraries running on Koha, OliSuite or other Integrated Library Systems. This work is the result of close cooperation with my staff, especially Luigi Gentile, Michele Caputo, Alberto Gambardella and Giampaolo Del Monte. Introducing RDA in cataloging is a task with many parts. Now we will focus on one of them: the relationship designators, or relator terms, in bibliographic records.
     Slide 2 Tags affected by relator terms are personal, corporate and meeting names. The slide lists these tags, both for MARC21 and Unimarc. We will focus on MARC21, as this is the MARC flavor used in our library. Anyway, the following ideas could be applied to Unimarc as well. Name-titles are not included. We will see why in a few slides.
     Slide 3 The subfields involved are "e" for personal and corporate names, or "j" for meeting names (where subfield "e" is already in use for the subordinate unit). In both cases, a text string in the language of the cataloging agency is entered in this subfield. Subfield "4" can contain the same information as a standardized, language-independent three-character code. Note that Unimarc differs from MARC21. It approaches the problem using only numeric codes, leaving the software responsible for displaying and even searching this information. For display and cataloging, I appreciate Unimarc's solution, but for searching, I prefer MARC21's solution. This is why we decided to use both subfield "e" and subfield "4".
     Slide 4 The complete list is very large and, in my opinion, is evolving continuously. It tries to include any kind of role played by people in "writing" a "book": author, joint author, editor, translator, cover designer, and many more. Each of them has a short code, but sometimes the code is shared by more than one role, as in "wit", used for "witness" and "eye-witness", or, more interestingly, in "aut", used for both "author" and "joint author".
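The relation between relator terms (subfield "e") and relator codes (subfield "4") described for Slides 3 and 4 can be sketched as a small lookup table. This is only an illustration in Python, not the library's actual code: the table `RELATOR_CODES` and the helpers `code_for` and `terms_for` are hypothetical names, and the English terms stand in for the Italian ones used in the real catalog.

```python
# Hypothetical closed list: relator term (subfield "e") -> relator code (subfield "4").
# The real catalog uses terms from the Italian translation of RDA.
RELATOR_CODES = {
    "author": "aut",
    "joint author": "aut",   # shares its code with "author"
    "editor": "edt",
    "translator": "trl",
    "honoree": "hnr",
    "witness": "wit",
    "eye-witness": "wit",    # shares its code with "witness"
}


def code_for(term):
    """Return the three-character code for a term, or None if the term is not in the list."""
    return RELATOR_CODES.get(term.strip().lower())


def terms_for(code):
    """Return every term in the closed list that maps to the given code."""
    return sorted(t for t, c in RELATOR_CODES.items() if c == code)
```

Because several terms can share one code, the lookup only works reliably in the term-to-code direction. This is one reason for keeping both subfields in the record.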
  2. Our decision was to simplify the process considerably and to ensure that the catalog contains only values from a closed list, chosen from the official Italian translation of RDA.
     Slide 5 This slide shows the new popup menu we added to Koha for subfield "e" of tag "7xx".
     Slide 6 Subfield "4" is hidden, and will be filled in automatically when saving the record. It can be useful in a linked data environment, or for copy cataloging with non-Italian libraries.
     Slide 7 Up to now, we have described how we will add the relator information to new records. However, global changes to cataloging rules may require modifying old records, and often this task is never completed because it is very difficult. We discovered that bibliographic records contain useful information for changing them automatically and adding the relator codes and terms. Our catalog contains 83 percent of records with only the main author tag "1xx", 35 percent of records that contain only added authors, i.e. "7xx" tags, and 21 percent of records that have both main and added authors.
     Slide 8 The main author, we can say, always takes the term "author". With exceptions, of course. It depends on your library. So it is probably possible to update records with only "1xx" tags automatically, and we modified about 99,000 records this way. Note that this operation cannot be applied to records with added authors, since these records would remain only partially updated, and this can be a problem for any user, professional or not.
     Slide 9 The following slides illustrate how to infer the relator code for added authors using the information stored in other tags, like the statement of responsibility, some kinds of notes, and so on. Of course, the quality of the catalog can play an important role in this operation, and other ideas could be applied in other catalogs, especially depending on the type of collections.
     Slide 10 If the statement of responsibility or the contents note contains one of these strings, the added author (when only one 7xx tag is present) is an editor. In this way we were able to update more than 14,000 records automatically.
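The Slide 10 rule can be sketched as follows. The transcript does not list the strings shown on the slide, so the markers in `EDITOR_MARKERS` are assumptions. The record layout (a plain dictionary with `statement_of_responsibility`, `contents_note` and `added_names` keys) is also hypothetical, chosen to keep the example self-contained rather than to mirror Koha's internal data structures.

```python
# Hypothetical marker strings; the real list is on the slide and depends on the catalog.
EDITOR_MARKERS = ("edited by", "ed. by", "a cura di")


def infer_single_editor(record):
    """Tag the only added name as editor when a marker string is found.

    `record` is a hypothetical dict:
      statement_of_responsibility: str
      contents_note: str
      added_names: list of dicts holding the subfields of each 7xx tag
    Returns True if the record was updated.
    """
    added = record.get("added_names", [])
    if len(added) != 1:
        return False          # the rule applies only when exactly one 7xx tag is present
    name = added[0]
    if name.get("e") or name.get("4"):
        return False          # never overwrite an existing relator term or code
    text = " ".join([
        record.get("statement_of_responsibility", ""),
        record.get("contents_note", ""),
    ]).lower()
    if any(marker in text for marker in EDITOR_MARKERS):
        name["e"] = "editor"
        name["4"] = "edt"
        return True
    return False
```

Leaving records untouched when a relator term is already present, or when more than one added name exists, keeps the batch update conservative. Anything the rule cannot decide is left for later rules or manual review.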
  3. Slide 11 The added author is also an editor if the statement of responsibility contains "critical edition" or one of its translations in other languages.
     Slide 12 The added author is an editor if the remainder of edition statement (250$b) exists.
     Slide 13 The added author is "honored" if byte 30 of fixed field 008 is set.
     Slide 14 This case is a bit more difficult, but interesting indeed. The added author is a "joint author" if the statement of responsibility is sufficiently similar to the "7xx" occurrences. Here are some examples. You may say that the algorithm that discovers this similarity could be very complicated, and this was my opinion before starting this project. Then, surprisingly, I wrote very few lines of (Perl) code for this type of update. Generally speaking, artificial intelligence could help a lot to obtain better results. But you have to know your records very well, and usually this is true only if you have a lot of help from your cataloging staff. Is it worth it?
     Slide 15 Another example is shown in this slide. The statement of responsibility contains names and some keywords as well. The relator term and code can be added, and we updated about 2,700 records.
     Slide 16 As I said at the beginning of my presentation, records with name-titles caused a discussion among us, and we involved Tiziana Possemato, Prof. Mauro Guerrini, and Casalini Libri. We think that the relator term of name-titles refers to the work or expression represented by the "7xx" tag. Subfield "e", in this case, is the role of the author described in subfield "a" in the work described in subfield "t". If you use it, will OPAC users understand this subtle but important difference? This is why we prefer not to use it, in about 700 records.
     Slide 17 The remaining records could require complex algorithms, but many times the information needed to write relator terms and codes is ambiguous or not available at all.
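One simple way to test the "sufficiently similar" condition of Slide 14 is shown below. The original Perl code is not given in the talk, so this Python version is an assumption about how such a check might look and does not reconstruct it. It treats the names as joint authors when the statement of responsibility consists only of the name words plus a few connectors. The `CONNECTORS` set and the function names are hypothetical.

```python
import re

# Hypothetical connector words allowed between names ("and" in a few languages).
CONNECTORS = {"and", "e", "et", "y", "und"}


def words(text):
    """Lowercased alphabetic words; punctuation and digits (e.g. dates) are ignored."""
    return re.findall(r"[^\W\d_]+", text.lower())


def looks_like_joint_authorship(statement, names):
    """True when the statement contains every name and nothing but names and connectors.

    `statement` is the statement of responsibility (245$c);
    `names` are the 1xx/7xx headings, e.g. "Rossi, Mario".
    """
    stmt_words = words(statement)
    if not stmt_words or not names:
        return False
    name_words = set()
    for name in names:
        name_words.update(words(name))
    # Any leftover word (e.g. "translated", "cura") signals a role other than author.
    leftover = [w for w in stmt_words if w not in name_words and w not in CONNECTORS]
    all_names_present = all(set(words(n)) <= set(stmt_words) for n in names)
    return not leftover and all_names_present
```

For example, "Mario Rossi e Luigi Bianchi" matches the headings "Rossi, Mario" and "Bianchi, Luigi", while "Mario Rossi ; a cura di Luigi Bianchi" does not, because of the extra words. Initials such as "M. Rossi" would not match a full forename under this strict test, so a real implementation would need a more tolerant comparison.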
  4. Slide 18 Thus, a remaining group of records will require manual updates. This is a boring task that requires specific skills. And sometimes the decision for each term is not so simple, leading to discussions among the cataloging staff. To facilitate this task, a tool named RP7 with specific functionalities was prepared, thanks to the Application Programming Interfaces (APIs) available in Koha.
     Slide 19 This tool, a web application contained in a single page, allows the catalogers to navigate the set of records without relator terms. Each record is shown in brief or full format. Popup menus are available to the right of each added name, and if one is filled in, a new occurrence appears. Keyboard shortcuts are available ("S" to save and move to the next record, "B" to toggle the display format, "G" to go to the next record without saving, and so on).
     Slide 20 This table summarizes the percentages of this global update of legacy records. 75% of records were updated automatically, and the remaining 25% is a work in progress using both the RP7 tool and the Koha cataloging interface. I'm not aware of other catalog enrichment projects where a lot of work can be saved using algorithms and information already contained in the catalog itself.
     Slide 21 Let's take a look at the cataloging interface. Each occurrence of an added name tag has a visible subfield "e" for the relator term, with a reduced set of values stored in a popup menu, and a hidden subfield "4" filled in automatically. The programming language is of course JavaScript (jQuery library).
     Slide 22 The new information can be shown in the OPAC, alongside the authors' names. This is valuable information for researchers and students, since they can quickly understand the relevance of an author in the book/manifestation.
     Slide 23 It is important to ensure that the relator term will be indexed. By default, Koha adds it to the author index, and we think that this is a good solution. This will allow OPAC users to perform richer searches, like limiting the results to records where an author is the main author, or the translator, and so on. This is also a useful search path for the reference desk.
  5. Slide 24 This part of the presentation shows an example of how to use the relations contained in the catalog. They are defined by the presence of more than one name in the same record, including name-titles. And this is not an RDA advantage, of course. The RDA relator terms qualify existing relations, and this can help to display very interesting paths and links to navigate the catalog. Students and researchers can immediately discover who studied a specific author, who worked with him or her, and so on. Think of a thesis about an important philosopher.
     Slide 25 This is why we built a Name Cloud, which we will link to our catalog as soon as possible. It is divided into two parts. The first part represents the cloud of names around the starting name, while the second part contains the same information and functionalities as the cloud, with relator terms and some counters. Let's open the Name Cloud, to see it moving.
     Slide 26 This is the second part of the Name Cloud page. Each link can be compressed or expanded, which is useful for authors with many relations, and for printing too. When compressed, a counter is shown.
     Slide 27 Let's try to conclude: I could simply read this slide. Adding relationship designators to old bibliographic records is possible for a large part of a library catalog. A good analysis of the data is required, as well as good software tools and the skills to perform batch updates. Adding relationship designators to new bibliographic records requires helping the staff by adding some functionality to the cataloging module. Their introduction enables new services in the OPAC, enriches information about authors and adds properties to the relationships among them.
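The data behind a Name Cloud like the one described for Slides 24 to 26 can be sketched as a co-occurrence count: names that appear in the same record as the starting name, grouped by relator term. This is a minimal illustration under assumptions, not the library's implementation. The function `build_name_cloud` and the input layout (each record as a list of `(name, relator_term)` pairs) are hypothetical.

```python
from collections import Counter, defaultdict


def build_name_cloud(records, start):
    """Collect the names that share a record with `start`.

    `records` is a hypothetical list of records, each a list of
    (name, relator_term) pairs taken from the 1xx/7xx tags.
    Returns {related_name: {relator_term: number_of_shared_records}}.
    """
    cloud = defaultdict(Counter)
    for names in records:
        if not any(name == start for name, _ in names):
            continue                      # the starting name is not in this record
        for name, term in names:
            if name != start:
                # Records not yet enriched have no relator term.
                cloud[name][term or "unspecified"] += 1
    return {name: dict(counts) for name, counts in cloud.items()}


# Usage with made-up data:
records = [
    [("Rossi, Mario", "author"), ("Bianchi, Luigi", "editor")],
    [("Rossi, Mario", "author"), ("Bianchi, Luigi", "translator")],
    [("Verdi, Anna", "author"), ("Rossi, Mario", "honoree")],
    [("Neri, Paolo", "author")],
]
cloud = build_name_cloud(records, "Rossi, Mario")
```

The per-term counters correspond to the counters mentioned for the second part of the page. The relator terms are what turn a plain co-occurrence into a qualified relation.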