What's Wrong With MARC?
Linked Data: what cataloguers need to know #cigld
CILIP Cataloguing and Indexing Group (CIG)
From: Linked Data: what cataloguers need to know. A CIG event. 25 November 2013, Birmingham. #cigld

http://www.cilip.org.uk/cataloguing-and-indexing-group/events/linked-data-what-cataloguers-need-know-cig-event

Accompanying write-up in Catalogue & Index 174: http://discovery.ucl.ac.uk/1449459/

  • In the beginning was the index card. This is basically text, designed to be brief but readable by people. The elements are separated by punctuation.
  • MARC was created essentially to mount that index card onto a computer and share it. It is ideally placed to recreate the index card's data for human consumption. This could be for printing out more index cards, compiling a dictionary catalogue, or even driving a display screen for an OPAC, at least if you want to recreate the index card layout.
  • This is the same thing with an RDA record.
  • It's worth bearing in mind that this is what MARC actually looks like! Most of what we actually see of MARC is an abstraction for display or editing. To get any bit of information out, the data needs to be pulled apart first. At the top is the Leader, which includes some basic information about the record. Next is the Directory, which has information about the fields and their lengths. The third part is the data itself: note that the MARC field tags are all absent, as are the subfield markers. There are markers there but they are hidden! There is one invisible code to end the record, another to end each field, and a third to mark the beginning of each subfield. In the above example, all the hidden codes have been replaced with underscores so you get an idea of what's there.
  • What is MARC for? Is it used for these things anymore?
    – Storage: rarely.
    – Exchange and distribution: this is where MARC really lives on, e.g. for downloading records from others via Z39.50.
    – Manipulation: to manipulate MARC, you have to transform it into something else, e.g. in MarcEdit, or do bad things to it. For example, indexing and normalising involve dealing with lots of punctuation, which I'll come back to in a moment.
    – Display: MARC was designed for traditional catalogue displays and standards, i.e. AACR2. However, the MARC and ISBD card index structure is more often hidden behind a labelled display. In discovery systems, this can be even more abstracted.
    – Input: MARC fields are clearly used for inputting bibliographic data, although it's unclear why they have to be. Rob Styles once recounted an anecdote where cataloguers were offered a simpler non-MARC interface and they didn't like it! As RDA is more element-based and takes out the ISBD, there are possibilities of this being more acceptable in the future. See RIMMF.
    – Lingua franca: this is another Rob Styles phrase, which has afflicted many of us, especially those of us who talk to non-cataloguers. We talk of the 245a rather than the title proper. Some non-cataloguers barely believe that MARC replaced AACR2.
    Many of the following are rather faults with rules and practices, or even with MARC21 over, say, UKMARC. Again, many non-cataloguers conflate MARC21 with other cataloguing practices.
  • This will be familiar if you know the deficiencies of Dewey, which is constrained by its notation. Answer: SuperMARC!
  • Admittedly, contexts can vary here, but do note the variety of methods for formatting the content too: codes, controlled full-language, language in free text.
  • A particularly MARC21 problem. First, ISBD punctuation only. Second, MARC only. Both of these make sense on their own. However, we put them back together! Third, MARC and ISBD. This makes human editing harder than it needs to be and automatic processing a real headache.
  • Many MARC data elements contain text. Especially when isolated, they become harder to make sense of. In the ISBN examples, qualifiers are included in the subfield. With the new subfield $q, this is at least partially solved…
  • …as this data is moved elsewhere \o/ … although
  • … the pricing and availability information still needs punctuation in front of it. 8-( The title and place of publication both have extraneous punctuation after them too. This is important as the place of publication is not Köln :, it is Köln, or possibly Cologne, or the place we call Cologne but the Germans call Köln (unless you speak the local dialect: Kölle) and the Romans called Colonia Claudia Ara Agrippinensium, and so on. The extent example shows the punctuation for an element divorced from the element it represents. This has to be cleaned out before any processing of its meaning can take place. Even then, the units in this and the dimensions examples are mixed in with the numbers and can be very variable.
  • In the first example, some data about the media type is stuck in the middle of the title! Whatever the benefits of the GMD, this is not a good place to put it. In the third example, a lot of fine-grained data is essentially dumped in a single box and it would be very hard to retrieve it. Clearly, in this case, the record is designed only to provide display.
  • In the first example, poor Mr Niemeyer has passed away, so the heading changes. This is not helpful for linking or matching, not to mention the huge maintenance issues. In the second case, preferences change, or different communities might want to display different strings to users. The third example shows the kind of small textual issues that can break matching. LMSs indexing these headings have to process them to work out that they match.
  • Both these 700s express a relationship of some kind. But what? RDA, and the apparently enthusiastic adoption of relationship designators, have at least done something to address this with the $e. (Although note the weird comma between the heading and the relationship designator!)
  • To do anything, you have to take the MARC file to bits. One piece of information will not stand alone.
  • MARC is record based, and that record is a Manifestation record. However, Expression-level elements are spread throughout (in red). Work- and Expression-level elements are implied and not explicit (e.g. title). There are internal relationships here: we know Berners-Lee has an Expression-level relationship merely from the nature of the relationship itself: the 700 tag and the $e code otherwise give nothing away. Think of the language example given above: machine-readable Expression information is all over the place.
    "Many survey respondents expressed doubt that RDA changes would yield significant benefits without a change to the underlying MARC carrier."
    Most felt any benefits of RDA would be largely unrealized in a MARC environment. MARC may hinder the separation of elements and the ability to use URIs in a linked data environment.
    "While the Coordinating Committee tried to gather RDA records produced in schemas other than MARC, very few records were received."
    "Demonstrate credible progress towards a replacement for MARC"
    -- Report and Recommendations of the U.S. RDA Test Coordinating Committee. Executive Summary
  • e.g. Primo converts all data, even MARC, into something else before processing.
  • Eleven years ago, Roy Tennant addressed MARC in particular, recognising that AACR2 and MARC are so closely entwined and focussed on the card catalogue. His criticisms:
    – Unreadably esoteric format.
    – Granularity, e.g. authors' first and last names all in one string. Roles hidden within text in the title field!
    – Extensibility, e.g. adding contents notes or additional content.
    – Clumsy handling of different scripts.
    – Technical marginalisation, i.e. the fact that only libraries use MARC limits us to niche vendors.
    "With the advent of the web, XML, portable computing, and other technological advances, libraries can become flexible, responsive organizations that serve their users in exciting new ways. Or not. If libraries cling to outdated standards, they will find it increasingly difficult to serve their clients as they expect and deserve."
    Eleven years later, not much has changed.
  • Thank you. I mentioned that RDA, despite its sins, does make element-based, non-ISBD, non-MARC cataloguing more possible. Celine will now talk about RIMMF.
  • What's Wrong With MARC?
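
The note on the raw .mrc record describes a Leader, a Directory, and a data area held together by invisible codes. As a rough illustration only, the Python sketch below builds a tiny record in that layout and then pulls it apart again. The helper names (build_record, parse_record, split_subfields) and the 001 value are made up for this example, the sample fields are abbreviated from the slides, and the code assumes ASCII-only content so that character counts equal byte counts. It is not a substitute for a real MARC library.

```python
# Control characters that the slides render as underscores.
FIELD_END = "\x1e"    # ends each field (and the directory)
RECORD_END = "\x1d"   # ends the record
SUBFIELD = "\x1f"     # precedes each subfield code


def build_record(fields):
    """Serialise (tag, content) pairs into a MARC-style record string."""
    directory, data, offset = "", "", 0
    for tag, content in fields:
        body = content + FIELD_END
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(body):04d}{offset:05d}"
        data += body
        offset += len(body)
    base = 24 + len(directory) + 1      # leader + directory + its terminator
    total = base + len(data) + 1        # plus the record terminator
    leader = f"{total:05d}nam a22{base:05d} a 4500"
    return leader + directory + FIELD_END + data + RECORD_END


def parse_record(raw):
    """Return the leader and a list of (tag, content) pairs."""
    leader = raw[:24]
    base = int(leader[12:17])           # where the data area starts
    directory = raw[24:base - 1]
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        body = raw[base + start:base + start + length].rstrip(FIELD_END)
        fields.append((tag, body))
    return leader, fields


def split_subfields(body):
    """Split a data field into its two indicators and (code, value) pairs."""
    indicators, rest = body[:2], body[2:]
    subfields = [(chunk[0], chunk[1:]) for chunk in rest.split(SUBFIELD) if chunk]
    return indicators, subfields


# Abbreviated sample based on the slides; the 001 value is illustrative.
SAMPLE_FIELDS = [
    ("001", "000477125"),
    ("245", "00" + SUBFIELD + "aModels for decision :"
            + SUBFIELD + "cedited by C.M. Berners-Lee."),
    ("700", "1 " + SUBFIELD + "aBerners-Lee, C. M."),
]

record = build_record(SAMPLE_FIELDS)
leader, fields = parse_record(record)
for tag, body in fields:
    # Control fields (00X) have no indicators or subfields.
    print(tag, body if tag < "010" else split_subfields(body))
```

Even in this toy form, nothing can be read out of the record until the leader and directory have been decoded, which is the point the note makes.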

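The notes on double encoding and on text that is not data mention that ISBD punctuation has to be cleaned out before a value can be processed. A minimal Python sketch of that cleaning step follows. The function names strip_isbd and parse_dimension are hypothetical, the set of separators handled is an assumption drawn only from the examples in the slides, and real records would need far more cases than this.

```python
import re

# Trailing ISBD separators seen in the slide examples: " :", " ;", " /", "," and "=".
TRAILING_ISBD = re.compile(r"\s*[:;/,=]\s*$")


def strip_isbd(value, final_stop=False):
    """Remove one trailing ISBD separator from a subfield value.

    A final full stop is only removed on request, because it is often
    part of the data itself (for example "p." or "C. M.").
    """
    cleaned = TRAILING_ISBD.sub("", value).strip()
    if final_stop and cleaned.endswith("."):
        cleaned = cleaned[:-1]
    return cleaned


def parse_dimension(value):
    """Split a value such as "23 cm." into a number and a unit, or return None."""
    match = re.fullmatch(r"(\d+)\s*(cm|mm)\.?", value.strip())
    return (int(match.group(1)), match.group(2)) if match else None


for raw in ["Köln :", "British goblins :", "Constellation of Orion,", "ix, 300 p. ;"]:
    print(repr(raw), "->", repr(strip_isbd(raw)))
print(parse_dimension("23 cm."), parse_dimension("9 mm."))
```

The final_stop flag reflects the ambiguity the notes complain about: a program cannot tell from the text alone whether a closing full stop is punctuation or data.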
    1. What's Wrong With MARC?
       Linked Data: what cataloguers need to know #cigld
       CILIP Cataloguing and Indexing Group (CIG)
       25 November 2013
       Thomas Meehan
       tom@aurochs.org
       @orangeaurochs
    2. Card Index Catalogue
       http://cardcat.ucl.ac.uk/cgi-bin/carddisplay.pl?card=887;drawer=13;max=931;ctype=C
    3. AACR2 in MARC21
       245 00 $a Models for decision : $b a conference under the auspices of the United Kingdom Automation Council organised by the British Computer Society and the Operational Research Society / $c edited by C.M. Berners-Lee.
       260 __ $a London : $b English Universities Press, $c 1965.
       300 __ $a x, 149 p. : $b ill. ; $c 23 cm.
       504 __ $a Includes bibliographical references.
       700 1_ $a Berners-Lee, C. M.
    4. RDA in MARC21
       245 00 $a Models for decision : $b a conference under the auspices of the United Kingdom Automation Council organised by the British Computer Society and the Operational Research Society / $c edited by C.M. Berners-Lee.
       264 _1 $a London : $b The English Universities Press Limited, $c 1965.
       264 _4 $c ©1965
       300 __ $a x, 149 pages : $b illustrations ; $c 23 cm.
       336 __ $a text $2 rdacontent
       337 __ $a unmediated $2 rdamedia
       338 __ $a volume $2 rdacarrier
       504 __ $a Includes bibliographical references.
       700 1_ $a Berners-Lee, C. M., $e editor of compilation.
    5. AACR2 in .mrc
       00788nam a2200181 a 450000100270000000500170002700800410004402400150008524502100010026000 490031030000320035950400410039165000330043270000230046571000390048871 0003000527710004900557_UCL01000000000000000477125_20061112120300.0_85 0710s1965 enka b 000 0 eng _8 _ax280050495_00_aModels for decision :_ba conference under the auspices of the United Kingdom Automation Council organised by the British Computer Society and the Operational Research Society /_cedited by C.M. Berners-Lee._ _aLondon :_bEnglish Universities Press,_c1965._ _ax, 149 p. :_bill. ;_c23 cm._ _aIncludes bibliographical references._ 0_aDecision making_vCongresses._1 _aBerners-Lee, C. M._2 _aUnited Kingdom Automation Council._2 _aBritish Computer Society._2 _aOperational Research Society (Great Britain)__
       Highlighted on the slide: Leader; Directory; Data; 245 field; final 710 field
    6. What is MARC for?
       • Storage
       • Exchange and distribution
       • Manipulation
       • Display
       • Input (http://www.aurochs.org/zz/marc_input/marc_input.html)
       • “Lingua franca of library cataloguing”
    7. Finite Notation Problem
       Too many subject schemes
       650 _0 for LCSH
       650 _1 for LC for Childrens
       650 _2 for MeSH
       …
       650 _7 Source specified in subfield $2
       Not enough indicators
       246 184 $aThe title on the spine
    8. Data in More Than One Place
       Languages
       008 (positions 35-37) eng
       041 __ $a eng
       240 10 $l English
       546 __ $a In English.
    9. Double Encoding: ISBD and MARC
       Blanket : Constellation of Orion, 3.
       260 __ $a Blanket $b Constellation of Orion $c 3
       260 __ $a Blanket : $b Constellation of Orion, $c 3.
    10. Text, Not Data (1)
        ISBN
        020 __ $a 9780285638976 (pbk.)
        020 __ $a 012002618X (ebook)
        Title
        245 10 $a British goblins :
        Place of publication
        260 __ $a Köln :
        Copyright date
        260 __ $c c2005
        264 _4 $c ©2002
        260 _4 $c copyright 2005
        260 _4 $c ℗1983
        260 _4 $c phonogram 1993
        Extent
        300 __ $a ix, 300 p. :
        300 __ $a ix, 300 p. ;
        300 __ $a ix, 300 p.
        Dimensions
        300 __ $c 23 cm.
        300 __ $c 9 mm.
    11. Text, Not Data (2)
        ISBN
        020 __ $a 9780285638976
        020 __ $a 012002618X
        Title
        245 10 $a British goblins :
        Place of publication
        260 __ $a Köln :
        Copyright date
        260 __ $c c2005
        264 _4 $c ©2002
        260 _4 $c copyright 2005
        260 _4 $c ℗1983
        260 _4 $c phonogram 1993
        Extent
        300 __ $a ix, 300 p. :
        300 __ $a ix, 300 p. ;
        300 __ $a ix, 300 p.
        Dimensions
        300 __ $c 23 cm.
        300 __ $c 9 mm.
    12. Text, Not Data (3)
        ISBN
        020 __ $a 9780285638976 :
        020 __ $a 012002618X :
        Title
        245 10 $a British goblins :
        Place of publication
        260 __ $a Köln :
        Copyright date
        260 __ $c c2005
        264 _4 $c ©2002
        260 _4 $c copyright 2005
        260 _4 $c ℗1983
        260 _4 $c phonogram 1993
        Extent
        300 __ $a ix, 300 p. :
        300 __ $a ix, 300 p. ;
        300 __ $a ix, 300 p.
        Dimensions
        300 __ $c 23 cm.
        300 __ $c 9 mm.
    13. Data Mixed Up
        GMD
        245 10 $a Data on the web $h [electronic resource] : $b research and applications / $c Antonis Bikakis, Adrian Giurca (eds.).
        245 10 $a Data on the web $b research and applications / $c Antonis Bikakis, Adrian Giurca (eds.).
        Nothing allowed after 245$c
        245 10 $a Enduring resistance : $b cultural theory after Derrida / $c edited by Sjef Houppermans, Rico Sneller, Peter van Zilfhout. = La résistance persérvère : la théorie de la culture (d')aprés Derrida / edité par Sjef Houppermans, Rico Sneller, Peter van Zilfhout.
    14. Changing Text as Primary Key for Headings and Authorities
        Author heading for deceased person
        Niemeyer, Oscar, 1907
        Different preferences for writing name
        Mao, Tse-tung, 1893-1976 [Former heading]
        Mao, Zedong, 1893-1976
        毛泽东, 1893-1976
        Small differences could break match
        Mao, Zedong, 1893-1976.
        Mao, Zedong, 1893-1976
    15. Expressing Relationships
        What does this mean?
        700 0_ $a Homer. $t Iliad.
        700 1_ $a Berners-Lee, Tim.
    16. Record Not Data
        00788nam a2200181 a 450000100270000000500170002700800410004402400150008524502100010026000 490031030000320035950400410039165000330043270000230046571000390048871 0003000527710004900557_UCL01000000000000000477125_20061112120300.0_85 0710s1965 enka b 000 0 eng _8 _ax280050495_00_aModels for decision :_ba conference under the auspices of the United Kingdom Automation Council organised by the British Computer Society and the Operational Research Society /_cedited by C.M. Berners-Lee._ _aLondon :_bEnglish Universities Press,_c1965._ _ax, 149 p. :_bill. ;_c23 cm._ _aIncludes bibliographical references._ 0_aDecision making_vCongresses._1 _aBerners-Lee, C. M._2 _aUnited Kingdom Automation Council._2 _aBritish Computer Society._2 _aOperational Research Society (Great Britain)__
        Highlighted on the slide: Leader; Directory; Data; 245 field; final 710 field
    17. Doesn't Handle FRBR/RDA Well
        245 00 $a Models for decision : $b a conference under the auspices of the United Kingdom Automation Council organised by the British Computer Society and the Operational Research Society / $c edited by C.M. Berners-Lee.
        264 _1 $a London : $b The English Universities Press Limited, $c 1965.
        264 _4 $c ©1965
        300 __ $a x, 149 pages : $b illustrations ; $c 23 cm.
        336 __ $a text $2 rdacontent
        337 __ $a unmediated $2 rdamedia
        338 __ $a volume $2 rdacarrier
        504 __ $a Includes bibliographical references.
        700 1_ $a Berners-Lee, C. M., $e editor of compilation.
    18. Other Considerations
        • Only libraries use MARC
          – Libraries tied to library-specific software/processes
          – Outside agencies can’t take advantage of library data and standards (See Also: RDA not freely available)
        • Not even all of libraries use MARC
          – Archives
          – Repositories
          – Non-MARC LMSs
    19. Marc Must Die / Roy Tennant (2002)
    20. What's Wrong With MARC?
        100 1_ $a Meehan, Thomas.
        245 __ $a What's wrong with MARC / $c Thomas Meehan.
        260 __ $a Birmingham : $b CIG, $c 2013.
        490 1_ $a Linked data
        500 __ $a Presentation given at the CILIP CIG event "Linked Data: what cataloguers need to know", Birmingham, England, 25 Nov. 2013.
        710 1_ $a Chartered Institute of Library and Information Professionals (Great Britain). Cataloguing and Indexing Group.
        830 _0 $a Linked data.
    21. References
        • MARC21 Standards http://www.loc.gov/marc/
        • MARC21 Bibliographic http://www.loc.gov/marc/bibliographic/ecbdhome.html
        • MARC21 Record Structure http://www.loc.gov/marc/specifications/specrecstruc.html
        • UKMARC Manual http://www.bl.uk/bibliographic/ukmarc.html
        • MARC Must Die / Roy Tennant. (Library Journal, Oct. 2002). http://www.libraryjournal.com/article/CA250046.html
        • Report and Recommendations of the U.S. RDA Test Coordinating Committee. Executive Summary http://www.loc.gov/bibliographicfuture/rda/source/rda-execsummary-public-13june11.pdf
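
The slide on changing text as primary key shows headings that differ only by a full stop, and headings that differ because of cataloguing preference. The Python sketch below is one possible illustration of why text makes a fragile key. The heading_key function and the "example:person/1" identifier are invented for this example and do not correspond to any real authority file; the normalisation rules are a guess at the kind of processing an LMS index might apply.

```python
import unicodedata


def heading_key(heading):
    """Build a crude match key: Unicode-normalise, collapse spaces,
    drop trailing punctuation and ignore case."""
    text = unicodedata.normalize("NFKC", heading)
    text = " ".join(text.split())
    return text.rstrip(" .,").casefold()


# Small textual differences can be normalised away...
same = heading_key("Mao, Zedong, 1893-1976.") == heading_key("Mao, Zedong, 1893-1976")

# ...but different forms of the same name cannot be matched as text.
different = heading_key("Mao, Tse-tung, 1893-1976") == heading_key("Mao, Zedong, 1893-1976")

# A stable identifier (hypothetical here) links all forms regardless of the string.
AUTHORITY_IDS = {
    heading_key(form): "example:person/1"
    for form in ["Mao, Tse-tung, 1893-1976", "Mao, Zedong, 1893-1976", "毛泽东, 1893-1976"]
}


def same_entity(a, b):
    """Compare two headings by identifier rather than by text."""
    id_a = AUTHORITY_IDS.get(heading_key(a))
    return id_a is not None and id_a == AUTHORITY_IDS.get(heading_key(b))


print(same, different, same_entity("Mao, Tse-tung, 1893-1976", "毛泽东, 1893-1976"))
```

Note that stripping a trailing full stop also damages headings that legitimately end in an initial, such as "Berners-Lee, C. M.", so even the normalisation step involves guesswork.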