Catalog enrichment: importing Dewey Decimal Classification from external sources (slides)


Important catalogs are usually accessed to copy-catalogue whole records, but "atomic" information can also be retrieved using unique keys such as the ISBN.
The Library of the Pontificia Università della S. Croce developed a tool that retrieves Dewey classification numbers and inserts them into bibliographic records, both in bulk mode and in single record mode, i.e. during cataloguing.
During the bulk process, Dewey classification was added to about 20,000 records, retrieved from up to 7 external sources: OCLC, the Library of Congress and some national libraries.
The single record mode was integrated into the Koha ILS to make it easier to assign Dewey classification during cataloguing.

Published in: Technology, Education

  1. Catalogue enrichment: importing Dewey Decimal Classification from external sources
     Stefano Bargioni, Pontificia Università della Santa Croce
     Oct 18, 2013, ADLUG 2013
  2. The project
     ● Improving the Dewey search path
       – with a minimal effort
       – while adding BNCF compliant subject headings to our catalog
     ● Koha 3 <http://koha-community.org> open source ILS
     ● Can be applied to other ILSs
  3. Version 1: The Batch Mode
     ● Add Dewey notations to the catalog
       – automatically
       – from selected sources
       – ensure quality and uniformity
  4. An atomic copy cataloguing
     ● copy cataloguing is usually related to the full record
     ● we only need to copy field 082 (MARC21) or 676 (Unimarc)
     ● ISBN unique identifier
     ● the policy issue
  5. Records to be modified
     ● without Dewey notation
     ● with ISBN
     ● limit: 008 language
       – SELECT biblionumber, ISBN FROM biblio WHERE ISBN_present AND dewey_absent AND language_008='...'
     Note: In Koha, the WHERE clause is based on the MySQL function ExtractValue, which works on field biblio.marcxml through XPath expressions.
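The three conditions on slide 5 can also be expressed outside SQL. The following is a minimal sketch in Python (the original programs were written in Perl), assuming each record is available as a MARCXML string in the standard MARC21 slim namespace. The function name `candidate_isbn` is hypothetical. It relies on two MARC21 conventions: the ISBN sits in 020$a, and the language code occupies positions 35-37 of control field 008.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace; assumed to be the one used in the stored records.
MARC_NS = {"m": "http://www.loc.gov/MARC21/slim"}


def candidate_isbn(marcxml, language):
    """Return the ISBN if the record qualifies for Dewey import, else None.

    Hypothetical helper mirroring the slide's pseudo-query:
    ISBN_present AND dewey_absent AND language_008 = language.
    """
    record = ET.fromstring(marcxml)

    # dewey_absent: skip records that already carry an 082 field.
    if record.find(".//m:datafield[@tag='082']", MARC_NS) is not None:
        return None

    # language_008: MARC21 stores the language code at positions 35-37 of 008.
    field_008 = record.findtext(
        ".//m:controlfield[@tag='008']", default="", namespaces=MARC_NS
    )
    if field_008[35:38] != language:
        return None

    # ISBN_present: take the first 020$a, if there is one.
    isbn = record.findtext(
        ".//m:datafield[@tag='020']/m:subfield[@code='a']",
        default="",
        namespaces=MARC_NS,
    )
    return isbn.strip() or None
```

Filtering in the database, as the slide describes, avoids transferring every record to the client. This variant is only meant to make the selection criteria explicit.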
  6. Dewey Sources (I)
     ● a choice based on copy cataloguing experience
     ● OCLC Classify
     ● some National Libraries
     ● API, Z39.50 or HTML access
  7. Dewey Sources (II): OCLC Classify
     ● Classify is a FRBR-based prototype designed to support the assignment of classification numbers and subject headings for books, DVDs, CDs, and other types of materials. This project applies principles of the FRBR model to aggregate bibliographic information above the manifestation level. Bibliographic records are grouped using the OCLC FRBR Work-Set algorithm to form a work-level summary of the class numbers and subject headings assigned to a work. You can retrieve a summary by ISBN, ISSN, UPC, OCLC number, author/title, or subject heading. The Classify database is accessible through a user interface and as a machine-to-machine service. The database provides access to more than 36 million WorldCat records that contain Dewey Decimal Classification (DDC) numbers,[...].
     ● Retrieved information is in XML format.
     ● http://www.oclc.org/research/activities/classify.html?urlm=159746
  8. Dewey Sources (III): National Libraries

      LC    Library of Congress                        (any)  MARC
      BNF   Bibliothèque nationale de France           (fre)  MARC
      DNB   Deutsche Nationalbibliothek                (ger)  HTML
      BNCF  Biblioteca Nazionale Centrale di Firenze   (ita)  HTML
      BNCR  Biblioteca Nazionale Centrale di Roma      (ita)  HTML
      BNB   British National Bibliography              (eng)  MARC
  9. The logic used in the programs
     ● open the connection to the bibliographical database
     ● obtain the ISBN from records without a Dewey number
     ● open the connection to the Dewey source, if Z39.50
     ● for each ISBN
       ● query the data source using the current ISBN
       ● if a Dewey number is available in the response
         ● if the Dewey number passes quality control
           ● update the bibliographical record
       ● wait to avoid overloading
     ● close the connection to the Dewey source, if Z39.50
     ● close the connection to the bibliographical database
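The loop on slide 9 can be sketched as follows. This is an illustrative Python outline, not the original Perl code. The names `harvest`, `lookup`, `passes_quality_check` and `update_record` are hypothetical, and the three callables stand in for the source query, the quality control and the database update. Opening and closing the database and Z39.50 connections is left to the caller.

```python
import time


def harvest(isbn_by_record, lookup, passes_quality_check, update_record,
            delay_seconds=3.0, sleep=time.sleep):
    """Run the batch loop once over {biblionumber: isbn}.

    lookup(isbn) returns a Dewey number or None,
    passes_quality_check(dewey) returns a bool,
    update_record(biblionumber, dewey) writes to the catalogue.
    All three are placeholders supplied by the caller.
    Returns the list of modified biblionumbers.
    """
    modified = []
    for biblionumber, isbn in isbn_by_record.items():
        # Query the data source using the current ISBN.
        dewey = lookup(isbn)
        # Update only if a number came back and it passes quality control.
        if dewey is not None and passes_quality_check(dewey):
            update_record(biblionumber, dewey)
            modified.append(biblionumber)
        # Wait after every query to avoid overloading the remote source.
        sleep(delay_seconds)
    return modified
```

Passing `sleep` as a parameter is a design choice of this sketch: it lets the loop be exercised in tests without real waiting.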
  10. Quality check
      ● Catalogs contain errors
      ● DDC has many editions
      ● Our old Dewey numbers start from edition 19
      ● Indicators
      ● A lot of discarded Dewey numbers...
      ● … but we moved from 40,000 to 60,000 records with Dewey number (+50%)
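Slide 10 does not spell out the actual checks. As a purely illustrative Python sketch, a candidate could be accepted only if it looks like a Dewey notation and its DDC edition (subfield $2 of field 082) is not older than a chosen minimum. The pattern, the removal of the "/" segmentation marks used in 082$a, and the threshold of 19 (borrowed from the slide's remark about the library's oldest numbers) are assumptions, not the project's real rules.

```python
import re

# Assumed shape of a Dewey number: three digits, optionally a decimal part.
DEWEY_PATTERN = re.compile(r"^\d{3}(\.\d+)?$")


def passes_quality_check(number, edition, min_edition=19):
    """Hypothetical check on an 082 $a value and its $2 edition statement."""
    # 082$a may contain "/" as a segmentation (prime) mark, e.g. "262/.9".
    cleaned = number.replace("/", "").strip()
    if not DEWEY_PATTERN.match(cleaned):
        return False
    # Read the leading digits of the edition statement, e.g. "22".
    match = re.match(r"\d+", edition.strip())
    return match is not None and int(match.group()) >= min_edition
```

A record with no edition statement is rejected here. Whether that is too strict depends on the source, which is one reason the numbers of discarded candidates can be high.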
  11. Delay while searching sources
      ● Continuous searching can suffocate remote servers
        – robots.txt
        – policies for crawlers
      ● Continuous indexing can overload your server
      ● Wait a few seconds between searches or groups of searches
        – this will slow the harvesting process
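One way to honour robots.txt and a minimum wait is sketched below with Python's standard `urllib.robotparser`. The function name `polite_delay`, the user agent string in the usage note and the 3-second default are assumptions for illustration. The slides only say "a few seconds".

```python
from urllib.robotparser import RobotFileParser


def polite_delay(robots_lines, user_agent, url, default_delay=3):
    """Return the seconds to wait before fetching url, or None if disallowed.

    robots_lines is the content of the site's robots.txt, one line per item.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)
    # Respect Disallow rules: do not query paths closed to crawlers.
    if not parser.can_fetch(user_agent, url):
        return None
    # Use the site's Crawl-delay when it asks for more than our default.
    declared = parser.crawl_delay(user_agent)
    return max(default_delay, declared or 0)
```

For example, with a robots.txt containing `User-agent: *`, `Crawl-delay: 5` and `Disallow: /private/`, a hypothetical agent named `dewey-harvester` would wait 5 seconds between searches and skip anything under `/private/`.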
  12. Statistics

      Source    Language  Records scanned  Records modified  ISBN not found  Dewey # not found  Dewey # discarded
      Classify  all       42387            10267             5321            6607               20059
      LC        all       31999            1252              21195           8562               1011
      BNF       all       30903            2253              21327           7268               55
      DNB       ger       4193             163               3867            163                0
      BNCF      ita       12017            4088              3643            3542               744
      BNCR      ita       7549             1515              3003            2978               53
      BNB       eng       6215             193               5449            55                 518
      Total                                19710

      Several works with same ISBN: 8240
      ISBN incorrect: 133
  13. Browsing Dewey Index
      Besides author, uniform titles and subject headings, our OPAC offers a path of semantic search based on the Dewey classification number
  14. Software
      ● Query programs were written in Perl, making use of the Koha API and the following libraries available on CPAN:
        – LWP for HTTP connections
        – ZOOM for Z39.50 connections
        – DBI for connections to the MySQL database
        – XML::XPath for XML data processing
        – WWW::Scraper for HTML data processing
        – MARC::Record for MARC records processing
  15. A scientific article
      ● published in JLIS.it at http://leo.cilea.it/index.php/jlis/article/view/8766
      ● JLIS.it, Italian Journal of Library and Information Science, is an academic journal of international scope, peer-reviewed and open access
      ● written with my cataloguers
      ● doesn't deal with the dynamic component
  16. Version 2.0 - Single Record Mode
      ● New record:
        – enter the ISBN
        – retrieve Dewey from important catalogs
        – choose and import the best one into the new record
      ● Or upgrade an old record adding or modifying its Dewey classification
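The single record workflow can be pictured as collecting one candidate per source and presenting them to the cataloguer. The Python sketch below is an assumption about how such a list might be built, not the Koha integration itself: `dewey_candidates` and the `sources` mapping are hypothetical, and ordering candidates by how many sources agree is only one possible presentation.

```python
from collections import Counter


def dewey_candidates(isbn, sources):
    """Collect Dewey numbers for one ISBN.

    sources maps a source name to a lookup callable returning a Dewey
    number or None. Returns (number, [source names]) pairs, with the
    number proposed by the most sources first.
    """
    found = {}
    for name, lookup in sources.items():
        number = lookup(isbn)
        if number:
            found[name] = number
    counts = Counter(found.values())
    # The cataloguer still makes the final choice; this only orders the list.
    return [
        (number, sorted(name for name, value in found.items() if value == number))
        for number, _ in counts.most_common()
    ]
```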
  17. [Slide with no transcribed text]
  18. Conclusions
      ● Increase of available bibliographic data on the net
      ● Unique identifiers
        – ISBN, ISSN, ...
        – VIAF Id, ISNI, ...
      ● Catalog enrichment
        – bibliographic records
        – authority records
      ● Expose rich linked data
        – with coded information like Dewey
        – with standard IDs like ISBN, ISNI, ...
  19. Thank you, Gracias, Grazie