This document summarizes Jackie Shieh's presentation on enabling descriptive data to be linked at the Smithsonian Libraries by implementing linked library data. It covers the Smithsonian's strategic plans and its initiatives to link data and make it accessible. It outlines projects to add linked identifiers to MARC records and transform data to BIBFRAME RDF, the challenges of managing authority data, and the impact on staff of learning new skills for linked data. Resources for transitioning from MARC to linked data are also provided.
Implementing Linked Library Data at the Smithsonian Libraries
1. NISO 2019 Webinar_JShieh
Enabling Descriptive Data to be Linked
at the Smithsonian Libraries
Implementing Linked Library Data
NISO Webinar, Nov 13, 2019
Jackie Shieh
Descriptive Data Management
Smithsonian Libraries
4. NISO 2019 Webinar_JShieh
About the Libraries
• 21 branch libraries
• 4 locations: Washington (DC), Maryland, New York, and Panama
• Collections: 2.2M volumes, 1.5M volumes with digital records
• the Dibner Library of the History of Science and Technology
• Staff: 105 FTE, 39 volunteers, 30 interns
• ILS: Horizon (SirsiDynix)
• managed by the Office of Chief Information Officer (OCIO)
https://www.si.edu/dashboard/national-collections
5. NISO 2019 Webinar_JShieh
Linked data projects
• FAST Headings Pilot Projects
• The manuscripts collection of the Dibner Library of the History of Science and Technology (1,620 bib records)
• Monographs published before 1923 (ca. 102K bib records)
• Participating staff
• Discovery Services Division (Disc Svc)
• Descriptive Data Management
• Resource Description
• Electronic Resources & Serials
• Freer & Sackler Gallery Library (FSG)
8. NISO 2019 Webinar_JShieh
MARC bibliographic data
Where should URIs/IRIs for RDF objects go?
• $u URL → Uniform Resource Identifier
$u http://id.loc.gov/authorities/names/n90710805.html
• $0 Authority record control number or standard number
$0 http://id.loc.gov/authorities/names/n90710805
• $1 Real World Object (RWO)
$1 http://www.wikidata.org/entity/Q36322
• $4 Relationship
$4 http://id.loc.gov/entities/relationships/composer
$4 http://rdaregistry.info/Elements/a/P50522
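As a rough illustration of the subfield placement listed above, the sketch below renders a field as a MarcEdit-style mnemonic line. The helper `format_field`, the heading text, and the choice of a 700 field are assumptions made for illustration. The three URIs are the slide's separate examples, placed together only to show where each kind of URI goes, not to claim that they describe the same entity.

```python
# Sketch only: a MARC field modelled as plain Python data (no MARC library assumed).
from typing import List, Tuple

Subfield = Tuple[str, str]  # (subfield code, value)


def format_field(tag: str, indicators: str, subfields: List[Subfield]) -> str:
    """Render one field as a mnemonic line: =TAG, two spaces, indicators, subfields."""
    # The mnemonic format shows a blank indicator as a backslash.
    ind = indicators.replace(" ", "\\")
    body = "".join(f"${code}{value}" for code, value in subfields)
    return f"={tag}  {ind}{body}"


field = format_field(
    "700",
    "1 ",
    [
        ("a", "Example, Name"),  # hypothetical heading, not from the slides
        ("0", "http://id.loc.gov/authorities/names/n90710805"),      # authority record
        ("1", "http://www.wikidata.org/entity/Q36322"),              # real world object
        ("4", "http://id.loc.gov/entities/relationships/composer"),  # relationship
    ],
)
print(field)
```

Keeping the subfields as an ordered list of pairs preserves subfield order and allows a repeated code, such as a second $4.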
9. NISO 2019 Webinar_JShieh
SL FAST headings projects
[Flowchart: MARC Bib (selective set) → FAST Headings? → if no, Add FAST Headings; if yes → RDF URI in $0/$1? → if no, Conform FAST Heading Identifier to RDF URI; if yes → RDF URIs in BIB]
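The decision flow on this slide can be sketched in a few lines. This is one reading of the flowchart, not the Libraries' actual scripts. It assumes that FAST identifiers arrive in the form `(OCoLC)fst` followed by a zero-padded number, and that the matching URI is `http://id.worldcat.org/fast/` plus the number without leading zeros. The function names are hypothetical.

```python
import re
from typing import List, Optional

FAST_BASE = "http://id.worldcat.org/fast/"  # assumed FAST URI base


def fast_id_to_uri(value: str) -> Optional[str]:
    """Turn '(OCoLC)fst01112076' into a FAST URI; return None if not recognised."""
    match = re.fullmatch(r"\(OCoLC\)fst0*(\d+)", value.strip())
    return FAST_BASE + match.group(1) if match else None


def next_step(fast_identifiers: List[str]) -> str:
    """Pick the flowchart branch for the $0 values found on a record's FAST headings."""
    if not fast_identifiers:
        return "add FAST headings"
    if all(v.startswith(FAST_BASE) for v in fast_identifiers):
        return "RDF URIs in bib"
    return "conform FAST identifiers to RDF URIs"


def conform(fast_identifiers: List[str]) -> List[str]:
    """Replace recognised identifiers with URIs; leave everything else untouched."""
    return [fast_id_to_uri(v) or v for v in fast_identifiers]
```

For example, `conform(["(OCoLC)fst01112076"])` yields a list holding the single URI `http://id.worldcat.org/fast/1112076`, after which `next_step` reports that the record already carries RDF URIs.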
10. NISO 2019 Webinar_JShieh
Bib data enhancements
• Added FAST identifiers (then converted to URIs)
• Updated MARC encoding standards and data validation
• RDA content standards
• Added URIs to AAPs and alternates (1xx, 6xx, 7xx)
• Replaced/added relationship URI ($4) to relator code
• Removed ISBD punctuation (trailing and between subfields)
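The punctuation cleanup in the last bullet might look something like the sketch below. It is a deliberate simplification: it assumes the ISBD marks to drop are a slash, colon, semicolon, equals sign, or comma at the end of a subfield value, and it leaves full stops alone because they may belong to initials or abbreviations. The name `strip_isbd` is hypothetical, and real cleanup rules are more detailed than this.

```python
import re

# Assumed set of ISBD marks that can trail a subfield value, with surrounding spaces.
_TRAILING_ISBD = re.compile(r"\s*[/:;=,]\s*$")


def strip_isbd(value: str) -> str:
    """Remove one trailing ISBD mark (and the spaces around it) from a subfield value."""
    return _TRAILING_ISBD.sub("", value)


# Hypothetical 245 subfields before and after cleanup.
before = [("a", "Short title :"), ("b", "a longer subtitle /"), ("c", "by A. Author.")]
after = [(code, strip_isbd(value)) for code, value in before]
print(after)
```

Each subfield is cleaned independently, so the same function covers punctuation between subfields and at the end of a field.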
16. NISO 2019 Webinar_JShieh
Authority data enhancements
• System challenges in loading multiple authority files
• Local practice: authority control at point of cataloging
• Single item authority workflow
• Batch process authority workflow
• Post-load reconciliation process (e.g. VIAF)
18. NISO 2019 Webinar_JShieh
Wikidata for identity management
• Linked data platform
• Crowdsourcing for data updates
• Non-Latin scripts for native entity
• New items (maybe too esoteric?)
• New vocabularies (e.g., non-library sources)
• APIs to other tools (OpenRefine, etc.)
• Local instance of Mediawiki/Wikibase
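One way to connect a Library of Congress authority identifier to a Wikidata item is to query on Wikidata property P244 (Library of Congress authority ID). The sketch below only builds the SPARQL text and the request URL for the public query service; it does not send anything. The function names and the loose identifier check are assumptions made for illustration.

```python
import re
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"  # public Wikidata query service


def lc_id_query(lc_id: str) -> str:
    """Build a SPARQL query for items whose P244 value equals lc_id."""
    # Loose sanity check (assumption): a short lowercase prefix followed by digits.
    if not re.fullmatch(r"[a-z]{1,3}\d+", lc_id):
        raise ValueError(f"unexpected LC identifier: {lc_id!r}")
    return f'SELECT ?item WHERE {{ ?item wdt:P244 "{lc_id}" . }}'


def request_url(lc_id: str) -> str:
    """Return a GET URL that asks the query service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": lc_id_query(lc_id), "format": "json"})


print(lc_id_query("n90710805"))
```

The resulting URL could be passed to cURL or to OpenRefine's fetch-by-URL feature, both of which appear in the toolkit described in this presentation.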
19. NISO 2019 Webinar_JShieh
For want of a nail
• Windows 10
• Cataloging
• Connexion
• Horizon
• Microsoft Office 365 (Excel + ASAP Utilities, PowerPoint screen recording)
• cURL (client URL)
• Open Source
• MarcEdit
• TextPad
• OpenRefine
• Draw.io
• Linux
• MARC::Record Perl package
• Yaz toolkit (Index Data)
• Perl and shell scripts
• cURL (client URL)
20. NISO 2019 Webinar_JShieh
Staff impacts: breaking old habits
• Cataloging silos
• Multiple cataloging sites with diverging policies
• Communication struggles
• Training inconsistencies
• Reinventing workflow
• Development for all staff levels
• Connecting new technologies to daily tasks
21. NISO 2019 Webinar_JShieh
Staff impacts: learning new skills
• OpenRefine
• Clustering
• Using “blank down” function to remove duplicates
• TextPad
• Free evaluation copy
• Has block column selection mode
• Sort and dedupe
• Bulk replacements (RegEx, and more)
• Regular Expression concept
• MarcEdit
• MARCJoin
• Validate MARC records
• Find/replace
• Manage/apply tasks
• Export tab delimited records
• Export/import for/from OpenRefine
• Remove blank lines
• RDA helper
• Build linked records
• Compile file into MARC
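The "blank down" and "sort and dedupe" techniques named above follow the same idea: sort the values, blank any cell that repeats the one above it, then drop the blanks. The sketch below imitates that behaviour in plain Python. It illustrates the concept and is not OpenRefine's or TextPad's own code; the function names are hypothetical.

```python
from typing import List


def blank_down(values: List[str]) -> List[str]:
    """Blank every cell that equals the cell directly above it (input should be sorted)."""
    result: List[str] = []
    previous = None
    for value in values:
        result.append("" if value == previous else value)
        previous = value
    return result


def sort_and_dedupe(values: List[str]) -> List[str]:
    """Sort, blank down, then keep only the non-blank cells."""
    return [v for v in blank_down(sorted(values)) if v != ""]


headings = ["Botany", "Astronomy", "Botany", "Zoology", "Astronomy"]
print(blank_down(sorted(headings)))
print(sort_and_dedupe(headings))
```

Note that cells that were blank to begin with are also dropped by `sort_and_dedupe`, which matches the usual practice of filtering on blanks after blanking down.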
22. NISO 2019 Webinar_JShieh
Some resources
• ALCTS Webinars: From MARC to BIBFRAME
ala.org/alcts/confevents/upcoming/webinar/MARCtoBIBFRAME
• PCC URIs in MARC Task Group
loc.gov/aba/pcc/bibframe/TaskGroups/URI-TaskGroup.html
• URI FAQs
loc.gov/aba/pcc/bibframe/TaskGroups/URI%20FAQs.pdf
• Formulating and Obtaining URIs
loc.gov/aba/pcc/bibframe/TaskGroups/formulate_obtain_URI_guide.pdf
• PCC URIs in MARC Pilot
loc.gov/aba/pcc/pilots/URIs-in-MARC-Pilot.html