Jennifer Bowen, University of Rochester
DC-2010 Conference
October 20, 2010, Pittsburgh, PA

Moving Library Metadata Toward Linked Data: Opportunities Provided by the eXtensible Catalog


Presented at DCMI-2010, a conference of the Dublin Core Metadata Initiative, in Pittsburgh, PA, on October 20, 2010


  1. Jennifer Bowen, University of Rochester. DC-2010 Conference, October 20, 2010, Pittsburgh, PA.
     Moving Library Metadata toward Linked Data: Opportunities Provided by the eXtensible Catalog
  2. About me…
     Currently:
     - Librarian
     - Technical services administrator
     - Software development team co-leader
     Formerly:
     - Cataloger (MARC)
     - Standards developer (RDA)
     Maybe someday… Linked Data expert?
  3. My Topics Today
     Is it feasible to turn legacy library MARC metadata into Linked Data in an automated environment, and how can eXtensible Catalog (XC) software play a role in that process?
     Image source: www.blog.kdl.org
  4. Semantic Web and Linked Data
     - Semantic Web: a set of technologies that allow computers to understand the meaning of information on the web
     - Linked Data: a mechanism for exposing, sharing and connecting data on the web, using identifiers and relationships
  5. Linked Data “Expectations of Behavior”
     - Use URIs as names for things.
     - Use HTTP URIs so that people can look up those names.
     - When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
     - Include links to other URIs so that they can discover more things.
     Tim Berners-Lee, “Design Issues”, 2006, http://www.w3.org/DesignIssues/LinkedData.html
  6. Linked Data: RDF triple
     - Subject: This presentation
     - Predicate: has creator
     - Object: Jennifer Bowen
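The triple on this slide can be modelled as a three-part record. The following is only a minimal sketch: the `Triple` class and `describe` helper are illustrative names that do not come from the presentation, and the values are the plain-text labels from the slide rather than URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


def describe(triple: Triple) -> str:
    """Render a triple as a readable arrow diagram."""
    return f"{triple.subject} --[{triple.predicate}]--> {triple.object}"


# Labels copied from the slide; real Linked Data would use URIs here.
slide_triple = Triple("This presentation", "has creator", "Jennifer Bowen")
print(describe(slide_triple))
```

The later slides replace each of these three labels with an identifier, which is what turns a statement like this into Linked Data.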
  7. A Reality Check
     Image: “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
  8. Teaching MARC metadata new tricks?
     Image source: http://www.englishcafe.com/node/2337
  9. Turning legacy data into Linked Data…
     How do we even get started?
  10. Getting Started
      To create Linked Data, we need:
      - Software to transform legacy data
      - Analysis: mapping of legacy metadata to Linked Data properties
  11. The software…
      eXtensible Catalog (XC) is open source, user-centered, next generation software for libraries. XC provides a discovery system and a set of tools for libraries to manage metadata and build applications.
  12. XC Software Components (diagram labels)
      - User Interface: XC User Interface
      - Metadata Processing: Metadata Services Toolkit
      - Connectivity tools: NCIP Toolkit, OAI Toolkit
      - Also shown: Website on Drupal CMS, Integrated Library System, Repository
  13. XC’s original metadata goals
      - Aggregate MARC and other metadata for use in new applications
      - Define a FRBR-based metadata schema to support XC’s user-interface functionality
      - Create a software application to process batches of metadata through a set of services
  14. Software development: a moving target!
  15. XC and Linked Data
      How can XC help move legacy library metadata closer to Linked Data?
      - NOT among XC’s original goals
      - However, XC software creates an opportunity to contribute to this effort and provides important “lessons learned”
  16. Converting MARC to Linked Data
      What XC software can do:
      - Convert MARC codes to vocabulary values
      - Remove extraneous data
      - Normalize inconsistencies
      - Map most MARC fields/subfields and parse to appropriate FRBR Group 1 entity records
  17. Converting MARC to Linked Data
      Problematic areas:
      - Some MARC fields/subfields are difficult to map to appropriate FRBR entities
      - Tracking relationships between FRBR entity records: how many relationships can we support with XC software?
  18. MARC to XC Schema Transformation
      - Parses MARCXML records into linked FRBR-based records
      - Maps MARCXML data elements to Linked Data-compatible elements in the XC Schema
  19. Managing Relationships
  20. Managing Relationships
  21. Issue: Managing Multiple Relationships
      MARC bibliographic records can refer to multiple FRBR entities of the same type (analytics that represent multiple works/expressions, e.g. tracks on a CD)
  22. Issue: Beyond FRBR Group 1 Entities
      MARC “Alternate Graphic Representation” (880 fields) can contain data that belong in records for Group 2 and Group 3 entities
      Contributor:
        700 1  ‡6 880-08 ‡a Vasil’ev, Maksim.
        880 1  ‡6 700-08 ‡a Васильев, Максим.
      Subject:
        600 10 ‡6 880-06 ‡a Putin, Vladimir Vladimirovich, ‡d 1952-
        880 10 ‡6 600-06 ‡a Путин, Владимир Владимирович, ‡d 1952-
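The ‡6 linkage shown on this slide can be followed mechanically: a regular field points at its 880 with `880-NN`, and the 880 points back with the regular field's tag and the same occurrence number. The sketch below pairs the two. It is not XC's implementation; the dictionary-based field layout and the function names are assumptions made for illustration, and only subfield ‡a is compared.

```python
def link_key(field):
    """Return (regular_tag, occurrence) for a field with subfield 6, else None."""
    sub6 = next((value for code, value in field["subfields"] if code == "6"), None)
    if sub6 is None:
        return None
    # Subfield 6 looks like "880-08"; a script code may follow after "/".
    linked_tag, _, occurrence = sub6.split("/")[0].partition("-")
    if occurrence == "00":  # "00" means there is no linked field
        return None
    if field["tag"] == "880":
        return (linked_tag, occurrence)
    return (field["tag"], occurrence)


def subfield_a(field):
    """Return the first subfield a value, or an empty string."""
    return next((value for code, value in field["subfields"] if code == "a"), "")


def pair_alternate_scripts(fields):
    """Pair each regular field with the 880 field that carries its original script."""
    regular, alternate = {}, {}
    for field in fields:
        key = link_key(field)
        if key is not None:
            (alternate if field["tag"] == "880" else regular)[key] = field
    return [
        (key[0], subfield_a(regular[key]), subfield_a(alternate[key]))
        for key in regular
        if key in alternate
    ]


# Field data copied from the slide (hypothetical in-memory layout).
FIELDS = [
    {"tag": "700", "subfields": [("6", "880-08"), ("a", "Vasil’ev, Maksim.")]},
    {"tag": "880", "subfields": [("6", "700-08"), ("a", "Васильев, Максим.")]},
    {"tag": "600", "subfields": [("6", "880-06"), ("a", "Putin, Vladimir Vladimirovich,")]},
    {"tag": "880", "subfields": [("6", "600-06"), ("a", "Путин, Владимир Владимирович,")]},
]

for tag, romanized, original_script in pair_alternate_scripts(FIELDS):
    print(tag, romanized, "<->", original_script)
```

Once paired, the tag of the regular field (700 for a contributor, 600 for a subject) indicates which non-Group 1 entity record the alternate-script form would belong to.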
  23. If we were to parse this 880 data correctly (diagram labels):
      - Alternative script of name from 880
      - Alternative script of subject from 880
  24. Issue: Related Group 1 Entities
      Language attribute for a related expression
        041 1  ‡a eng ‡h ita
        100 0  ‡a Dante Alighieri, ‡d 1265-1321.
        240 10 ‡a Divina commedia. ‡l English
        245 14 ‡a The divine comedy / ‡c Dante ; a new verse translation by C.H. Sisson.
        500    ‡a Translation of: Divina commedia.
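The 041 field on this slide carries two different facts: ‡a is the language of the text in hand, while ‡h is the language of the original, which is an attribute of a different, related expression. A rough sketch of that split follows; the record shape returned by `expression_stubs` is hypothetical and is not the XC Schema.

```python
def languages_from_041(subfields):
    """Separate the language of the text (a) from the original language (h)."""
    languages = {"text": [], "original": []}
    for code, value in subfields:
        if code == "a":
            languages["text"].append(value)
        elif code == "h":
            languages["original"].append(value)
    return languages


def expression_stubs(subfields):
    """Build one stub per expression implied by the 041 field (hypothetical shape)."""
    languages = languages_from_041(subfields)
    stubs = [
        {"entity": "expression", "language": code, "role": "described"}
        for code in languages["text"]
    ]
    stubs += [
        {"entity": "expression", "language": code, "role": "original"}
        for code in languages["original"]
    ]
    return stubs


# 041 1  ‡a eng ‡h ita, as on the slide
for stub in expression_stubs([("a", "eng"), ("h", "ita")]):
    print(stub)
```

The second stub is the point of the slide: parsing ‡h properly means creating and linking a record for an expression that the MARC record only mentions.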
  25. If we were to parse 041 ‡h data… (diagram labels)
      - Alternative script of name from 880
      - Original language from 041 ‡h
      - Alternative script of subject from 880
  26. Managing Relationships Between Entities (diagram labels)
      - Original language from 041 ‡h
      - Alternative script of subject from 880
      - Alternative script of name from 880
  27. What we are learning from XC
      - new records
      - changed records
      - deleted records
      - changed relationships
      Maintaining links between separate FRBR entity records in a production environment monopolizes system resources and may not be scalable.
  28. But wait…
      What does this emphasis on FRBR have to do with Linked Data?
      If we can map a MARC data element to a FRBR entity, we can probably convert it to Linked Data.
      Diagram: FRBR Group 1 Entities
  29. But do we have to?
      - Do we have to be able to map MARC elements to a FRBR entity in order to create Linked Data?
      - Would managing RDF triples be more scalable than managing FRBR-based records and the relationships between those records?
  30. Best Practices for Linked Data
      - Unique identifiers for XC metadata records
      - Data elements from registered schemas
      - Registered vocabularies
      By attempting to follow best practices in XC for Linked Data, we hope to facilitate eventual output of XC metadata in RDF.
  31. RDF Triple
      - Subject: This resource
      - Predicate: has subject
      - Object: Poets, American
      URIs for each?
  32. RDF Triple – Record identifiers
      - Subject: This resource, identified as oai:mst.rochester.edu:MST/MARCToXCTransformation/10081
      - Predicate: has subject
      - Object: Poets, American
  33. Identifiers for XC Schema records
      <?xml version="1.0" encoding="UTF-8"?>
      <xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:rdvocab="http://rdvocab.info/Elements"
               xmlns:dcterms="http://purl.org/dc/terms/"
               xmlns:rdarole="http://rdvocab.info/roles">
        <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
          <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
          <dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
          <dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject>
          <rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author>
          <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
          <xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
          <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
        </xc:entity>
      </xc:frbr>
      A persistent, globally unique identifier for each XC Schema record
  34. RDF Triple – Registered Data Elements
      - Subject: This resource, identified as oai:mst.rochester.edu:MST/MARCToXCTransformation/10081
      - Predicate: has subject, identified as http://www.extensiblecatalog.info/Elements/subject
      - Object: Poets, American
  35. DCMI
  36. RDA
  37. XC
  38. XC Schema “work” record: data elements
      <?xml version="1.0" encoding="UTF-8"?>
      <xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:rdvocab="http://rdvocab.info/Elements"
               xmlns:dcterms="http://purl.org/dc/terms/"
               xmlns:rdarole="http://rdvocab.info/roles">
        <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
          <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
          <dcterms:subject xsi:type="dcterms:DDC">811/.52</dcterms:subject>
          <dcterms:subject xsi:type="dcterms:DDC">B</dcterms:subject>
          <rdarole:author>Sawyer-Lauçanno, Christopher, 1951-</rdarole:author>
          <rdvocab:titleOfTheWork>E.E. Cummings :</rdvocab:titleOfTheWork>
          <xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
          <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
        </xc:entity>
      </xc:frbr>
      Data elements from registered namespaces for DC terms, RDA roles and vocab, and XC
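Because every element in the “work” record lives in a registered namespace, a record like this can be read as a set of triples: the entity `id` is the subject, each child element names the predicate, and the element text is the object. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the record. Joining namespace and element name with "/" to form the predicate URI is an assumption based on the predicate shown on the slides, not a documented XC rule, and `element_triples` is an illustrative name.

```python
import xml.etree.ElementTree as ET

XC_NS = "http://www.extensiblecatalog.info/Elements"

# Shortened copy of the record from the slide (XML declaration omitted).
RECORD = """
<xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
    <dcterms:subject xsi:type="dcterms:LCC">PS3505.U334</dcterms:subject>
    <xc:subject xsi:type="dcterms:LCSH">Cummings, E. E. (Edward Estlin), 1894-1962.</xc:subject>
  </xc:entity>
</xc:frbr>
"""


def element_triples(xml_text):
    """Return (subject, predicate, object) tuples for each element of each entity."""
    root = ET.fromstring(xml_text)
    triples = []
    for entity in root.iter(f"{{{XC_NS}}}entity"):
        subject = entity.get("id")
        for child in entity:
            # ElementTree tags look like "{namespace}localname".
            namespace, _, local_name = child.tag[1:].partition("}")
            # Assumption: predicate URI = namespace + "/" + element name.
            predicate = namespace.rstrip("/") + "/" + local_name
            triples.append((subject, predicate, (child.text or "").strip()))
    return triples


for triple in element_triples(RECORD):
    print(triple)
```

The objects here are still literal strings; replacing them with vocabulary URIs is the subject of the registered-vocabularies slides.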
  39. RDF Triple – Registered Vocabularies
      - Subject: This resource, identified as oai:mst.rochester.edu:MST/MARCToXCTransformation/10081
      - Predicate: has subject, identified as http://www.extensiblecatalog.info/Elements/subject
      - Object: Poets, American, identified as http://id.loc.gov/authorities/sh85103735#concept
  40. XC Work record with embedded URI for LCSH “Poets, American”
      <?xml version="1.0" encoding="UTF-8"?>
      <xc:frbr xmlns:xc="http://www.extensiblecatalog.info/Elements" …
               xmlns:subjid="id.loc.gov/authorities">
        <xc:entity type="work" id="oai:mst.rochester.edu:MST/MARCToXCTransformation/10081">
          …
          <xc:subject xsi:type="dcterms:LCSH">Poets, American-20th century-Biography.</xc:subject>
          <xc:subject xsi:type="dcterms:LCSH" subjid="sh85103735#concept">Poets, American</xc:subject>
          <xc:temporal>20th century</xc:temporal>
          <xc:type>Biography</xc:type>
        </xc:entity>
  41. RDF Triple
      - Subject: This resource, identified as oai:mst.rochester.edu:MST/MARCToXCTransformation/10081
      - Predicate: has subject, identified as http://www.extensiblecatalog.info/Elements/subject
      - Object: Poets, American, identified as http://id.loc.gov/authorities/sh85103735#concept
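With an identifier for each of the three parts, the statement on this slide can be written out in a standard RDF syntax. The sketch below formats it as one N-Triples line. The presentation does not say which serialization XC would use, so N-Triples is an assumption chosen for its simplicity, the `ntriple` helper is illustrative, and its literal escaping covers only backslashes and double quotes.

```python
def ntriple(subject, predicate, obj, obj_is_uri=True):
    """Format one statement as an N-Triples line."""
    if obj_is_uri:
        object_term = f"<{obj}>"
    else:
        # Minimal escaping for a plain string literal.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        object_term = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {object_term} ."


RECORD_ID = "oai:mst.rochester.edu:MST/MARCToXCTransformation/10081"
HAS_SUBJECT = "http://www.extensiblecatalog.info/Elements/subject"
POETS_AMERICAN = "http://id.loc.gov/authorities/sh85103735#concept"

# All three parts identified, as on the slide.
print(ntriple(RECORD_ID, HAS_SUBJECT, POETS_AMERICAN))

# The earlier form of the statement, where the object is still a text label.
print(ntriple(RECORD_ID, HAS_SUBJECT, "Poets, American", obj_is_uri=False))
```

Comparing the two printed lines shows the difference the registered vocabulary makes: the first object can be looked up and linked to, the second is only a string.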
  42. Experimenting with Linked Data
      - Within a MARC or MARCXML environment?
        - Possible to give each record a URI
        - MARC elements themselves don’t have URIs
        - How to embed multiple URIs for registered vocabularies in MARC?
      - XC enables experimentation outside of a MARC environment with data that originated as MARC
  43. Making Linked Data a Priority for XC
      - Balancing goals
      - Time/funding constraints
      - What’s our use case?
        - Output of Linked Data from XC vs.
        - Using Linked Data within XC?
  44. XC Linked Data Accomplishments
      XC has set the stage for Linked Data by:
      - Providing a platform for creating Linked Data using XC software
      - Ensuring that XC Schema records can be converted to RDF triples as easily as possible
      - Enabling others to build upon what we have accomplished so far
  45. Next Steps
      - Monitor RDA implementations
      - Develop XC authority control service
      - Enable RDF output of XC Schema metadata
      - Encourage libraries to use XC software and contribute to the XC user community
      - Seek funding for additional software development
  46. Thank you! Questions?
      www.eXtensiblecatalog.org
      Jennifer Bowen, jbowen@library.rochester.edu
