Applying Linked Open Data to a digital library:
best practices and lessons learnt
Gustavo Candela Romero
gustavo.candela@cervantesvirtual.com
Index
1. Introduction
2. Keys to success
3. Future work
4. References
1. Introduction
● In February 2015, the first release of data.cervantesvirtual.com, a linked open
data website based on RDA and FRBR, was launched. The project aims to promote
data sharing, interoperability, data re-use and the dissemination
of best practices.
● Having started from scratch, the project is no longer a demo but a rich source of
lessons learnt that can stimulate innovative types of projects.
1. Introduction
Step by step
[Timeline figure. Data model: MARC21, FRBR (2012), RDA (2015). Storage: relational database, RDF repository. Milestones: Stanford Prize for Innovation in Research Libraries, TPDL 2015, IODC 2016, SWJ 2017, DATeCH 2017, "?"]
1. Introduction
Why did we do it?
● Open our traditional catalog to the world, for both humans and computers.
● Provide a public interface for querying the dataset according to international
recommendations (SPARQL).
● Establish relationships and links with broadly used data sets such as VIAF
and Wikidata.
● Improve the catalog and promote reuse.
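
As an illustration of the public query interface, the following is a minimal sketch of how a client could build a request URL under the SPARQL 1.1 Protocol, which carries the query text in a `query` parameter. The endpoint address is a placeholder (the slides do not give the real one), and the query is deliberately generic so that it assumes nothing about the dataset's vocabulary.

```python
from urllib.parse import urlencode

# Assumption: placeholder endpoint; the slides do not state the real address.
ENDPOINT = "http://example.org/sparql"

# Generic query that needs no knowledge of the dataset's vocabulary.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_query_url(endpoint, query):
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL can be fetched with any HTTP client; the response format depends on the endpoint and on the `Accept` header sent.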
2. Keys to success
Preprocessing of sources
● Some fields are required (for example, field 245, which contains the title), while others are
optional or user-defined, so the homogeneity of the data across libraries cannot be guaranteed.
Furthermore, the content of a field can be expressed with different conventions or in different
languages, and it may contain typos.
● These features pose a challenge when MARC21 records must be shared between libraries.
2. Keys to success
Preprocessing of sources
● Textual errors. Many titles were found to contain spurious characters or unbalanced parentheses.
● Mark-up errors. MARC tags are entered manually, so a number of mistakes can be
expected.
● Unspecified roles
● No unique identifiers for creators
● Multiple publication statements
● Variable encodings. Some information is encoded using different fields at different institutions,
for example the MARC control number and language subfields.
❏ latspa Latin + Spanish
❏ italat Italian + Latin
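
The two language values above look like three-letter language codes written back to back. A minimal sketch of a normalisation step, assuming that every code is exactly three characters long, could split them as follows. The function names and the small name table are hypothetical and cover only the codes shown on this slide.

```python
# Assumption: only the codes that appear in the examples above are mapped.
LANGUAGE_NAMES = {"lat": "Latin", "spa": "Spanish", "ita": "Italian"}


def split_language_codes(value):
    """Split a run of concatenated three-letter codes, e.g. 'latspa'."""
    value = value.strip().lower()
    if len(value) % 3 != 0:
        raise ValueError(f"not a sequence of three-letter codes: {value!r}")
    return [value[i:i + 3] for i in range(0, len(value), 3)]


def language_names(value):
    """Translate each code to a name, keeping unknown codes unchanged."""
    return [LANGUAGE_NAMES.get(code, code) for code in split_language_codes(value)]


print(language_names("latspa"))
print(language_names("italat"))
```

Values whose length is not a multiple of three are rejected rather than guessed at, so that they can be reviewed manually.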
2. Keys to success
Preprocessing of sources
● However, further refinements are needed for the recognition and extraction of
implicit relationships expressed in natural language, such as geographic
locations and dates.
❏ En Sevilla, : en la imprenta de Joseph Padrino ..., [entre 1748 y 1775]
❏ Sevilla, : por Thomas Lopez de Haro ..., , 1679
❏ [Sevilla : s.n., 1760]
❏ Impresso en Sevilla : por Juan Francisco de Blas..., 1693
❏ Hispali :, Antonius Martinez, Alfonsus de Portu et Bartholomaeus
Segura, 1477
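
To illustrate the kind of extraction involved, here is a rough sketch that pulls a place name and a year range out of the five statements above using regular expressions. It is not the project's actual method: the prefix list and the place-variant table are assumptions that cover only these examples, and real catalogue data would need far broader rules.

```python
import re

# Assumption: hypothetical table of place-name variants (Latin form -> modern name).
PLACE_VARIANTS = {"hispali": "Sevilla"}
# Assumption: only the introductory phrases seen in the examples are handled.
PREFIX = re.compile(r"^(?:impresso\s+en|en)\s+", re.IGNORECASE)
YEAR = re.compile(r"\b(1[0-9]{3})\b")


def parse_statement(statement):
    """Return the place and the first/last year found in a publication statement."""
    head, _, _ = statement.partition(":")   # the place precedes the first colon
    place = head.strip(" ,[]")              # drop brackets and stray punctuation
    place = PREFIX.sub("", place)           # drop "En", "Impresso en"
    place = PLACE_VARIANTS.get(place.lower(), place)
    years = [int(y) for y in YEAR.findall(statement)]
    return {
        "place": place,
        "from": min(years) if years else None,
        "to": max(years) if years else None,
    }


for s in [
    "En Sevilla, : en la imprenta de Joseph Padrino ..., [entre 1748 y 1775]",
    "Hispali :, Antonius Martinez, Alfonsus de Portu et Bartholomaeus Segura, 1477",
]:
    print(parse_statement(s))
```

The year pattern only matches four-digit years starting with 1, which is enough for these examples but would miss abbreviated or uncertain dates.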
2. Keys to success
Reuse of vocabularies (RDA and FRBR)
Work: Author, Subject, Form of work
Expression: Language
Manifestation: Dates (publication, distribution), Place of production
www.rdaregistry.info/
2. Keys to success
Identify access points
Entity URI
Person http://data.cervantesvirtual.com/person/{id}
CorporateBody http://data.cervantesvirtual.com/corporatebody/{id}
Family http://data.cervantesvirtual.com/family/{id}
Work http://data.cervantesvirtual.com/work/{id}
Expression http://data.cervantesvirtual.com/expression/{id}
Manifestation http://data.cervantesvirtual.com/manifestation/{id}
Country http://data.cervantesvirtual.com/country/{id}
Date http://data.cervantesvirtual.com/date/{id}
Language http://data.cervantesvirtual.com/language/{id}
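
The URI patterns in this table can be captured in a small helper. The sketch below only mirrors the patterns listed above; the function name and the sample identifier are hypothetical.

```python
BASE = "http://data.cervantesvirtual.com"

# Path segment for each entity type, taken from the table above.
ENTITY_PATHS = {
    "Person": "person",
    "CorporateBody": "corporatebody",
    "Family": "family",
    "Work": "work",
    "Expression": "expression",
    "Manifestation": "manifestation",
    "Country": "country",
    "Date": "date",
    "Language": "language",
}


def entity_uri(entity, identifier):
    """Build the access-point URI for an entity type and an identifier."""
    if entity not in ENTITY_PATHS:
        raise KeyError(f"unknown entity type: {entity}")
    return f"{BASE}/{ENTITY_PATHS[entity]}/{identifier}"


# The identifier 40 is an arbitrary example value.
print(entity_uri("Person", 40))
```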
2. Keys to success
Metadata enrichment
2. Keys to success
Increase visibility
● Social Media (Facebook and Twitter)
● Conferences
● SEO techniques
● Technology blog
● Github profile
● Encouraging students at the university
3. Future work
BVMC
Repository
Keep exploring and innovating
I Still Haven't Found What I'm Looking For...
3. Future work
Wikidata properties
● https://www.wikidata.org/wiki/Property:P2799 BVMC Person id (5500 links)
● https://www.wikidata.org/wiki/Property:P3976 BVMC Work id (100 links)
Some examples of possible additional properties:
● BVMC Journal id
● BVMC Location id
● BVMC Date id
● BVMC Manuscript id
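
As a sketch of how these links might be inspected, the helper below builds a SPARQL query that lists Wikidata items carrying one of the two existing properties. The `wdt:` prefix is Wikidata's standard namespace for direct property values; the function and variable names are hypothetical, and the query is only constructed here, not sent.

```python
# Property identifiers taken from the list above.
WIKIDATA_PROPERTIES = {"person": "P2799", "work": "P3976"}


def links_query(entity, limit=10):
    """Return a SPARQL query listing items that have the given BVMC id."""
    prop = WIKIDATA_PROPERTIES[entity]
    return (
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "SELECT ?item ?bvmcId WHERE {\n"
        f"  ?item wdt:{prop} ?bvmcId .\n"
        "}\n"
        f"LIMIT {limit}"
    )


print(links_query("person"))
```

New properties such as a journal or location id would only need another entry in the lookup table once they exist in Wikidata.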
4. References
● http://data.cervantesvirtual.com
● http://data.cervantesvirtual.com/geosearch
● SPARQL endpoint
● Migration of a library catalogue into RDA linked open data, Semantic Web
Journal, 2017 online
● Transformation of a Library Catalogue into RDA Linked Open Data. TPDL
2015
● http://www.rdaregistry.info/
● https://www.ifla.org/publications/functional-requirements-for-bibliographic-records