Technical Processes Dept.
Ricardo Santos Muñoz
Coordination and Standardization Service
DC 2013 – Lisbon, 5th September
Biblioteca Nacional de España and Linked Open Data. A view from the library side. Ricardo Santos Muñoz


Presented at the International Conference on Dublin Core 2013, held in Lisbon from 2 to 6 September, in which the Biblioteca Nacional de España (BNE) took part.

  • Spain has four official languages, which has always hindered cooperation between libraries. Linked Data (LD) was seen as an opportunity to overcome this. Although multilingualism is not currently the cornerstone of this project, it remains a goal to be reached. The project eventually evolved toward a broader goal: contributing the library's data to the emerging Linked Data cloud.
  • In the first phase, only sources 1 and 2 (bibliographic and authority records) were considered.
  • Linguistic neutrality is core to an international organization such as IFLA.
  • Migration was achieved
  • Not every library uses all of the available MARC fields, and not every library uses the same fields in the same way. Pre-processing lets librarians look at their own data in order to select the most appropriate class or property for each piece of data.
  • We looked into the cataloguing and worked out which content would fit best into the FRBR structure. We concluded that monographs (modern and old), printed music and sound recordings were the most suitable candidates.
  • ABES has begun work on publishing library data using ISBD classes and properties, targeting library professionals. The initiative drew positive remarks in academic and professional circles because of our approach of fitting traditional MARC records into a complex, built-up FRBR structure. Reviews noted that this wasn’t a “‘straightforward’ dump of bibliographic records”; that is, the data were not just mapped, work that can be done relatively easily, but in some way organized and linked. For the sake of this discussion I have highlighted this phrase from the OpenBiblio blog. The nomenclature of the IFLA vocabularies also made the mapping process slower.
  • [ambiguity] Better coding… This is a kind of feedback.
  • This is, by contrast, a conversion from library data to a vocabulary that is not library-specific but is very popular. In this case, nevertheless, there is hardly any loss of data. The benefits are immediate. This can be a point for the “popular and rich” camp.
  • This is a case of reusing library data beyond the library environment; in this case, in education. Again, once explained to the computer experts, MARC21 proved to be the most precise tool for extracting data and mapping it to other vocabularies. BNE Escolar is a lighter and simpler Linked Data-related project.
  • This is a personal conclusion. We have to ask ourselves whether we are jumping from one silo to another. Experience: a more professional, academic approach in “datos.bne.es” and a lighter one in BNE Escolar.

    1. Biblioteca Nacional de España and Linked Open Data. A view from the library side. Ricardo Santos Muñoz, Technical Processes Dept., Coordination and Standardization Service. DC 2013 – Lisbon, 5th September
    2. Motivation
       • The primary motivation was to find a tool for achieving multilingualism, a major topic in the Spanish library environment.
       • Expose bibliographic and authority data as Linked Open Data, in order to:
         – Enhance and enrich the user experience of navigating the data, using internal and external sources.
         – Make rich library data open and reusable.
    3. Project features
       • Carried out jointly by BNE and the Ontology Engineering Group (Universidad Politécnica de Madrid).
       • The work began in early 2011 as a small proof of concept: works by and related to Cervantes.
       • Eventually it grew to cover a larger part of the catalogue. It was presented in December 2011 as datos.bne.es.
    4. Potential sources of data
       • More than 4 million bibliographic records
       • More than half a million authority records
       • More than 100,000 digital objects
       • Other sources: virtual exhibitions, archive
    5. Modelling and vocabularies
       • Proof of concept of exhaustively using IFLA models and vocabularies: FRBR, FRAD, FRSAD and ISBD ontologies.
         – Widely agreed upon by the library community, potentially with no loss of data.
         – Provide a rich framework for exposing, sorting and connecting library data.
         – Sustainability: backed by an international association.
         – Enable data to be related to theoretical models from other similar disciplines.
    6. Advantages of FRBR and FRAD as models
       • Order: grouping of related data, which leads to powerful and neat visualization schemes
       • Relationship: discovery of related resources
       • Abstract models can be mapped to other models: CIDOC-CRM…
    7. IFLA vocabularies
       • All of them are available in RDF in the OMR (Open Metadata Registry) with the status "published".
       • Elements are identified by numeric designations: http://iflastandards.info/ns/isbd/elements/P1004, that is, isbd:P1004 → “has title proper”
         – Goal: linguistic neutrality
         – The “human side” (the labels) has been translated into many languages
       • Additional elements from various vocabularies were added: DC, RDA, MADS/RDF,
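The idea of a language-neutral identifier with translatable labels can be illustrated with a minimal Python sketch. The namespace and the English label come from the slide; the `LABELS` table, the function names and the Spanish label are assumptions made for illustration and are not quoted from the registry.

```python
# Namespace as shown on the slide.
ISBD_NS = "http://iflastandards.info/ns/isbd/elements/"

# Hypothetical label store. The English label is from the slide; the Spanish
# one is an illustrative translation, not taken from the registry.
LABELS = {
    "P1004": {"en": "has title proper", "es": "tiene título propiamente dicho"},
}


def expand(curie: str) -> str:
    """Expand a compact name such as 'isbd:P1004' into a full URI."""
    prefix, local = curie.split(":", 1)
    if prefix.lower() != "isbd":
        raise ValueError(f"unknown prefix: {prefix}")
    return ISBD_NS + local


def label(curie: str, lang: str = "en") -> str:
    """Return the label in the requested language, falling back to English."""
    local = curie.split(":", 1)[1]
    by_lang = LABELS[local]
    return by_lang.get(lang, by_lang["en"])
```

The identifier stays the same whatever the interface language; only the label lookup changes, which is the trade-off the OpenBiblio review points out for anyone reading the raw data.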
    8. IFLA vocabularies
    9. Building FRBR/FRAD and Linked Data from MARC21
       • Authority records: frbr:work, frbr:person, frbr:corporate body
       • Bib records: frbr:expression, frbr:manifestation
       • Classification: 100 $a → Person; 100 $a $t → Work
       • Relation: 100 $a (Person), author of, $t (Work)
       • Annotation: 260 $a → Publication place
       • http://bne.linkeddata.es/mapping-marc21/index.html
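As a rough illustration of the classification and relation rules on this slide, here is a minimal Python sketch. It assumes a simplified record shape (`{tag: {subfield: value}}`) and hypothetical function names; real MARC21 records allow repeated fields and subfields, and the actual rules are the ones documented at the mapping URL above.

```python
from typing import Dict, List, Tuple

# Simplified record model (an assumption): {tag: {subfield_code: value}}.
Record = Dict[str, Dict[str, str]]
Triple = Tuple[str, str, str]


def classify_authority(record: Record) -> str:
    """Return the FRBR class suggested by field 100 of an authority record."""
    heading = record.get("100", {})
    if "a" in heading and "t" in heading:
        return "frbr:work"      # name + title heading
    if "a" in heading:
        return "frbr:person"    # name-only heading
    return "unclassified"


def author_relation(record: Record) -> List[Triple]:
    """For a name/title heading, relate the person ($a) to the work ($t)."""
    if classify_authority(record) != "frbr:work":
        return []
    heading = record["100"]
    return [(heading["a"], "author of", heading["t"])]
```

A heading with only `$a` is treated as a person, while the same field with `$a` and `$t` yields a work plus an "author of" relation between the two.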
    10. Conversion process
       • The transformation was carried out with Marimba, a tool designed by the OEG.
       • The tool supports mapping through user-friendly spreadsheets.
       • It allows librarians to manually map pieces of information from MARC records (field and subfield codes) to the ontologies of their choice.
       • Marimba pre-processes the MARC records and presents librarians with the metadata elements actually used in their catalogue; the librarians then assign the correspondences.
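The two steps described here (survey which fields the catalogue really uses, then apply a librarian-edited mapping sheet) can be sketched in Python as follows. This is not Marimba's code or file format: the CSV layout, the record shape, the function names and the `ex:placeOfPublication` property are all assumptions for illustration, and pairing 245 $a with isbd:P1004 is an example choice rather than the project's documented mapping.

```python
import csv
import io
from collections import Counter

# Hypothetical mapping sheet, as a librarian might fill it in.
MAPPING_CSV = """field,subfield,property
245,a,isbd:P1004
260,a,ex:placeOfPublication
"""


def field_usage(records):
    """Count how often each (field, subfield) pair occurs in the records."""
    usage = Counter()
    for record in records:
        for tag, subfields in record["fields"].items():
            for code in subfields:
                usage[(tag, code)] += 1
    return usage


def load_mapping(text):
    """Read the sheet into {(field, subfield): property}."""
    reader = csv.DictReader(io.StringIO(text))
    return {(row["field"], row["subfield"]): row["property"] for row in reader}


def convert(record, mapping):
    """Emit (subject, property, value) triples for every mapped subfield."""
    triples = []
    for tag, subfields in record["fields"].items():
        for code, value in subfields.items():
            prop = mapping.get((tag, code))
            if prop is not None:
                triples.append((record["id"], prop, value))
    return triples
```

Subfields with no row in the sheet are simply skipped, so the usage count doubles as a checklist of what still has to be mapped.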
    11. Results
       • The following have been migrated:
         – Authority file (persons, corporate bodies)
         – Modern Monographs: 1,947,332
         – Ancient Monographs: 107,803
         – Notated music: 162,519
         – Sound recordings: 172,484
       • Building of datos.bne.es:
         – Data dumps (from the website and Datahub)
         – SPARQL endpoint
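For readers unfamiliar with SPARQL endpoints, the sketch below shows how a query is typically sent over HTTP GET, in the `query` parameter defined by the SPARQL Protocol. The endpoint address is a placeholder, not the real address of the datos.bne.es service, and the helper names are hypothetical; no network request is made.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint (an assumption); substitute the published address.
ENDPOINT = "http://example.org/sparql"

# Counts every triple in the default graph.
COUNT_TRIPLES = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"


def query_url(endpoint: str, sparql: str) -> str:
    """Build a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": sparql})


def extract_query(url: str) -> str:
    """Recover the SPARQL text from a URL built by query_url."""
    return parse_qs(urlsplit(url).query)["query"][0]
```

The resulting URL can be fetched with any HTTP client; the result format depends on the endpoint and on the `Accept` header sent.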
    12. Results (2)
       • Data released under a CC0 license.
       • More than 58 million triples.
       • owl:sameAs links (also downloadable as a single data dump) to VIAF, GND, LIBRIS, SUDOC and DBpedia.
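An owl:sameAs link is a single triple stating that two URIs identify the same entity. The Python sketch below serializes such links as N-Triples lines, the kind of content a link dump might contain. The example.org URIs used with it are placeholders, not real BNE or VIAF identifiers, and the function names are hypothetical.

```python
# The owl:sameAs property URI from the OWL vocabulary.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_ntriple(local_uri: str, external_uri: str) -> str:
    """Serialize one owl:sameAs statement as an N-Triples line."""
    return f"<{local_uri}> <{OWL_SAME_AS}> <{external_uri}> ."


def link_dump(links) -> str:
    """Join (local, external) URI pairs into a newline-terminated dump."""
    return "".join(same_as_ntriple(s, o) + "\n" for s, o in links)
```

Keeping the links in a separate dump lets consumers load the external alignments independently of the main dataset.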
    13. Reception
       • The data, and the construction and mapping of the FRBR structure, have been used, studied and cited by JISC and ABES.
       • Reviews from OpenBiblio:
         – “A complicating matter from a data wrangler’s point of view is that the field names are based on IFLA Standards, which are numeric codes and not ‘guessable’ English terms like DublinCore fields for example. This is more correct from an international and data quality point of view, but does make the initial mapping more time consuming.”
       • No known reuse of the data beyond the library community.
    14. Problems found and future developments
       • Mapping problems:
         – Not all basic FRBR relations could be extracted or inferred from MARC records.
         – MARC ambiguities make exact matching difficult.
       • Future developments:
         – Visualization tools.
         – Mapping refinements.
         – New sources of data: digital objects.
         – Better coding and cataloguing procedures to improve future matching.
    15. Other developments: subjects in SKOS vocabulary
       • Subject records could not be properly mapped to FRSAD vocabularies, so they were largely left aside in the first migration.
       • SKOSification of the BNE Subject List (2013).
       • Accessible from the SPARQL endpoint of datos.bne.es.
       • Data not yet fully integrated with the bibliographic records.
       • Despite being in beta and not having been promoted, it has already been reused, although only within the library domain.
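"SKOSification" means expressing each subject heading as a `skos:Concept` with labels and hierarchical links. The Python sketch below shows the general shape of such a conversion. The input dictionary layout, the base URI and the function name are assumptions for illustration; the SKOS property names (`prefLabel`, `altLabel`, `broader`) are the standard ones, and this is not the conversion BNE actually ran.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def subject_to_skos(subject, base="http://example.org/subject/"):
    """Turn one subject record into SKOS triples.

    `subject` is a hypothetical simplified record:
    {"id": ..., "heading": ..., "variants": [...], "broader": [...]}.
    """
    uri = base + subject["id"]
    triples = [
        (uri, RDF_TYPE, SKOS + "Concept"),
        (uri, SKOS + "prefLabel", subject["heading"]),
    ]
    # Variant (see-from) forms become alternative labels.
    for variant in subject.get("variants", []):
        triples.append((uri, SKOS + "altLabel", variant))
    # Broader headings become links to other concepts.
    for broader_id in subject.get("broader", []):
        triples.append((uri, SKOS + "broader", base + broader_id))
    return triples
```

Because SKOS is not library-specific, the output is readable by generic thesaurus tools, at the cost of flattening any library-specific distinctions that have no SKOS counterpart.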
    16. Other developments: BNE Escolar (BNE for schools)
       • Aim: semantic exploitation and educational valorization of Spanish cultural goods. A tool for students and teachers in primary and secondary education.
       • Digital objects selected from the BNE Digital Library to enrich and complement the educational content of the intermediate education curriculum.
       • A collection of over 8,500 digitized documents from the Hispanic Digital Library selection (books, drawings, engravings, maps…).
       • It is part of the larger Didactalia, a big platform of educational resources built upon semantic mark-up. It allows social tagging by teachers recommending content.
       • Records are taken from MARC21, converted into the Learning Object Metadata (LOM) ontology, and enriched with other sources.
    17. BNEscolar Demo
    18. Conclusion
       • Library-specific models and vocabularies (namely, FRBR-based ontologies) provide a rich way of publishing data and an appropriate frame for relationships to other collections from libraries and other heritage institutions.
       • At the same time, library models and vocabularies do not seem appropriate for accomplishing one of the goals of Library Linked Data, that is, making library data understandable and thus reusable by other parties.
       • Currently, national libraries serve a wide audience with different needs, so different approaches are suitable, as our experience shows.
    19. Ricardo Santos Muñoz, Departamento de Proceso Técnico. ricardo.santos@bne.es. Pº de Recoletos 20-22, 28071 Madrid, España. T +34 915 807 735. www.bne.es