datos.bne.es: Publishing and consuming
A presentation by Daniel Vila Suero of the Ontology Engineering Group at the Universidad Politécnica de Madrid.

Delivered at the Cataloguing and Indexing Group Scotland (CIGS) Linked Open Data (LOD) Conference, which took place on Friday 21 September 2012 at the Edinburgh Centre for Carbon Innovation.


Transcript

  • 1. datos.bne.es: Publishing and consuming
      Daniel Vila Suero (dvila@fi.upm.es), Ontology Engineering Group, Universidad Politécnica de Madrid
      Acknowledgements: OEG members and BNE staff (Elena Escolano, Marina Jimenez Piano, Ana Manchado, Mar Hernández Agustí, Ricardo Santos and others)
  • 2. datos.bne.es
  • 3. Background (datos.bne.es)
      • Initiative of the Biblioteca Nacional de España (BNE) together with OEG-UPM Madrid.
      • Multidisciplinary effort: librarians, computer scientists, linguists, etc.
      • Close collaboration between library experts and computer scientists.
      • Started as a small-scale proof of concept: the "Cervantes dataset", using IFLA vocabularies (FRBR, ISBD) and others (MADS, DC, RDA, etc.).
  • 4. Main goals (datos.bne.es)
      • Perform the transformation incrementally and iteratively.
      • Develop a system in which library experts can define and assess the mappings to RDF independently of the IT staff.
      • Be vocabulary-agnostic: BNE uses FRBR as its core model, but the system would also allow RDA, for example.
      • Get a clear picture of the source data before starting the transformation, which helps to detect possible deficiencies in the source data.
  • 5. Source MARC records (datos.bne.es)
      • Authority records: Persons, Corporate bodies, Conferences, Titles, Subjects
      • Bibliographic records:
          • Maps: 76,576
          • Sound recordings: 320,727
          • Engravings, drawings, pictures: 166,017
          • Manuscripts: 35,770
          • Ancient books: 143,959
          • Modern books: 2,696,560
          • Scores: 178,473
          • Electronic resources: 3,021
          • Serials: 156,634
          • Videos: 96,672
  • 6. Some figures (datos.bne.es)
      • Total number of authority records: 4,100,000
      • Total number of bibliographic records: 2,390,140
      • Total number of RDF triples: 58,053,215
      • Number of links (15% of authorities): 587,520
      • Linked sources:
          • VIAF
          • SUDOC (French Collective University Catalogue), France
          • GND (German National Library Authorities), Germany
          • LIBRIS, Sweden
          • DBpedia
          • Soon: BNF, BNB, German Bibliographie
  • 7. Some statistics (datos.bne.es)
      [Chart. Types shown: Manifestation, Work, Person, Expression, Thema, Corporate Body. Values shown: 282,879; 497,644; 2,390,103; 1,114,719; 1,163,764; 1,969,526. The transcript does not preserve which value belongs to which type.]
  • 8. Some statistics (datos.bne.es)
      [Bar chart with an axis from 0 to 2,500,000. Values shown: 2,129,222; 1,246,773; 1,054,736; 85,347; 78,561; 16,462; 755. The category labels are not preserved in the transcript.]
  • 9. Publishing
  • 10. Our data model (Publishing)
  • 11. Transformation process (Publishing)
      • How can the mapping process be made easier for library experts?
          1. Use a familiar and intuitive interface: spreadsheets.
          2. Work only on what is in the database: pre-process the records to build the spreadsheets.
      • A three-step process with three different spreadsheets:
          1. Classification: is it a Person? A Work? A Manifestation?
          2. Annotation: name, birth date, title, language of expression.
          3. Relation: find relationships between entities (for example, a Person is the creator of a certain Work).
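The three-step spreadsheet process on slide 11 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual tooling: the column names, the simplified record structure, the identifier scheme and the property names (`name`, `dates`, `title`, `isCreatorOf`) are all hypothetical, and the pairing of MARC fields 100, 240 and 245 with Person, Work and Manifestation is chosen only to make the example concrete.

```python
import csv
import io

# Hypothetical spreadsheets, inlined as CSV text so the example is self-contained.
# Step 1 spreadsheet: which MARC field yields which kind of entity.
CLASSIFICATION = """marc_field,entity
100,Person
240,Work
245,Manifestation
"""

# Step 2 spreadsheet: which subfield fills which property of that entity.
ANNOTATION = """marc_field,subfield,property
100,a,name
100,d,dates
240,a,title
"""

# Step 3 spreadsheet: which entities are related, and by which property.
RELATION = """from_field,to_field,property
100,240,isCreatorOf
"""


def load(sheet_text):
    """Parse one spreadsheet (CSV text) into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(sheet_text)))


def transform(record, classification, annotation, relation):
    """Apply the three mapping steps to one simplified record; return triples."""
    triples = []
    entity_of = {}  # MARC field -> identifier of the entity created from it

    # Step 1, classification: create one typed entity per mapped field present.
    for row in classification:
        field = row["marc_field"]
        if field in record["fields"]:
            entity_of[field] = f"{record['id']}/{field}"
            triples.append((entity_of[field], "rdf:type", row["entity"]))

    # Step 2, annotation: attach subfield values as properties of the entity.
    for row in annotation:
        value = record["fields"].get(row["marc_field"], {}).get(row["subfield"])
        if value is not None and row["marc_field"] in entity_of:
            triples.append((entity_of[row["marc_field"]], row["property"], value))

    # Step 3, relation: link two entities that were both found in the record.
    for row in relation:
        source = entity_of.get(row["from_field"])
        target = entity_of.get(row["to_field"])
        if source and target:
            triples.append((source, row["property"], target))

    return triples


# A made-up, heavily simplified record (real MARC records are far richer).
record = {
    "id": "rec1",
    "fields": {
        "100": {"a": "Cervantes Saavedra, Miguel de", "d": "1547-1616"},
        "240": {"a": "Don Quijote de la Mancha"},
    },
}

triples = transform(record, load(CLASSIFICATION), load(ANNOTATION), load(RELATION))
for triple in triples:
    print(triple)
```

Keeping the mappings in plain tabular files is what lets library experts edit them without touching the transformation code, which is the goal stated on slide 4.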
  • 12. Publishing
  • 13. Mapping process (Publishing)
      Open mappings at: http://bne.linkeddata.es/mapping-marc21
  • 14. Mapping process (Publishing)
  • 15. Mapping process (Publishing)
  • 16. Still a lot of work to do (Publishing)
      • We cover only the core relations of FRBR.
      • A significant number of manifestations are not linked to their expressions; we are currently looking at more sophisticated clustering techniques.
      • Manifestations are not linked to their corresponding digitized materials in the digital library (Biblioteca Digital Hispánica); the next version (to be published this year) will contain these links.
      • The classification step can be further automated.
  • 17. Consuming
  • 18. Perspectives (Consuming)
      • Two different perspectives:
          - Systems and applications: SPARQL endpoint, Linked Data API
          - End-user interfaces
      • Plus an interesting side effect:
          - By applying FRBR and RDF mappings we can (and did) improve the catalogue.
      • Using standard web technologies and more intuitive models opens the door to:
          - Data analytics and cleansing, catalogue enrichment, reuse by smaller institutions…
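For the "systems and applications" perspective on slide 18, a client might prepare a query for the SPARQL endpoint along the following lines. This is a minimal sketch: the slides do not give the endpoint address, so `ENDPOINT` is a placeholder, and the query (counting resources per type) is just one plausible example. The request is built as a GET with a `query` parameter, as the SPARQL 1.1 Protocol allows, but it is not sent here.

```python
import urllib.parse
import urllib.request

# Placeholder address: the slides do not state the real endpoint URL.
ENDPOINT = "http://example.org/sparql"

# Example query: count how many resources there are of each type.
QUERY = """
SELECT ?type (COUNT(?s) AS ?n)
WHERE { ?s a ?type }
GROUP BY ?type
ORDER BY DESC(?n)
""".strip()


def build_request(endpoint, query):
    """Build (but do not send) a SPARQL protocol GET request for a SELECT query."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Ask for the standard JSON serialization of SPARQL results.
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


request = build_request(ENDPOINT, QUERY)
print(request.full_url)
```

With a real endpoint address, the request could be sent with `urllib.request.urlopen(request)` and the response parsed with the `json` module.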
  • 19. Graph analysis example (Consuming)
      http://bne.linkeddata.es/graphvis
      Using open-source tools, for example Gephi.
  • 20. Enabling access to systems and apps (Consuming)
      Linked Data API: http://datos.bne.es/frontend/persons
  • 21. Flexible access to data (Consuming)
      Out of the box:
      • Search by every field
      • Access clusters of resources
      • Filtering
      • Paging
      • Serve multiple formats: XML, Turtle, JSON
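The filtering, paging and multi-format features listed on slide 21 might be exercised by building request URLs like the one below. This is a sketch under assumptions: the base URL comes from slide 20, but the reserved parameters `_page`, `_pageSize` and `_format` follow the general Linked Data API conventions and may not match this particular deployment, and the `label` filter name is hypothetical.

```python
from urllib.parse import urlencode

# List endpoint taken from slide 20.
BASE = "http://datos.bne.es/frontend/persons"


def list_url(base, filters=None, page=0, page_size=10, fmt="json"):
    """Build a list-endpoint URL with property filters, paging and a format.

    The parameter names _page, _pageSize and _format are assumptions based on
    Linked Data API conventions; filter names depend on the deployment.
    """
    params = dict(filters or {})
    params.update({"_page": page, "_pageSize": page_size, "_format": fmt})
    return f"{base}?{urlencode(params)}"


# Third page (pages counted from 0) of 25 persons matching a hypothetical filter, as Turtle.
print(list_url(BASE, {"label": "Cervantes"}, page=2, page_size=25, fmt="ttl"))
```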
  • 22. Different views on the data (Consuming)
      XML, HTML
  • 23. End-user interfaces (Consuming)
      Current linked data opens the door to:
      • Re-ranking OPAC results
      • Better clustering of results
      • Recommendation
      • Enhancing data from other sources