Vila LOD-innovacion- bib-semweb-redux


Published in: Technology, Education

  1. 1. Linked Open Data and innovation: libraries and the Semantic Web. Daniel Vila Suero, 03/11/2011, Ontology Engineering Group, Universidad Politécnica de Madrid. Acknowledgements: to the OEG members who took part in preparing these slides
  2. 2. Contents • Linked Data • Library Linked Data - W3C Incubator Group - IFLA - Stanford Manifesto - A Bibliographic Framework for the Digital Age • Use cases, tools and demos
  3. 3. Linked Data
  4. 4. World Wide Web (original vision)
  5. 5. Smart Web, Dumb Web • The Web is full of “smart” applications (search engines, recommenders, geolocation, etc.) • Yet there are also situations in which the Web's response does not seem in line with the state of the technology
  6. 6. Smart Web, Dumb Web • Frequent problems (user): - Inconsistent information across apparently related services - Need to visit multiple applications for a simple task - Difficulty finding very specific information • KEY: integrating data on the Web in a standard format • Frequent problems (developer): - Heterogeneous formats - Proprietary or hard-to-process formats - Lack of API documentation - 1 API, 1 functionality, 1 access method => disconnected APIs
  7. 7. What is the Web of Linked Data? • 10 years have passed since the original vision of the Semantic Web • So far, few examples of real impact • The technologies were too complex (they are mature today) • The Linked Data initiative appears in 2006 - An extension of the current Web where data is published and consumed according to 4 principles (
  8. 8. Principles of Linked Data 1. Use URIs to refer to things (resources) 2. Use the HTTP protocol to publish/retrieve resources 3. Describe data in a standard format (RDF): dbpedia:Tim_Berners-Lee rdf:type foaf:Person ; foaf:surname "Berners-Lee"@en ; foaf:givenName "Tim"@en . 4. Link to other resources through URIs
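
To make principles 1, 3 and 4 concrete, the sketch below models the slide's triples about dbpedia:Tim_Berners-Lee as plain Python tuples and prints them as N-Triples. The namespace URIs are the published ones for DBpedia, RDF and FOAF. The helper names (`expand`, `to_ntriples`) are invented for this example, and a real application would more likely rely on an RDF library.

```python
# Prefixes used on the slide, expanded to their full namespace URIs.
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(curie):
    """Turn a prefixed name such as 'foaf:Person' into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


# Each triple is (subject, predicate, object). An object is either a
# prefixed name (a link to another resource) or a (text, lang) literal.
TRIPLES = [
    ("dbpedia:Tim_Berners-Lee", "rdf:type", "foaf:Person"),
    ("dbpedia:Tim_Berners-Lee", "foaf:surname", ("Berners-Lee", "en")),
    ("dbpedia:Tim_Berners-Lee", "foaf:givenName", ("Tim", "en")),
]


def to_ntriples(triples):
    """Serialize the triples as N-Triples lines (no literal escaping)."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, tuple):
            text, lang = o
            obj = '"%s"@%s' % (text, lang)
        else:
            obj = "<%s>" % expand(o)
        lines.append("<%s> <%s> %s ." % (expand(s), expand(p), obj))
    return lines


if __name__ == "__main__":
    print("\n".join(to_ntriples(TRIPLES)))
```

Every subject, predicate and resource object ends up as an HTTP URI, which is what lets other datasets link to the same things.
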
  9. 9. Traditional Web (documents). Diagram label: Links to
  10. 10. Linked Data (Web of data). Diagram labels: leads, RDF Book Mashup
  11. 11. Linked Data cloud evolution. Credits: Richard Cyganiak
  12. 12. Linked Data cloud evolution. Credits: Richard Cyganiak
  13. 13. Linked Data cloud evolution. Credits: Richard Cyganiak
  14. 14. Linked Data cloud evolution. Credits: Richard Cyganiak
  15. 15. Linked Data cloud evolution. Credits: Richard Cyganiak
  16. 16. Linked Open Data 2011. “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.”
  17. 17. Credits: Frank Van Harmelen
  18. 18. Credits: Frank Van Harmelen
  19. 19. Tools on the Web of Data • Representing resources: RDF (Resource Description Framework) • Modelling/describing resources: RDFS/OWL • Querying/retrieving resources: SPARQL and HTTP • Transforming resources to RDF: - Databases: R2RML, ODEMapster, R2O - MARC21: MARC2LOD - Any23 - XML: GRDDL - Etc. • Methodology: Linked Data lifecycles
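
As a rough illustration of "SPARQL and HTTP" working together, the sketch below builds a SPARQL Protocol GET request with Python's standard library. It only constructs the request and does not send it. The endpoint URL (DBpedia's public endpoint) and the query are assumptions chosen for the example, and `build_sparql_request` is a made-up helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: DBpedia's public endpoint; any SPARQL endpoint URL would do.
ENDPOINT = "https://dbpedia.org/sparql"

# Example query (an assumption): ask for the given name of one resource.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE {
  <http://dbpedia.org/resource/Tim_Berners-Lee> foaf:givenName ?name .
} LIMIT 5
"""


def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL Protocol GET request."""
    # The protocol passes the query text in the 'query' URL parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    req = build_sparql_request(ENDPOINT, QUERY)
    print(req.get_method(), req.full_url)
    # To execute it, pass req to urllib.request.urlopen and parse the JSON body.
```
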
  20. 20. Library Linked Data: creating a “knowledge-generating engine”
  21. 21. Library Linked Data is here • Growing interest in Linked Data: - Stanford Manifesto - IFLA Semantic Web Special Interest Group and RDFS/OWL models - W3C Incubator Group - RDA vocabularies - European librarians' announcement supporting open licensing - LOC Bibliographic Framework Initiative
  22. 22. European national libraries: Open Data • CENL (Conference of European National Librarians) • 46 national libraries voted to support open licensing • Data becomes more accessible and reusable • Keys: - Innovation for app development - Enrichment of services like Wikipedia with highly curated data - Generation of relationships across datasets through LOD
  23. 23. Stanford Manifesto • Manifesto for Linked Libraries: 1. Publishing data on the Web for discovery over preserving it in dark archives. 2. Continuous improvement of data over waiting to publish perfect data. 3. Semantically structured data over flat unstructured data. 4. Use common vocabularies over rolling your own. 5. Collaboration over working alone. 6. Web standards over domain-specific standards. 7. Use of open, commonly understood licenses over closed, local licenses.
  24. 24. LOC: A Bibliographic Framework for the Digital Age • Bibliographic Framework Initiative • 31st of October 2011 • APPROACH: embrace the Web, Linked Data and broadly adopted data models (RDF) • GOAL: move the current library-technological environment away from being a niche market unto itself to one more readily understandable by present and future data creators, data modelers, and software developers
  25. 25. W3C incubator (XG) activity • Short-lived working groups: around 1 year • No delivery of W3C Recommendations, but “innovative ideas for specifications, guidelines, and applications that are not (or not yet) clear candidates as Web standards”
  26. 26. Library Linked Data incubator • May 2010 – August 2011 • 51 participants • 23 W3C member organizations: VU Amsterdam, INRIA, Library of Congress, JISC, Deutsche Nationalbibliothek, DERI Galway, OCLC, Talis, LANL, Helsinki University of Technology, University of Edinburgh, Universidad Politécnica de Madrid, etc. • Invited experts from other organizations: BnF, National Library of Latvia, German National Library of Economics, etc.
  27. 27. W3C XG Participants: Alexander Haffner, András Micsik, Andrew Houghton, Anette Seiler, Antoine Isaac, Asaf Bartov, Bernard Vatant, Carlo Meghini, Dan Brickley, Daniel Vila Suero, Dickson Lukose, Ed Summers, Emmanuelle Bermes, Felix Sasaki, Fumihiro Kato, Glen Newton, Gordon Dunsire, Guenther Neher, Herbert Van De Sompel, Hideaki Takeda, Ikki Ohmukai, Jeff Young, Joachim Neubert, Jodi Schneider, Jon Phipps, Jonathan Rees, Kai Eckert, Karen Coyle, Kevin Ford, Kim Viljanen, Kosuke Tanabe, Lars Svensson, Laszlo Kovacs, Marcel Ruhl, Marcia Zeng, Mark van Assem, Martin Malmsten, Michael Hausenblas, Michael Panzer, Monica Duke, Nicolas Delaforge, Oreste Signore, Peter Murray, Ray Denenberg, Ross Singer, Stu Weibel, Thomas Baker, Tod Matola, Uldis Bojars, William Waites, Wolfgang Halb. Up-to-date list at
  28. 28. W3C XG Mission • To help increase global interoperability of library data on the Web, by - bringing together people involved in Semantic Web activities (focusing on Linked Data) in the library community and beyond, - building on existing initiatives, and - identifying collaboration tracks for the future.
  29. 29. W3C XG Results • Loads of interesting discussions! See public mailing list archive: lld/ • Final report (3 separate documents), 25/10/2011: 1. Final report 2. Datasets, Value Vocabularies, and Metadata Element Sets 3. Use Cases report • Translation into Spanish coming soon…
  30. 30. W3C XG Final report • Available at 20111025/ • Sections: Benefits, Current situation, Recommendations
  31. 31. W3C XG Final report: Benefits • Beneficiaries: researchers, students and patrons; organizations; librarians, archivists and curators; developers and vendors
  32. 32. W3C XG Final report: Benefits for researchers, students and patrons • Improved discovery and browsing of data • Better visibility of library resources (SEO) • Enriched (scientific) publications
  33. 33. W3C XG Final report: Benefits for organizations • Bottom-up approach to data publication => more actors, different views • Wider choice of vendors and technologies, not only ILS • More visibility and connectivity => lower infrastructure costs • “The coolest thing to do to your data will be thought of by someone else”
  34. 34. W3C XG Final report: Benefits for librarians, archivists and curators • Up-to-date resource descriptions directly citable by catalogers, thanks to URIs + RDF • Reduced redundancy and duplication • Catalogers' efforts focused on their domain of expertise
  35. 35. W3C XG Final report: Benefits for developers and vendors • Use of well-known Web standards and protocols • More and more generic tools, not tied to library-specific formats • Welcomes a much larger developer community
  36. 36. W3C XG Final report: Current situation. Issues with traditional library data: 1. Library data is not integrated with Web resources 2. Library standards are designed only for the library community 3. Library data is expressed primarily as natural-language text 4. Library and SemWeb communities use different terminology for similar metadata concepts 5. Library technology changes depend on vendor systems development
  37. 37. W3C XG Final report: Current situation. Library Linked Data available today: 1. Fewer bibliographic datasets than value vocabularies and element sets 2. Variable quality and support 3. Cross-linking requires further effort and coordination
  38. 38. W3C XG Final report: Current situation. Rights issues: 1. Rights ownership is complex 2. Data rights may be considered business assets
  39. 39. W3C XG Final report: Current situation. Recommendations for library leadership: 1. Identify candidate data sets for early exposure 2. Foster discussion about Open Data and rights
  40. 40. W3C XG Final report: Current situation. Recommendations for data and systems designers: 1. Design/test user services based on LD capabilities 2. Develop policies for managing vocabularies and URIs 3. Create URIs for the items in library datasets 4. Reuse and map to existing LD vocabularies
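
One possible reading of "create URIs for the items in library datasets" is a small, documented minting rule applied to existing record identifiers. The sketch below is only an illustration of that idea: the base URI, the `/id/<type>/<identifier>` pattern and the `mint_uri` helper are all hypothetical, not something prescribed by the report.

```python
from urllib.parse import quote

# Hypothetical base URI under the publisher's control.
BASE = "http://example.org/"


def mint_uri(kind, local_id):
    """Build an HTTP URI from a resource type and an existing record identifier."""
    # Percent-encode both parts so spaces or slashes in identifiers
    # cannot change the path structure of the URI.
    return "%sid/%s/%s" % (BASE, quote(kind, safe=""), quote(local_id, safe=""))


if __name__ == "__main__":
    print(mint_uri("book", "BNB 123/4"))
```

Keeping the rule in one function makes the URI policy easy to document and to apply consistently across datasets.
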
  41. 41. W3C XG Final report: Current situation. Recommendations for librarians and archivists: 1. Preserve LD element sets and value vocabularies 2. Apply library experience in curation and long-term preservation to LD datasets
  42. 42. W3C XG Vocabs and Datasets report • Available at vocabdataset-20111025/ • Datasets: British National Bibliography, Europeana LOD, .. • Value vocabularies: LCSH, VIAF, AGROVOC … • Element sets
  43. 43. W3C XG Use cases report • Available at usecase-20111025/
  44. 44. W3C XG Use cases report • 8 clusters • 60 individual use cases from XG participants and community • Generalized (extracted) use cases for each cluster • Good place to look for examples, fresh ideas, space for innovation and research topics!
  45. 45. W3C XG Use cases report. Generated with TagCrowd
  46. 46. Use cases, tools and applications
  47. 47. Chronicling
  48. 48. Chronicling America • Historic newspapers and select digitized newspaper pages (+2.5 million), produced by the National Digital Newspaper Program • From 1690 to the present • Nice example of Linked Data best practices and transparent integration • Linking and describing: - DBpedia - Dublin Core and DCMI Terms - FRBR concepts in RDF - GeoNames - OAI-ORE (more about aggregations below) - OWL - RDA - WorldCat
  49. 49. Chronicling America: text/html view
  50. 50. Chronicling America: RDF view
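
The two slides above show one resource as an HTML page and as RDF. A common way to serve both from a single URI is HTTP content negotiation. The slides do not say how Chronicling America implements it, so the sketch below is only an assumption-laden illustration: the resource URI is a placeholder and `request_as` is a made-up helper. The requests are built but not sent.

```python
from urllib.request import Request

# Hypothetical resource URI, used only for illustration.
RESOURCE = "http://example.org/newspaper/page/1"


def request_as(uri, media_type):
    """Build (but do not send) a GET request asking for one representation."""
    return Request(uri, headers={"Accept": media_type})


# Same URI, two representations: HTML for people, RDF/XML for programs.
html_request = request_as(RESOURCE, "text/html")
rdf_request = request_as(RESOURCE, "application/rdf+xml")

if __name__ == "__main__":
    for req in (html_request, rdf_request):
        print(req.full_url, "->", req.get_header("Accept"))
```
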
  51. 51. • December 2011 • Catalog data from BNE: MARC21 to RDF using IFLA models - Authority records: +5 million - Bibliographic records: +8 million • Release of the MARC2LOD tool (open source) • Public announcement at the BNE: 14th December
  52. 52. BNE data modelling
  53. 53. MARC2LOD tool • Flexible tool for transforming MARC21 records to RDF • Allows free selection of any RDFS/OWL set of terms • Easy-to-handle mappings • Composed of two modules: - Module 1: mapping templates and report generation - Module 2: RDF generation and linkage • Three main steps: 1) Mapping template generation 2) Mapping assignment by domain experts 3) RDF generation and linkage
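
The three steps above can be sketched in a few lines of Python. This is not the MARC2LOD implementation, only a toy model of the workflow: the record layout, the function names, the base URI and the sample record are invented, and Dublin Core properties stand in for the IFLA models mentioned in the slides. Linkage to other datasets is left out.

```python
def mapping_template(records):
    """Step 1: list every MARC field/subfield found, with an empty slot each."""
    tags = sorted({tag for record in records for tag in record["fields"]})
    return {tag: None for tag in tags}


def generate_triples(records, mapping, base_uri):
    """Step 3: emit one (subject, property, value) triple per mapped value."""
    triples = []
    for record in records:
        subject = base_uri + record["id"]
        for tag, values in record["fields"].items():
            prop = mapping.get(tag)
            if prop is None:  # fields left unmapped by the expert are skipped
                continue
            for value in values:
                triples.append((subject, prop, value))
    return triples


# Invented sample record: 100$a is a personal name, 245$a is a title.
RECORDS = [
    {"id": "rec1", "fields": {"100$a": ["Cervantes Saavedra, Miguel de"],
                              "245$a": ["Don Quijote de la Mancha"]}},
]

if __name__ == "__main__":
    mapping = mapping_template(RECORDS)
    # Step 2: a domain expert fills in the template. Dublin Core terms are
    # an assumption made for this example.
    mapping["100$a"] = "http://purl.org/dc/terms/creator"
    mapping["245$a"] = "http://purl.org/dc/terms/title"
    for triple in generate_triples(RECORDS, mapping, "http://example.org/id/"):
        print(triple)
```

Separating the template from the generation step mirrors the two-module design: experts only edit the mapping, and the generator never needs to change.
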
  54. 54. marc2LOD: framework overview
  55. 55. marc2LOD: mapping templates
  56. 56. Linked Data architecture
  57. 57. DEMO
  58. 58. map4rdf • Google Maps viewer of RDF resources • Resources with spatial information • Extensible with Google plugins • Used in other applications like Aemet, GoodRelations • Architecture: map4rdf, SPARQL, triplestore
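
A map viewer of this kind needs to pick out the resources that carry coordinates. The sketch below shows one way to do that over a list of triples, assuming the data uses the W3C WGS84 geo vocabulary. That vocabulary choice, the sample triples and the `markers` helper are assumptions for illustration, not a description of how map4rdf works internally.

```python
# W3C Basic Geo (WGS84) vocabulary namespace.
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"


def markers(triples):
    """Collect (uri, lat, long) for every resource that has both coordinates."""
    coords = {}
    for s, p, o in triples:
        if p == GEO + "lat":
            coords.setdefault(s, {})["lat"] = float(o)
        elif p == GEO + "long":
            coords.setdefault(s, {})["long"] = float(o)
    # Resources missing either coordinate cannot be placed on a map.
    return [(s, c["lat"], c["long"])
            for s, c in sorted(coords.items())
            if "lat" in c and "long" in c]


# Invented sample data.
TRIPLES = [
    ("http://example.org/province/madrid", GEO + "lat", "40.4168"),
    ("http://example.org/province/madrid", GEO + "long", "-3.7038"),
    ("http://example.org/province/unknown", GEO + "lat", "41.0"),
    ("http://example.org/province/madrid",
     "http://www.w3.org/2000/01/rdf-schema#label", "Madrid"),
]

if __name__ == "__main__":
    for uri, lat, lon in markers(TRIPLES):
        print(uri, lat, lon)
```
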
  59. 59. DEMO
  60. 60. Provinces
  61. 61. Capital of Province
  62. 62. Provinces – Industry Production Index
  63. 63. Beaches
  64. 64. Visor (VISOR alpha v0.11): a tool for end user data exploration • Linked data browser from University of Southampton • Multifaceted browsing • Configurable for any SPARQL endpoint • DEMO:
  65. 65. Aemoo • Explore knowledge using knowledge patterns • Uses: - DBpedia - Wikipedia links - Twitter - Google News feed