Your SlideShare is downloading. ×
Introduction to Linked Data
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×

Saving this for later?

Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime - even offline.

Text the download link to your phone

Standard text messaging rates apply

Introduction to Linked Data

3,466
views

Published on

Basic introductory talk about the Web of Linked Data, given to undergraduate and posgraduate students of Universidad del Valle (Cali, Colombia) in September 2010. Knowledge about Semantic Web is …

Basic introductory talk about the Web of Linked Data, given to undergraduate and posgraduate students of Universidad del Valle (Cali, Colombia) in September 2010. Knowledge about Semantic Web is required

Published in: Technology

0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
3,466
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
103
Comments
0
Likes
1
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide
  • 727 veces el term-based record-basedthesaurus
  • Transcript

    • 1. IntroductiontoLinked Data
      Oscar Corcho, Asunción Gómez Pérez ({ocorcho, asun}@fi.upm.es)
      Universidad Politécnica de Madrid
      Universidad del Valle, Cali, ColombiaSeptember 10th 2010
      Credits: Raúl García Castro, Oscar Muñoz, Jose Angel Ramos Gargantilla, María del Carmen Suárez de Figueroa, Boris Villazón, Alex de León, Víctor Saquicela, Luis Vilches, Miguel Angel García, Manuel Salvadores, Guillermo Alvaro, Juan Sequeda, Carlos Ruiz Moreno and manyothers
      WorkdistributedunderthelicenseCreativeCommonsAttribution-Noncommercial-Share Alike 3.0
    • 2. Contents
      IntroductiontoLinked Data
      Linked Data publication
      MethodologicalguidelinesforLinked Data publication
      RDB2RDF tools
      Technicalaspects of Linked Data publication
      Linked Data consumption
      2
    • 3. Whatisthe Web of Linked Data?
      An extension of the current Web…
      … where information and services are given well-defined and explicitly represented meaning, …
      … so that it can be shared and used by humans and machines, ...
      ... better enabling them to work in cooperation
      How?
      Promoting information exchange by tagging web content with machineprocessable descriptions of its meaning.
      And technologies and infrastructure to do this
      And clear principles on how to publish data
      data
    • 4. What is Linked Data?
      Linked Data is a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF.
      Part of the Semantic Web
      Exposing, sharing and connecting data
      Technologies: URIs and RDF (although others are also important)
    • 5. The fourprinciples (Tim Berners Lee, 2006)
      Use URIs as names for things
      Use HTTP URIs so that people can look up those names.
      When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
      Include links to other URIs, so that they can discover more things.
      http://www.w3.org/DesignIssues/LinkedData.html
      5
      http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html
    • 6. Linked Open Data evolution
    • 7
      LOD Cloud May 2007
      Facts:
      • Focal points:
      • 9. DBPedia: RDFizedvesion of Wikipiedia; many ingoing and outgoing links
      • 10. Music-related datasets
      • 11. Big datasets include FOAF, US Census data
      • 12. Size approx. 1 billion triples, 250k links
      Figure from [4]
    • 13. 8
      LOD Cloud September 2008
      Facts:
      • More than 35 datasets interlinked
      • 14. Commercial players joined the cloud, e.g., BBC
      • 15. Companies began to publish and host dataset, e.g. OpenLink, Talis, or Garlik.
      • 16. Size approx. 2 billion triples, 3 million links
      Figure from [4]
    • 17. 9
      LOD Cloud March 2009
      Facts:
      • Big part from Linking Open Drug cloud and the BIO2RDF project (bottom)
      • 18. Notable new datasets: Freebase, OpenCalais, ACM/IEEE
      • 19. Size > 10 billion triples
      Figure from [4]
    • 20. LOD clouds
    • 21. WhyLinked Data?
      Basically, tomovefrom a Web of documentsto a Web of Data
      Let’s try anexample:
      • Tell me whichfootballplayers, born in theprovince of Albacete, in Spain, havescored a goal in theWorld Cup final
      Disclaimer:
      Sorryto use anexampleaboutfootball, butyouhavetounderstandthatforseveralyearsSpaniardswillbetalkingaboutfootball a lot ;-)
      Example courtesy of Guillermo Alvaro Rey
    • 22. Informationsearch in the Web of documents
      • ¿?
      Example courtesy of Guillermo Alvaro Rey
    • 23. What we were actually looking for
      Example courtesy of Guillermo Alvaro Rey
    • 24. Itwouldbebettertomake a data query…
      (footballplayersfrom Albacete whoplayedEurocup 2008)
      Example courtesy of Guillermo Alvaro Rey
    • 25. Howshouldwepublish data?
      Formats in which data ispublishednowadays…
      XML
      HTML
      DBs
      APIs
      CSV
      XLS

      However, mainlimitationsfrom a Web of Data point of view
      Difficulttointegrate
      Data isnotlinkedtoeachother, as ithappenswith Web documents.
    • 26. Which format do we use then?
      RDF (ResourceDescription Framework)
      Data model
      Basedon triples: subject, predicate, object
      <Oscar> <vive en> <Madrid>
      <Madrid> <es la capital de> <España>
      <España> <es campeona de> <Mundial de Fútbol>

      Serialised in differentformats
      RDF/XML, RDFa, N3, Turtle, JSON…
    • 27. URIs (Universal-UniformResourceIdentifer)
      Two types of identifiers can be used to identify Linked Data resources
      URIRefs(Unique Resource IdentifiersReferences)
      A URI and an optional FragmentIdentifier separated from the URI by the hash symbol ‘#’
      http://www.ontology.org/people#Person
      people:Person
      Plain URIs can also be used, as in FOAF:
      http://xmlns.com/foaf/0.1/Person
      17
    • 28. How do wepublishLinked Data?
      ExposingRelationalDatabasesorother similar formatsintoLinked Data
      D2R
      Triplify
      R2O
      NOR2O
      Virtuoso
      Ultrawrap

      Usingnative RDF triplestores
      Sesame
      Jena
      Owlim
      Talisplatform

      Incorporatingit in theform of RDFa in CMSslikeDrupal
      18
    • 29. How do we consume Linked Data?
      Linked Data browsers
      To explore things and datasets and to navigate between them.
      Tabulator Browser (MIT, USA), Marbles (FU Berlin, DE), OpenLink RDF Browser (OpenLink, UK), Zitgist RDF Browser (Zitgist, USA), Disco Hyperdata Browser (FU Berlin, DE), Fenfire (DERI, Ireland)
      Linked Data mashups
      Sites that mash up (thus combine Linked data)
      Revyu.com (KMI, UK), DBtune Slashfacet (Queen Mary, UK), DBPedia Mobile (FU Berlin, DE), Semantic Web Pipes (DERI, Ireland)
      Search engines
      To search for Linked Data.
      Falcons (IWS, China), Sindice (DERI, Ireland), MicroSearch (Yahoo, Spain), Watson (Open University, UK), SWSE (DERI, Ireland), Swoogle (UMBC, USA)
      Listing on this slide by T. Heath, M. Hausenblas, C. Bizer, R. Cyganiak, O. Hartig
      19
    • 30. Linked Data browsers (Disco)
    • 31. Linked Data Mashup (LinkedGeoData)
      © Migración de datos a la Web de los Datos - Enfoques, técnicas y herramientas
      Luis Manuel Vilches Blázquez
    • 32. Linked Data Mashup (DBpedia Mobile)
      http://wiki.dbpedia.org/DBpediaMobile
      © Migración de datos a la Web de los Datos - Enfoques, técnicas y herramientas
      Luis Manuel Vilches Blázquez
    • 33. Linked Data Search Engines (Sindice and SIG.MA)
      Entity lookup service. Find a document that mentions a URI or a keyword.
    • 34. Linked Data SearchEngines (NYT)
      The New York Times: Alumni In The News
      http://data.nytimes.com/schools/schools.html
    • 35. Linked Data SearchEngines (NYT)
      The New York Times: Source code is available
      … and is based on SPARQL queries
    • 36. Oneadditionalmotivation: Open Government
      Government and state administration should be opened at all levels to effective public scrutiny and oversight
      Objectives:
      Transparency
      Participation
      Collaboration
      Inclusion
      Cost reduction
      Interoperability
      Reusability
      Leadership
      Market & Value
      26
      • Some Links:
      • 37. B. Obama –Transparency and Open Government
      • 38. T. Berners-Lee - Raw data now!
      • 39. J. Manuel Alonso - ¿Qué es Open Data?
      • 40. Open Government Data
      • 41. 8 Principles of Open Government Data
    • Open Government. USA and UK
      27
      BOTTOM-UP
      Top-down
    • 42. Linked Data Mashup (data.gov)
      Clean Air Status and Trends (CASTNET)
      http://data-gov.tw.rpi.edu/demo/exhibit/demo-8-castnet.php
    • 43. Linked Data in the UK
      Education
      http://education.data.gov.uk/id/school/106661
      Parliament
      http://parliament.psi.enakting.org/id/member/1227
      Maps
      E.g., London: http://data.ordnancesurvey.co.uk/id/7000000000041428
      http://map.psi.enakting.org
      Transport
      http://www.dft.gov.uk/naptan/
      SameAs service
      http://www.sameas.org
      Challenges
      http://gov.tso.co.uk/openup/sparql/gov-transport
      29
    • 44. Linked Data Mashup (data.gov.uk)
      Research Funding Explorer
      http://bis.clients.talis.com/
    • 45. Open GovernmentSpain. Euskadi
      31
    • 46. Open GovernmentSpain. Abredatos
      32
    • 47. Open GovernmentSpain. Zaragoza
      33
    • 48. Open GovernmentSpain. Asturias
      34
    • 49. Linked Data Mashup (Waterquality)
      Water quality in Asturias’ beaches
      http://datos.fundacionctic.org/sandbox/asturias/playas/
    • 50. Contents
      IntroductiontoLinked Data
      Linked Data publication
      MethodologicalguidelinesforLinked Data publication
      RDB2RDF tools
      Technicalaspects of Linked Data publication
      Linked Data consumption
      36
    • 51. GeoLinkedData
      It is an open initiative whose aim is to enrich the Web of Data with Spanish geospatial data.
      This initiative has started off by publishing diverse information sources, such as National Geographic Institute of Spain (IGN-E) and National Statistics Institute (INE)
      http://geo.linkeddata.es
    • 52. Motivation
      99.171 % English
      0.019 % Spanish
      The Web of Data ismainlyfor
      Englishspeakers
      Poorpresence of Spanish
      Source:Billion Triples dataset at http://km.aifb.kit.edu/projects/btc-2010/
      Thanks to Aidan and Richard
    • 53. Related Work
    • 54. Impact of Geo.linkeddata.es
      Number of triples in Spanish (July 2010): 1.412.248
      Number of triples in Spanish (EndAugust 2010): 21.463.088
      40
      Asunción Gómez Pérez
    • 55. Processfor Publishing Linked Data onthe Web
      Identification
      of the data sources
      Vocabulary
      development
      Generation
      of the RDF Data
      Publication
      of the RDF data
      Data cleansing
      Linking
      the RDF data
      Enable effective
      discovery
    • 56. 1. Identification and selection of the data sources
      Identification
      of the data sources
      Instituto GeográficoNacional
      Vocabulary
      development
      Generation
      of the RDF Data
      Publication
      of the RDF data
      Data cleansing
      Linking
      the RDF data
      Instituto Nacionalde Estadística
      Enable effective
      discovery
    • 57. 1. Identification and selection of the data sources
      Instituto Geográfico Nacional (GeographicSpanishInstitute)
      Multilingual (Spanish, Vasc, Gallician, Catalan)
      Conceptualizationmistmatches
      Granularity (scale concept)
      Textual information
      Particularaties
      Longitude
      latitude
      • Instituto Nacional de Estadística (StatisticSpanishInstitute)
      • 58. Monolingual
      • 59. Numericalinformation
      • 60. Particularaties
      Geo (textual level)
      Temporal
      43
      Asunción Gómez Pérez
    • 61. 1. Identification and selection of the data sources
      IGN-E
    • 62. 1. Identification and selection of the data sources
      IndustryProductionIndex
      Year
      Province
    • 63. 2. Vocabulary development
      http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/#whichvocabs
      Identification
      of the data sources
      Vocabulary
      development
      Generation
      of the RDF Data
      Thisisnotenough
      Publication
      of the RDF data
      Data cleansing
      Linking
      the RDF data
      Enable effective
      discovery
    • 64. 2. Vocabularydevelopment
      Features
      Lightweight :
      Taxonomies and a fewproperties
      Consensuatedvocabularies
      Toavoidthemappingproblems
      Multilingual
      Linked data are multilingual
      TheNeOnmethodology can helpto
      Re-enginer Non ontologicalresourcesintoontologies
      Pros: use domainterminologyalreadyconsensuatedbydomainexperts
      Withdraw in heavyweightontologiesthosefeaturesthatyoudon’tneed
      Reuseexistingvocabularies
      47
      Identification
      of the data sources
      Vocabulary
      development
      Generation
      of the RDF Data
      Publication
      of the RDF data
      Data cleansing
      Linking
      the RDF data
      Enable effective
      discovery
      Asunción Gómez Pérez
    • 65. Knowledge Resources
      Ontological Resources
      O. Design Patterns
      3
      4
      O. Repositories and Registries
      5
      6
      Flogic
      RDF(S)
      OWL
      OntologicalResource
      Reuse
      O. Aligning
      O. Merging
      5
      6
      2
      Ontology Design
      Pattern Reuse
      Non Ontological Resource
      Reuse
      4
      3
      6
      Non Ontological Resources
      2
      Ontological Resource
      Reengineering
      7
      Glossaries
      Dictionaries
      Lexicons
      5
      Non Ontological Resource
      Reengineering
      4
      6
      Classification
      Schemas
      Thesauri
      Taxonomies
      Alignments
      2
      RDF(S)
      1
      Flogic
      O. Conceptualization
      O. Implementation
      O. Formalization
      O. Specification
      Scheduling
      OWL
      8
      Ontology Restructuring
      (Pruning, Extension,
      Specialization, Modularization)
      9
      O. Localization
      1,2,3,4,5,6,7,8, 9
      Ontology Support Activities: Knowledge Acquisition (Elicitation); Documentation;
      Configuration Management; Evaluation (V&V); Assessment
      48
    • 66. Vocabularydevelopment: Specification
      Content requirements: Identifythe set of questionsthattheontologyshouldanswer
      Whichone are theprovinces in Spain?
      Where are thebeaches?
      Where are thereservoirs?
      Identifytheproductionindex in Madrid
      Whichoneisthecitywithhigherproductionindex?
      Give me Madrid latitude and altitude
      ….
      Non-contentrequirements
      Theontologymustbe in thefourofficialSpanishlanguages
      49
      Asunción Gómez Pérez
    • 67. 2. Lightweight Ontology Development
      WGS84 Geo Positioning: an RDF vocabulary
      scv:Dimension
      scv:Item
      scv:Dataset
      hydrographical phenomena (rivers, lakes, etc.)
      Vocabulary for instants, intervals, durations, etc.
      Names and international code systems for territories and groups
      Ontology for OGC Geography Markup Language
      reused
      Following the INSPIRE
      (INfrastructure for SPatial InfoRmation in Europe) recommendation.
      hydrOntology,SCOVO, FAO Geopolitcal, WGS84, GML, and Time
    • 68. Objetivos:
      INSPIRE intenta conseguir fuentes armonizadas de Información Geográfica para dar soporte a la formulación, implementación y evaluación de políticas comunitarias (Medio Ambiente, etc).
      Fuentes de Información Geográfica: Bases de datos de los Estados Miembros (UE) a nivel local, regional, nacional e internacional.
      Contexto – Directiva INSPIRE
      Luis Manuel Vilches Blázquez
    • 69. INSPIRE - Anexos
      Luis Manuel Vilches Blázquez
    • 70. hydrOntology
      Existencia de gran diversidad de problemas (múltiples fuentes, heterogeneidad de contenido y estructuración, ambigüedad del lenguaje natural, etc.) en la información geográfica.
      Necesidad de un modelo compartido para solventar los problemas de armonización y estructuración de la información hidrográfica.
      hydrOntology es una ontología global de dominio desarrollada conforme a un acercamiento top-down.
      Recubrir la mayoría de los fenómenos representables cartográficamente asociados al dominio hidrográfico.
      Servir como marco de armonización entre los diferentes productores de información geo-espacial en el entorno nacional e internacional.
      Comenzar con los pasos necesarios para obtener una mejor organización y gestión de la información geográfica (hidrográfica).
      Luis Manuel Vilches Blázquez
    • 71. Fuentes
      Tesauros y Bibliografía
      Catálogos de fenómenos
      Getty
      FTT ADL
      BCN25
      GEMET
      WFD
      CC.AA.
      EGM & ERM
      Diccionarios y
      Monografías
      BCN200
      Nomenclátor Geográfico Nacional
      Nomenclátor Conciso
      Luis Manuel Vilches Blázquez
    • 72. Criterios de estructuración
      Directiva Marco del Agua
      Propuesta por Parlamento y Consejo de la UE
      Lista de definiciones de fenómenos hidrográficos
      Proyecto SDIGER
      Proyecto piloto INSPIRE
      Dos cuencas, países e idiomas
      Criterios semánticos
      Diccionarios geográficos
      Diccionario de la Real Academia de la Lengua
      WordNet
      Wikipedia
      Bibliografía de varias áreas de conocimiento
      Herencia: Estructuración actual de catálogos
      Asesoramiento expertos en toponimia del IGN
      Luis Manuel Vilches Blázquez
    • 73. Modelización del dominio hidrográfico
      Luis Manuel Vilches Blázquez
    • 74. Implementación & Formalizacón
      + Pellet
      4
      1
      2
      5
      3
      +150 conceptos (classes) , 47 tipos de relaciones (properties) y 64 tipos de atributos (attribute types)
      Luis Manuel Vilches Blázquez
    • 75. 2. Vocabularydevelopment: HydrOntology
      58
      Asunción Gómez Pérez
    • 76. 3. Generation of RDF
      From the Data sources
      Geographic information (Databases)
      Statistic information (.xsl)
      Geospatial information
      Different technologies for RDF generation
      Reengineering patterns
      R20 and ODEMapster
      Annotation tools
      Geometry generation
      Identification
      of the data sources
      Vocabulary
      development
      Generation
      of the RDF Data
      Publication
      of the RDF data
      Data cleansing
      Linking
      the RDF data
      Enable effective
      discovery
    • 77. 3. Generation of the RDF Data
      NOR2O
      INE
      ODEMapster
      IGN
      Geometry2RDF
      Geospatial
      column
      IGN
    • 78. 3. Generation of the RDF Data / instances
      NOR2O is a software librarythatimplementsthetransformationsproposedbythePatternsfor Re-engineering Non-OntologicalResources (PR-NOR). Currentlywehave 16 PR-NORs.
      PR-NORs define a procedurethattransforms a Non-OntologicalResource (NOR) componentsintoontologyelements. http://ontologydesignpatterns.org/
      · Classification
      schemes
      NOR2O
      · Thesauri
      · Lexicons
      NOR2O
      FAO Water classification
      · Classification scheme
      · Path enumeration data model
      · Implemented in a database
    • 79. Re-engineeringModelforNORs
      Patterns for Re-engineering
      Non-Ontological Resources
      (PR-NOR)
      Ontology Forward
      Engineering
      Con-
      ceptual
      Speci-
      fication
      NOR Reverse
      Engineering
      Conceptua-
      lization
      Transformation
      Requirements
      Formalization
      Design
      Implementation
      Implementation
      RDF(S)
      Non-Ontological Resource
      Ontology
    • 80. PR-NOR library at the ODP Portal
      Technologicalsupport
      http://ontologydesignpatterns.org/wiki/Submissions:ReengineeringODPs
    • 81. 3. Generation of the RDF Data – NOR2O
      NOR2O
      Year
      Industry Production Index
      Province
    • 82. hydrOntology & databases
      NGN
      1:25.000
      multilingüe
    • 83. 3. Generation of the RDF Data – R2O & ODEMapster
      Creation of the R2O Mappings
    • 84. 3. Generation of the RDF Data – Geometry2RDF
      Oracle STO UTIL package
      SELECT TO_CHAR(SDO_UTIL.TO_GML311GEOMETRY(geometry))
      AS Gml311Geometry
      FROM "BCN200"."BCN200_0301L_RIO" c
      WHERE c.Etiqueta='Arroyo'
    • 85. 3. Generation of the RDF Data – Geometry2RDF
    • 86. 3. Generation of the RDF Data – Geometry2RDF
    • 87. 3. Generation of the RDF data – RDF graphs
      IGN INE
      So far
      7 RDF NamedGraphs
      1.412.248 triples
      BTN25
      BCN200
      IPI
      ….
      http://geo.linkeddata.es/dataset/IGN/BTN25
      http://geo.linkeddata.es/dataset/IGN/BCN200
      http://geo.linkeddata.es/dataset/INE/IPI
    • 88. 4. Publication of the RDF Data
      Identification
      of the data sources
      Vocabulary
      development
      SPARQL
      Linked Data
      HTML
      Generation
      of the RDF Data
      IncludingProvenance
      Support
      Publication
      of the RDF data
      Pubby
      Pubby 0.3
      Data cleansing
      Linking
      the RDF data
      Enable effective
      discovery
      Virtuoso 6.1.0
    • 89. 4. Publication of the RDF Data
    • 90. 4. Publication of the RDF Data - License
      License for GeoLinkedData
      Creative Commons Attribution-ShareAlike 3.0
      GNU Free Documentation License
      Each dataset will have its own specific license, IGN, INE, etc.
    • 91. 5. Data cleansing
      Identification
      of the data sources
      Lack of documentation of the IGN datasets
      Broken links: Spain, IGN resources
      Lack of documentation of theontology
      Missingenglish and spanishlabels
      Building a spanish ontology and importing some concepts of other ontology (in English):
      Importing the English ontology. Add annotations like a Spanish label to them.
      Importing the English ontology, creating new concepts and properties with a Spanish name and map those to the English equivalents.
      Re-declaring the terms of the English ontology that we need (using the same URI as in the English ontology), and adding a Spanish label.
      Creating your own class and properties that model the same things as the English ontology.
      Vocabulary
      development
      Generation
      of the RDF Data
      Publication
      of the RDF data
      Data cleansing
      Linking
      the RDF data
      Enable effective
      discovery
    • 92. 5. Data cleansing
      URIs in Spanish
      http://geo.linkeddata.es/ontology/Río
      RDF allows UTF-8 characters for URIs
      But, Linked Data URIs has to be URLs as well
      So, non ASCII-US characters have to be %code
      http://geo.linkeddata.es/ontology/R%C3%ADo
    • 93. 6. Linking of the RDF Data
      Identification
      of the data sources
      Silk - A Link Discovery Framework for the Web of Data
      First set of links: Provinces of Spain
      86% accuracy
      Vocabulary
      development
      Geonames
      Generation
      of the RDF Data
      GeoLinkedData
      DBPedia
      Publication
      of the RDF data
      Data cleansing
      Linking
      the RDF data
      Enable effective
      discovery
    • 94. 6. Linking of the RDF Data
      http://geo.linkeddata.es/page/Provincia/Granada
      77
      Asunción Gómez Pérez
    • 95. 7. Enable effective discovery
      Identification
      of the data sources
      Vocabulary
      development
      Generation
      of the RDF Data
      Publication
      of the RDF data
      Data cleansing
      Linking
      the RDF data
      Enable effective
      discovery
    • 96. DEMO
      http://geo.linkeddata.es/
    • 97. Provinces
    • 98. IndustryProductionIndex – Capital of Province
    • 99. Rivers
    • 100. Beaches
    • 101. Contents
      IntroductiontoLinked Data
      Linked Data publication
      MethodologicalguidelinesforLinked Data publication
      RDB2RDF tools
      Technicalaspects of Linked Data publication
      Linked Data consumption
      84
    • 102. Ontology-based Access to DBs
      1
      3
      2
      4
      Build a new ontology from 1 DB schema and 1 DB
      Align the ontology built with approach 1 with a legacy ontology
      Align an existing DB with a legacy ontology
      a) Massive dump (semantic data warehouse)
      b) Query-driven
      Align an ontology network with n DB schemas and other data sources
      a) Massive dump (semantic data warehouse)
      b) Query-driven
      new ontology
      existing ontology
    • 103. Ontology-based Access to Databases
      Universidad
      Profesor
      Doctorando
      Ontología
      ?
      Organización
      Personal
      BDR
      Modelo
      Relacional
      Pregunta: Nombre de los profesores de la universidad UPM
      * Un profesor es una persona cuyo puesto es “docente”
      * Una universidad es una organización de tipo “3”
      Procesador
      Procesado de la consulta de acuerdo a la descripción formal de correspondencia
      Consulta: valores de la columna nombre de los registros de la tabla Personal para los que el valor de la columna puesto is “docente” que estén relacionados con al menos un registro de la tabla Organización con el valor “3” en la columna tipo y “UPM” en la columna nombre.
    • 104. Align data sourceswithlegacyontologies
      Aeropuertos
      Ontología O2
      Ontología O1
      Centro
      Comunicaciones
      PuntoGPS
      Estación
      Punto Europeo
      Aeropuerto
      PuntoAsiatico
      PuntoEspañol
      Aeropuerto
      f (Aeropuertos)
      =
      PuntoEuropeo
      f (Aeropuertos)
      =
      RC(O2,M1)
      RC(O1,M1)
      Modelo Relacional M1
    • 105. R2O
      is a declarative language to specify mappings between relational data sources and ontologies.
      <xml>
      R2O Mapping
      </xml>
      Organization
      Persons
      University
      RDB
      Professor
      Student
      Relational Model
      Ontology
    • 106. Example: types of mappingsneeded
      Attibute Mapping with transformation
      (Regular Expression)
      Attibute Direct Mapping
      Relation Mapping w. Transformation
      (Regular Expression)
      Relation Mapping w. Transformation
      (Keyword search)
    • 107. Population example (II)
      Population example (II)
      The Operation element defines a transformation based on a regular expression to be applied to the database column for extracting property values
    • 108. For concepts...
      One or more concepts can be extracted from a single data field (not in 1NF).
      A view maps exactly one concept in the ontology.
      For attributes...
      A column in a database view maps directly an attribute or a relation.
      A subset of the columns in the view map a concept in the ontology.
      A subset (selection) of the records of a database view map a concept in the ontology.
      A column in a database view maps an attribute or a relation after some transformation.
      A subset of the records of a database view map a concept in the onto. but the selection cannot be made using SQL.
      A set of columns in a database view map an attribute or a relation.
      R2O (Relational-to-Ontology) Language
    • 109. R2O Basic Syntax
      <conceptmap-defname="Customer">
      <identified-by> Table key </identified-by>
      <uri-as>operation</uri-as>
      <applies-if>condition</applies-if>
      <joins-via> expression </joins-via>
      <documentation>description …</documentation>
      <described-by>attributes,relations</described-by>
      </conceptmap-def>
      <attributemap-defname="http://esperonto/ff#Title">
      <aftertransform>
      <operationoper-id="constant">
      <arg-restrictionon-param="const-val">
      <has-column>fsb_ajut.titol</has-column>
      </arg-restriction>
      </operation>
      </aftertransform>
      </attributemap-def>
      <relationmap-defname="http://esperonto/ff#isCandidateFor">
      <to-concept name="http://esperonto/ff#FundOpp">
      <joins-via>
      <operationoper-id=“equals">
      <arg-restrictionon-param="value1">
      <has-column>fsb_ajut.id</has-column>
      </arg-restriction>
      <arg-restriction on-param="value2">
      <has-column>fsb_candidate.forFund</has-column>
      </arg-restriction>
      </operation>
      </joins-via>
      </relationmap-def>
    • 110. ODEMapster
      generates RDF instances from relational instances based on the mapping description expressed in the R2O document
    • 111. 94
      Mapping Design
      • 3 Mapping Creation Steps
      • 112. Load Ontology
      • 113. Load Database(s)
      • 114. Create mapping
      • 115. 2 Usage Modes
      • 116. Online mode (run time query execution)
      • 117. Offline mode (materialized RDF dump)
    • Contents
      IntroductiontoLinked Data
      Linked Data publication
      MethodologicalguidelinesforLinked Data publication
      RDB2RDF tools
      Technicalaspects of Linked Data publication
      Linked Data consumption
      95
    • 118. Using an RDF repository
      Itallowsstoring and accessing RDF data
      Forexample, SESAME (http://www.openrdf.org/)
      Downloaditfromhttp://www.openrdf.org/download.jsp
      openrdf-sesame-2.3.0-sdk.zip
      Deploythe .war in Tomcat (JDK and Tomcatneeded)
      Create a repository at
      http://localhost:8080/openrdf-sesame
      Check:
      http://localhost:8080/openrdf-sesame/repositories/XXXX
      http://localhost:8080/openrdf-sesame/repositories/XXX/statements
    • 119. Linked Data frontend
      Toexpose data as Linked Data
      Includingcontentnegotiation, etc.
      Forexample, Pubby
      http://www4.wiwiss.fu-berlin.de/pubby/
      Installation
      Use pubby-0.3.zip
      Deploythewebapp folder (and rename)in Tomcat
      Modify config.n3
      Restarttomcat
      Check: http://localhost:8080/XXX/
    • 120. REST API
      • Forexample, RDF2Go
      • 121. Java abstractionover RDF repositories
      http://rdf2go.semweb4j.org/
    • 122. Add SPARQL explorer
      Forexample, SNORQL (http://wiki.github.com/kurtjx/SNORQL/)
    • 123. ValidatingLinked Data URLs
      http://validator.linkeddata.org/vapour
      100
    • 124. Contents
      IntroductiontoLinked Data
      Linked Data publication
      MethodologicalguidelinesforLinked Data publication
      RDB2RDF tools
      Technicalaspects of Linked Data publication
      Linked Data consumption
      101
    • 125. RelFinder: finding relations in Linked Data
      E.g., relations between films
      “Pulp Fiction”, “Kill Bill” y “Reservoir Dogs”
    • 126. Exerciseon data.gov.uk
      • Publicschools in London thatcontaintheword “music”
    • Exercise: find information in DBPedia
      Image by http://www.flickr.com/photos/bflv/
      http://dbpedia.org/resource/Darth_Vader)
      • Findficticious serial killers in DBPedia
      (etc)
    • 127. Designing URI sets forthePublic Sector (UK)
      http://www.cabinetoffice.gov.uk/media/301253/puiblic_sector_uri.pdf
      105
    • 128. Asociación Española de Linked Data
      106
    • 129. IntroductiontoLinked Data
      Oscar Corcho, Asunción Gómez Pérez ({ocorcho, asun}@fi.upm.es)
      Universidad Politécnica de Madrid
      Universidad del Valle, Cali, ColombiaSeptember 10th 2010
      Credits: Raúl García Castro, Oscar Muñoz, Jose Angel Ramos Gargantilla, María del Carmen Suárez de Figueroa, Boris Villazón, Alex de León, Víctor Saquicela, Luis Vilches, Miguel Angel García, Manuel Salvadores, Guillermo Alvaro, Juan Sequeda, Carlos Ruiz Moreno and manyothers
      WorkdistributedunderthelicenseCreativeCommonsAttribution-Noncommercial-Share Alike 3.0