0
IntroductiontoLinked Data<br />Oscar Corcho, Asunción Gómez Pérez ({ocorcho, asun}@fi.upm.es)<br />Universidad Politécnica...
Contents<br />IntroductiontoLinked Data<br />Linked Data publication<br />MethodologicalguidelinesforLinked Data publicati...
Whatisthe Web of Linked Data?<br />An extension of the current Web…<br />… where information and services are given well-d...
What is Linked Data?<br />Linked Data is a term used to describe a recommended best practice for exposing, sharing, and co...
The fourprinciples (Tim Berners Lee, 2006)<br />Use URIs as names for things <br />Use HTTP URIs so that people can look u...
Linked Open Data evolution<br /><ul><li>2007
2008
2009</li></li></ul><li>7<br />LOD Cloud May 2007<br />Facts:<br /><ul><li>Focal points:
DBPedia: RDFizedvesion of Wikipiedia; many ingoing and outgoing links
Music-related datasets
Big datasets include FOAF, US Census data
Size approx. 1 billion triples, 250k links</li></ul>Figure from [4]<br />
8<br />LOD Cloud September 2008<br />Facts:<br /><ul><li>More than 35 datasets interlinked
Commercial players joined the cloud, e.g., BBC
Companies began to publish and host dataset, e.g. OpenLink, Talis, or Garlik.
Size approx. 2 billion triples, 3 million links</li></ul>Figure from [4]<br />
9<br />LOD Cloud March 2009<br />Facts:<br /><ul><li>Big part from Linking Open Drug cloud and the BIO2RDF project (bottom)
Notable new datasets: Freebase, OpenCalais, ACM/IEEE
Size > 10 billion triples</li></ul>Figure from [4]<br />
LOD clouds<br />
WhyLinked Data?<br />Basically, tomovefrom a Web of documentsto a Web of Data<br />Let’s try anexample:<br /><ul><li>Tell ...
Informationsearch in the Web of documents<br /><ul><li>¿?</li></ul>Example courtesy of Guillermo Alvaro Rey<br />
What we were actually looking for<br />Example courtesy of Guillermo Alvaro Rey<br />
Itwouldbebettertomake a data query…<br />(footballplayersfrom Albacete whoplayedEurocup 2008)<br />Example courtesy of Gui...
Howshouldwepublish data?<br />Formats in which data ispublishednowadays…<br />XML<br />HTML<br />DBs<br />APIs<br />CSV<br...
Which format do we use then?<br />RDF (ResourceDescription Framework)<br />Data model<br />Basedon triples: subject, predi...
URIs (Universal-UniformResourceIdentifer)<br />Two types of identifiers can be used to identify Linked Data resources<br /...
How do wepublishLinked Data?<br />ExposingRelationalDatabasesorother similar formatsintoLinked Data<br />D2R<br />Triplify...
How do we consume Linked Data?<br />Linked Data browsers<br />To explore things and datasets and to navigate between them....
Linked Data browsers (Disco)<br />
Linked Data Mashup (LinkedGeoData)<br />© Migración de datos a la Web de los Datos - Enfoques, técnicas y herramientas<br ...
Linked Data Mashup (DBpedia Mobile)<br />http://wiki.dbpedia.org/DBpediaMobile<br />© Migración de datos a la Web de los D...
 Linked Data Search Engines (Sindice and SIG.MA)<br />Entity lookup service. Find a document that mentions a URI or a keyw...
Linked Data SearchEngines (NYT)<br />The New York Times: Alumni In The News<br />http://data.nytimes.com/schools/schools.h...
Linked Data SearchEngines (NYT)<br />The New York Times: Source code is available<br />… and is based on SPARQL queries<br />
Oneadditionalmotivation: Open Government<br />Government and state administration should be opened at all levels to effect...
 B. Obama –Transparency and Open Government
 T. Berners-Lee - Raw data now!
 J. Manuel Alonso - ¿Qué es Open Data?
Open Government Data
8 Principles of Open Government Data</li></li></ul><li>Open Government. USA and UK<br />27<br />BOTTOM-UP<br />Top-down<br />
Linked Data Mashup (data.gov)<br />Clean Air Status and Trends (CASTNET)<br />http://data-gov.tw.rpi.edu/demo/exhibit/demo...
Linked Data in the UK<br />Education<br />http://education.data.gov.uk/id/school/106661<br />Parliament<br />http://parlia...
Linked Data Mashup (data.gov.uk)<br />Research Funding Explorer<br />http://bis.clients.talis.com/<br />
Open GovernmentSpain. Euskadi<br />31<br />
Open GovernmentSpain. Abredatos<br />32<br />
Open GovernmentSpain. Zaragoza <br />33<br />
Open GovernmentSpain. Asturias<br />34<br />
Linked Data Mashup (Waterquality)<br />Water quality in Asturias’ beaches<br />http://datos.fundacionctic.org/sandbox/astu...
Contents<br />IntroductiontoLinked Data<br />Linked Data publication<br />MethodologicalguidelinesforLinked Data publicati...
GeoLinkedData<br />It is an open initiative whose aim is to enrich the Web of Data with Spanish geospatial data.<br />This...
Motivation<br />		99.171 % English<br />		0.019 % Spanish<br />The Web of Data ismainlyfor<br />Englishspeakers<br />Poorp...
Related Work<br />
Impact of Geo.linkeddata.es<br />Number of triples in Spanish (July 2010): 1.412.248 <br />Number of triples in Spanish (E...
Processfor Publishing Linked Data onthe Web<br />Identification<br />of the data sources<br />Vocabulary<br />development<...
1. Identification and selection of the data sources<br />Identification<br />of the data sources<br />Instituto Geográfico...
1. Identification and selection of the data sources<br />Instituto Geográfico Nacional (GeographicSpanishInstitute)<br />M...
Monolingual
Numericalinformation
Particularaties</li></ul>Geo (textual level)<br />Temporal<br />43<br />Asunción Gómez Pérez<br />
1. Identification and selection of the data sources<br />IGN-E<br />
1. Identification and selection of the data sources<br />IndustryProductionIndex<br />Year<br />Province<br />
2. Vocabulary development<br />http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/#whichvocabs<br />Identificati...
2. Vocabularydevelopment<br />Features<br />Lightweight : <br />Taxonomies and a fewproperties<br />Consensuatedvocabulari...
Knowledge Resources<br />Ontological Resources<br />O. Design Patterns<br />3<br />4<br />O. Repositories and Registries<b...
Vocabularydevelopment: Specification<br />Content requirements: Identifythe set of questionsthattheontologyshouldanswer<br...
2. Lightweight Ontology Development<br />WGS84 Geo Positioning: an RDF vocabulary<br />scv:Dimension<br />scv:Item<br />sc...
Objetivos:<br />INSPIRE intenta conseguir fuentes armonizadas de Información Geográfica para dar soporte a la formulación,...
INSPIRE - Anexos<br />Luis Manuel Vilches Blázquez<br />
hydrOntology<br />Existencia de gran diversidad de problemas (múltiples fuentes, heterogeneidad de contenido y estructurac...
Fuentes<br />Tesauros y Bibliografía<br />Catálogos de fenómenos<br />Getty<br />FTT ADL<br />BCN25<br />GEMET<br />WFD<br...
Criterios de estructuración <br />Directiva Marco del Agua<br />Propuesta por Parlamento y Consejo de la UE<br />Lista de ...
 Modelización del dominio hidrográfico <br />Luis Manuel Vilches Blázquez<br />
Implementación & Formalizacón<br />+  Pellet<br />4<br />1<br />2<br />5<br />3<br />+150 conceptos (classes) , 47 tipos d...
2. Vocabularydevelopment: HydrOntology<br />58<br />Asunción Gómez Pérez<br />
3. Generation of RDF<br />From the Data sources<br />Geographic information (Databases)<br />Statistic information (.xsl)<...
3. Generation of the RDF Data<br />NOR2O<br />INE<br />ODEMapster<br />IGN<br />Geometry2RDF<br />Geospatial<br />column<b...
3. Generation of the RDF Data / instances <br />NOR2O is a software librarythatimplementsthetransformationsproposedbythePa...
Re-engineeringModelforNORs<br />Patterns for Re-engineering<br />Non-Ontological Resources <br />(PR-NOR)<br />Ontology Fo...
PR-NOR library at the ODP Portal<br />Technologicalsupport<br />http://ontologydesignpatterns.org/wiki/Submissions:Reengin...
3. Generation of the RDF Data – NOR2O<br />NOR2O<br />Year<br />Industry Production Index<br />Province<br />
hydrOntology & databases<br />NGN<br />1:25.000<br />multilingüe<br />
3. Generation of the RDF Data – R2O & ODEMapster<br />Creation of the R2O Mappings<br />
3. Generation of the RDF Data – Geometry2RDF<br />Oracle STO UTIL package <br />SELECT  TO_CHAR(SDO_UTIL.TO_GML311GEOMETRY...
3. Generation of the RDF Data – Geometry2RDF<br />
3. Generation of the RDF Data – Geometry2RDF<br />
3. Generation of the RDF data – RDF graphs<br />			IGN				        INE<br />So far<br />7 RDF NamedGraphs<br />1.412.248 tr...
4. Publication of the RDF Data<br />Identification<br />of the data sources<br />Vocabulary<br />development<br />SPARQL<b...
4. Publication of the RDF Data<br />
4. Publication of the RDF Data - License<br />License for GeoLinkedData<br />Creative Commons Attribution-ShareAlike 3.0 <...
5. Data cleansing<br />Identification<br />of the data sources<br />Lack of documentation of the IGN datasets<br />Broken ...
5. Data cleansing<br />URIs in Spanish<br />http://geo.linkeddata.es/ontology/Río<br />RDF allows UTF-8 characters for URI...
6. Linking of the RDF Data<br />Identification<br />of the data sources<br />Silk - A Link Discovery Framework for the Web...
6. Linking of the RDF Data<br />http://geo.linkeddata.es/page/Provincia/Granada<br />77<br />Asunción Gómez Pérez<br />
7. Enable effective discovery<br />Identification<br />of the data sources<br />Vocabulary<br />development<br />Generatio...
DEMO<br />http://geo.linkeddata.es/<br />
Provinces<br />
IndustryProductionIndex – Capital of Province<br />
Rivers<br />
Beaches<br />
Contents<br />IntroductiontoLinked Data<br />Linked Data publication<br />MethodologicalguidelinesforLinked Data publicati...
Upcoming SlideShare
Loading in...5
×

Introduction to Linked Data

3,538

Published on

Basic introductory talk about the Web of Linked Data, given to undergraduate and posgraduate students of Universidad del Valle (Cali, Colombia) in September 2010. Knowledge about Semantic Web is required

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
3,538
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
106
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide
  • 727 veces el term-based record-basedthesaurus
  • Transcript of "Introduction to Linked Data"

    1. 1. IntroductiontoLinked Data<br />Oscar Corcho, Asunción Gómez Pérez ({ocorcho, asun}@fi.upm.es)<br />Universidad Politécnica de Madrid<br />Universidad del Valle, Cali, ColombiaSeptember 10th 2010<br />Credits: Raúl García Castro, Oscar Muñoz, Jose Angel Ramos Gargantilla, María del Carmen Suárez de Figueroa, Boris Villazón, Alex de León, Víctor Saquicela, Luis Vilches, Miguel Angel García, Manuel Salvadores, Guillermo Alvaro, Juan Sequeda, Carlos Ruiz Moreno and manyothers<br />WorkdistributedunderthelicenseCreativeCommonsAttribution-Noncommercial-Share Alike 3.0<br />
    2. 2. Contents<br />IntroductiontoLinked Data<br />Linked Data publication<br />MethodologicalguidelinesforLinked Data publication<br />RDB2RDF tools<br />Technicalaspects of Linked Data publication<br />Linked Data consumption<br />2<br />
    3. 3. Whatisthe Web of Linked Data?<br />An extension of the current Web…<br />… where information and services are given well-defined and explicitly represented meaning, …<br />… so that it can be shared and used by humans and machines, ...<br />... better enabling them to work in cooperation<br />How?<br />Promoting information exchange by tagging web content with machineprocessable descriptions of its meaning. <br />And technologies and infrastructure to do this<br />And clear principles on how to publish data<br />data<br />
    4. 4. What is Linked Data?<br />Linked Data is a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF.<br />Part of the Semantic Web<br />Exposing, sharing and connecting data<br />Technologies: URIs and RDF (although others are also important)<br />
    5. 5. The fourprinciples (Tim Berners Lee, 2006)<br />Use URIs as names for things <br />Use HTTP URIs so that people can look up those names. <br />When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL) <br />Include links to other URIs, so that they can discover more things. <br />http://www.w3.org/DesignIssues/LinkedData.html<br />5<br />http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html<br />
    6. 6. Linked Open Data evolution<br /><ul><li>2007
    7. 7. 2008
    8. 8. 2009</li></li></ul><li>7<br />LOD Cloud May 2007<br />Facts:<br /><ul><li>Focal points:
    9. 9. DBPedia: RDFizedvesion of Wikipiedia; many ingoing and outgoing links
    10. 10. Music-related datasets
    11. 11. Big datasets include FOAF, US Census data
    12. 12. Size approx. 1 billion triples, 250k links</li></ul>Figure from [4]<br />
    13. 13. 8<br />LOD Cloud September 2008<br />Facts:<br /><ul><li>More than 35 datasets interlinked
    14. 14. Commercial players joined the cloud, e.g., BBC
    15. 15. Companies began to publish and host dataset, e.g. OpenLink, Talis, or Garlik.
    16. 16. Size approx. 2 billion triples, 3 million links</li></ul>Figure from [4]<br />
    17. 17. 9<br />LOD Cloud March 2009<br />Facts:<br /><ul><li>Big part from Linking Open Drug cloud and the BIO2RDF project (bottom)
    18. 18. Notable new datasets: Freebase, OpenCalais, ACM/IEEE
    19. 19. Size > 10 billion triples</li></ul>Figure from [4]<br />
    20. 20. LOD clouds<br />
    21. 21. WhyLinked Data?<br />Basically, tomovefrom a Web of documentsto a Web of Data<br />Let’s try anexample:<br /><ul><li>Tell me whichfootballplayers, born in theprovince of Albacete, in Spain, havescored a goal in theWorld Cup final</li></ul>Disclaimer:<br />Sorryto use anexampleaboutfootball, butyouhavetounderstandthatforseveralyearsSpaniardswillbetalkingaboutfootball a lot ;-)<br />Example courtesy of Guillermo Alvaro Rey<br />
    22. 22. Informationsearch in the Web of documents<br /><ul><li>¿?</li></ul>Example courtesy of Guillermo Alvaro Rey<br />
    23. 23. What we were actually looking for<br />Example courtesy of Guillermo Alvaro Rey<br />
    24. 24. Itwouldbebettertomake a data query…<br />(footballplayersfrom Albacete whoplayedEurocup 2008)<br />Example courtesy of Guillermo Alvaro Rey<br />
    25. 25. Howshouldwepublish data?<br />Formats in which data ispublishednowadays…<br />XML<br />HTML<br />DBs<br />APIs<br />CSV<br />XLS<br />…<br />However, mainlimitationsfrom a Web of Data point of view<br />Difficulttointegrate<br />Data isnotlinkedtoeachother, as ithappenswith Web documents.<br />
    26. 26. Which format do we use then?<br />RDF (ResourceDescription Framework)<br />Data model<br />Basedon triples: subject, predicate, object<br /><Oscar> <vive en> <Madrid><br /><Madrid> <es la capital de> <España><br /><España> <es campeona de> <Mundial de Fútbol><br />…<br />Serialised in differentformats<br />RDF/XML, RDFa, N3, Turtle, JSON…<br />
    27. 27. URIs (Universal-UniformResourceIdentifer)<br />Two types of identifiers can be used to identify Linked Data resources<br />URIRefs(Unique Resource IdentifiersReferences)<br />A URI and an optional FragmentIdentifier separated from the URI by the hash symbol ‘#’<br />http://www.ontology.org/people#Person<br />people:Person<br />Plain URIs can also be used, as in FOAF:<br />http://xmlns.com/foaf/0.1/Person<br />17<br />
    28. 28. How do wepublishLinked Data?<br />ExposingRelationalDatabasesorother similar formatsintoLinked Data<br />D2R<br />Triplify<br />R2O<br />NOR2O<br />Virtuoso<br />Ultrawrap<br />…<br />Usingnative RDF triplestores<br />Sesame<br />Jena<br />Owlim<br />Talisplatform<br />…<br />Incorporatingit in theform of RDFa in CMSslikeDrupal<br />18<br />
    29. 29. How do we consume Linked Data?<br />Linked Data browsers<br />To explore things and datasets and to navigate between them.<br />Tabulator Browser (MIT, USA), Marbles (FU Berlin, DE), OpenLink RDF Browser (OpenLink, UK), Zitgist RDF Browser (Zitgist, USA), Disco Hyperdata Browser (FU Berlin, DE), Fenfire (DERI, Ireland)<br />Linked Data mashups<br />Sites that mash up (thus combine Linked data)<br />Revyu.com (KMI, UK), DBtune Slashfacet (Queen Mary, UK), DBPedia Mobile (FU Berlin, DE), Semantic Web Pipes (DERI, Ireland) <br />Search engines<br />To search for Linked Data.<br />Falcons (IWS, China), Sindice (DERI, Ireland), MicroSearch (Yahoo, Spain), Watson (Open University, UK), SWSE (DERI, Ireland), Swoogle (UMBC, USA)<br />Listing on this slide by T. Heath, M. Hausenblas, C. Bizer, R. Cyganiak, O. Hartig<br />19<br />
    30. 30. Linked Data browsers (Disco)<br />
    31. 31. Linked Data Mashup (LinkedGeoData)<br />© Migración de datos a la Web de los Datos - Enfoques, técnicas y herramientas<br />Luis Manuel Vilches Blázquez<br />
    32. 32. Linked Data Mashup (DBpedia Mobile)<br />http://wiki.dbpedia.org/DBpediaMobile<br />© Migración de datos a la Web de los Datos - Enfoques, técnicas y herramientas<br />Luis Manuel Vilches Blázquez<br />
    33. 33. Linked Data Search Engines (Sindice and SIG.MA)<br />Entity lookup service. Find a document that mentions a URI or a keyword. <br />
    34. 34. Linked Data SearchEngines (NYT)<br />The New York Times: Alumni In The News<br />http://data.nytimes.com/schools/schools.html<br />
    35. 35. Linked Data SearchEngines (NYT)<br />The New York Times: Source code is available<br />… and is based on SPARQL queries<br />
    36. 36. Oneadditionalmotivation: Open Government<br />Government and state administration should be opened at all levels to effective public scrutiny and oversight<br />Objectives:<br />Transparency<br />Participation<br />Collaboration<br />Inclusion<br />Cost reduction<br />Interoperability<br />Reusability<br />Leadership<br />Market & Value<br />26<br /><ul><li>Some Links:
    37. 37. B. Obama –Transparency and Open Government
    38. 38. T. Berners-Lee - Raw data now!
    39. 39. J. Manuel Alonso - ¿Qué es Open Data?
    40. 40. Open Government Data
    41. 41. 8 Principles of Open Government Data</li></li></ul><li>Open Government. USA and UK<br />27<br />BOTTOM-UP<br />Top-down<br />
    42. 42. Linked Data Mashup (data.gov)<br />Clean Air Status and Trends (CASTNET)<br />http://data-gov.tw.rpi.edu/demo/exhibit/demo-8-castnet.php<br />
    43. 43. Linked Data in the UK<br />Education<br />http://education.data.gov.uk/id/school/106661<br />Parliament<br />http://parliament.psi.enakting.org/id/member/1227<br />Maps<br />E.g., London: http://data.ordnancesurvey.co.uk/id/7000000000041428<br />http://map.psi.enakting.org<br />Transport<br />http://www.dft.gov.uk/naptan/<br />SameAs service<br />http://www.sameas.org<br />Challenges<br />http://gov.tso.co.uk/openup/sparql/gov-transport<br />29<br />
    44. 44. Linked Data Mashup (data.gov.uk)<br />Research Funding Explorer<br />http://bis.clients.talis.com/<br />
    45. 45. Open GovernmentSpain. Euskadi<br />31<br />
    46. 46. Open GovernmentSpain. Abredatos<br />32<br />
    47. 47. Open GovernmentSpain. Zaragoza <br />33<br />
    48. 48. Open GovernmentSpain. Asturias<br />34<br />
    49. 49. Linked Data Mashup (Waterquality)<br />Water quality in Asturias’ beaches<br />http://datos.fundacionctic.org/sandbox/asturias/playas/<br />
    50. 50. Contents<br />IntroductiontoLinked Data<br />Linked Data publication<br />MethodologicalguidelinesforLinked Data publication<br />RDB2RDF tools<br />Technicalaspects of Linked Data publication<br />Linked Data consumption<br />36<br />
    51. 51. GeoLinkedData<br />It is an open initiative whose aim is to enrich the Web of Data with Spanish geospatial data.<br />This initiative has started off by publishing diverse information sources, such as National Geographic Institute of Spain (IGN-E) and National Statistics Institute (INE)<br />http://geo.linkeddata.es<br />
    52. 52. Motivation<br /> 99.171 % English<br /> 0.019 % Spanish<br />The Web of Data ismainlyfor<br />Englishspeakers<br />Poorpresence of Spanish<br />Source:Billion Triples dataset at http://km.aifb.kit.edu/projects/btc-2010/<br />Thanks to Aidan and Richard<br />
    53. 53. Related Work<br />
    54. 54. Impact of Geo.linkeddata.es<br />Number of triples in Spanish (July 2010): 1.412.248 <br />Number of triples in Spanish (EndAugust 2010): 21.463.088<br />40<br />Asunción Gómez Pérez<br />
    55. 55. Processfor Publishing Linked Data onthe Web<br />Identification<br />of the data sources<br />Vocabulary<br />development<br />Generation<br />of the RDF Data<br />Publication<br />of the RDF data <br />Data cleansing<br />Linking <br />the RDF data<br />Enable effective <br />discovery<br />
    56. 56. 1. Identification and selection of the data sources<br />Identification<br />of the data sources<br />Instituto GeográficoNacional<br />Vocabulary<br />development<br />Generation<br />of the RDF Data<br />Publication<br />of the RDF data <br />Data cleansing<br />Linking <br />the RDF data<br />Instituto Nacionalde Estadística<br />Enable effective <br />discovery<br />
    57. 57. 1. Identification and selection of the data sources<br />Instituto Geográfico Nacional (GeographicSpanishInstitute)<br />Multilingual (Spanish, Vasc, Gallician, Catalan)<br />Conceptualizationmistmatches<br />Granularity (scale concept)<br />Textual information<br />Particularaties<br />Longitude<br />latitude<br /><ul><li>Instituto Nacional de Estadística (StatisticSpanishInstitute)
    58. 58. Monolingual
    59. 59. Numericalinformation
    60. 60. Particularaties</li></ul>Geo (textual level)<br />Temporal<br />43<br />Asunción Gómez Pérez<br />
    61. 61. 1. Identification and selection of the data sources<br />IGN-E<br />
    62. 62. 1. Identification and selection of the data sources<br />IndustryProductionIndex<br />Year<br />Province<br />
    63. 63. 2. Vocabulary development<br />http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/#whichvocabs<br />Identification<br />of the data sources<br />Vocabulary<br />development<br />Generation<br />of the RDF Data<br />Thisisnotenough<br />Publication<br />of the RDF data <br />Data cleansing<br />Linking <br />the RDF data<br />Enable effective <br />discovery<br />
    64. 64. 2. Vocabularydevelopment<br />Features<br />Lightweight : <br />Taxonomies and a fewproperties<br />Consensuatedvocabularies<br />Toavoidthemappingproblems<br />Multilingual<br />Linked data are multilingual<br />TheNeOnmethodology can helpto<br />Re-enginer Non ontologicalresourcesintoontologies<br />Pros: use domainterminologyalreadyconsensuatedbydomainexperts<br />Withdraw in heavyweightontologiesthosefeaturesthatyoudon’tneed<br />Reuseexistingvocabularies<br />47<br />Identification<br />of the data sources<br />Vocabulary<br />development<br />Generation<br />of the RDF Data<br />Publication<br />of the RDF data <br />Data cleansing<br />Linking <br />the RDF data<br />Enable effective <br />discovery<br />Asunción Gómez Pérez<br />
    65. 65. Knowledge Resources<br />Ontological Resources<br />O. Design Patterns<br />3<br />4<br />O. Repositories and Registries<br />5<br />6<br />Flogic<br />RDF(S)<br />OWL<br />OntologicalResource<br />Reuse<br /> O. Aligning<br /> O. Merging<br />5<br />6<br />2<br />Ontology Design<br />Pattern Reuse<br />Non Ontological Resource<br />Reuse<br />4<br />3<br />6<br />Non Ontological Resources<br />2<br />Ontological Resource<br />Reengineering<br />7<br />Glossaries<br />Dictionaries<br />Lexicons<br />5<br />Non Ontological Resource<br />Reengineering<br />4<br />6<br />Classification<br />Schemas<br />Thesauri<br />Taxonomies<br />Alignments<br />2<br />RDF(S)<br />1<br />Flogic<br />O. Conceptualization<br />O. Implementation<br />O. Formalization<br />O. Specification<br />Scheduling<br />OWL<br />8<br />Ontology Restructuring<br />(Pruning, Extension, <br />Specialization, Modularization)<br />9<br />O. Localization<br />1,2,3,4,5,6,7,8, 9<br />Ontology Support Activities: Knowledge Acquisition (Elicitation); Documentation; <br />Configuration Management; Evaluation (V&V); Assessment<br />48<br />
    66. 66. Vocabularydevelopment: Specification<br />Content requirements: Identifythe set of questionsthattheontologyshouldanswer<br />Whichone are theprovinces in Spain?<br />Where are thebeaches?<br />Where are thereservoirs?<br />Identifytheproductionindex in Madrid<br />Whichoneisthecitywithhigherproductionindex?<br />Give me Madrid latitude and altitude<br />….<br />Non-contentrequirements<br />Theontologymustbe in thefourofficialSpanishlanguages<br />49<br />Asunción Gómez Pérez<br />
    67. 67. 2. Lightweight Ontology Development<br />WGS84 Geo Positioning: an RDF vocabulary<br />scv:Dimension<br />scv:Item<br />scv:Dataset<br />hydrographical phenomena (rivers, lakes, etc.)<br />Vocabulary for instants, intervals, durations, etc.<br />Names and international code systems for territories and groups<br />Ontology for OGC Geography Markup Language <br />reused<br />Following the INSPIRE <br />(INfrastructure for SPatial InfoRmation in Europe) recommendation.<br />hydrOntology,SCOVO, FAO Geopolitcal, WGS84, GML, and Time<br />
    68. 68. Objetivos:<br />INSPIRE intenta conseguir fuentes armonizadas de Información Geográfica para dar soporte a la formulación, implementación y evaluación de políticas comunitarias (Medio Ambiente, etc).<br />Fuentes de Información Geográfica: Bases de datos de los Estados Miembros (UE) a nivel local, regional, nacional e internacional.<br />Contexto – Directiva INSPIRE <br />Luis Manuel Vilches Blázquez<br />
    69. 69. INSPIRE - Anexos<br />Luis Manuel Vilches Blázquez<br />
    70. 70. hydrOntology<br />Existencia de gran diversidad de problemas (múltiples fuentes, heterogeneidad de contenido y estructuración, ambigüedad del lenguaje natural, etc.) en la información geográfica.<br />Necesidad de un modelo compartido para solventar los problemas de armonización y estructuración de la información hidrográfica.<br />hydrOntology es una ontología global de dominio desarrollada conforme a un acercamiento top-down. <br />Recubrir la mayoría de los fenómenos representables cartográficamente asociados al dominio hidrográfico.<br />Servir como marco de armonización entre los diferentes productores de información geo-espacial en el entorno nacional e internacional.<br />Comenzar con los pasos necesarios para obtener una mejor organización y gestión de la información geográfica (hidrográfica).<br />Luis Manuel Vilches Blázquez<br />
    71. 71. Fuentes<br />Tesauros y Bibliografía<br />Catálogos de fenómenos<br />Getty<br />FTT ADL<br />BCN25<br />GEMET<br />WFD<br />CC.AA.<br />EGM & ERM<br />Diccionarios y<br />Monografías<br />BCN200<br />Nomenclátor Geográfico Nacional<br />Nomenclátor Conciso<br />Luis Manuel Vilches Blázquez<br />
    72. 72. Criterios de estructuración <br />Directiva Marco del Agua<br />Propuesta por Parlamento y Consejo de la UE<br />Lista de definiciones de fenómenos hidrográficos<br />Proyecto SDIGER<br />Proyecto piloto INSPIRE<br />Dos cuencas, países e idiomas<br />Criterios semánticos<br />Diccionarios geográficos<br />Diccionario de la Real Academia de la Lengua<br />WordNet<br />Wikipedia<br />Bibliografía de varias áreas de conocimiento<br />Herencia: Estructuración actual de catálogos<br />Asesoramiento expertos en toponimia del IGN<br />Luis Manuel Vilches Blázquez<br />
    73. 73. Modelización del dominio hidrográfico <br />Luis Manuel Vilches Blázquez<br />
    74. 74. Implementación & Formalizacón<br />+ Pellet<br />4<br />1<br />2<br />5<br />3<br />+150 conceptos (classes) , 47 tipos de relaciones (properties) y 64 tipos de atributos (attribute types)<br />Luis Manuel Vilches Blázquez<br />
    75. 75. 2. Vocabularydevelopment: HydrOntology<br />58<br />Asunción Gómez Pérez<br />
    76. 76. 3. Generation of RDF<br />From the Data sources<br />Geographic information (Databases)<br />Statistic information (.xsl)<br />Geospatial information <br />Different technologies for RDF generation<br />Reengineering patterns<br />R20 and ODEMapster<br />Annotation tools<br />Geometry generation<br />Identification<br />of the data sources<br />Vocabulary<br />development<br />Generation<br />of the RDF Data<br />Publication<br />of the RDF data <br />Data cleansing<br />Linking <br />the RDF data<br />Enable effective <br />discovery<br />
    77. 77. 3. Generation of the RDF Data<br />NOR2O<br />INE<br />ODEMapster<br />IGN<br />Geometry2RDF<br />Geospatial<br />column<br />IGN<br />
    78. 78. 3. Generation of the RDF Data / instances <br />NOR2O is a software librarythatimplementsthetransformationsproposedbythePatternsfor Re-engineering Non-OntologicalResources (PR-NOR). Currentlywehave 16 PR-NORs.<br />PR-NORs define a procedurethattransforms a Non-OntologicalResource (NOR) componentsintoontologyelements. http://ontologydesignpatterns.org/<br />· Classification<br />schemes<br />NOR2O<br />· Thesauri<br />· Lexicons<br />NOR2O<br />FAO Water classification<br />· Classification scheme<br />· Path enumeration data model<br />· Implemented in a database<br />
    79. 79. Re-engineeringModelforNORs<br />Patterns for Re-engineering<br />Non-Ontological Resources <br />(PR-NOR)<br />Ontology Forward <br />Engineering<br /> Con-<br />ceptual<br />Speci-<br />fication<br />NOR Reverse <br />Engineering<br />Conceptua-<br />lization<br />Transformation<br />Requirements<br />Formalization<br />Design<br />Implementation<br />Implementation<br />RDF(S)<br />Non-Ontological Resource<br />Ontology<br />
    80. 80. PR-NOR library at the ODP Portal<br />Technologicalsupport<br />http://ontologydesignpatterns.org/wiki/Submissions:ReengineeringODPs<br />
    81. 81. 3. Generation of the RDF Data – NOR2O<br />NOR2O<br />Year<br />Industry Production Index<br />Province<br />
    82. 82. hydrOntology & databases<br />NGN<br />1:25.000<br />multilingüe<br />
    83. 83. 3. Generation of the RDF Data – R2O & ODEMapster<br />Creation of the R2O Mappings<br />
    84. 84. 3. Generation of the RDF Data – Geometry2RDF<br />Oracle STO UTIL package <br />SELECT TO_CHAR(SDO_UTIL.TO_GML311GEOMETRY(geometry)) <br /> AS Gml311Geometry<br />FROM "BCN200"."BCN200_0301L_RIO" c<br />WHERE c.Etiqueta='Arroyo'<br />
    85. 85. 3. Generation of the RDF Data – Geometry2RDF<br />
    86. 86. 3. Generation of the RDF Data – Geometry2RDF<br />
    87. 87. 3. Generation of the RDF data – RDF graphs<br /> IGN INE<br />So far<br />7 RDF NamedGraphs<br />1.412.248 triples<br />BTN25<br />BCN200<br />IPI<br />….<br />http://geo.linkeddata.es/dataset/IGN/BTN25<br />http://geo.linkeddata.es/dataset/IGN/BCN200<br />http://geo.linkeddata.es/dataset/INE/IPI<br />
    88. 88. 4. Publication of the RDF Data<br />Identification<br />of the data sources<br />Vocabulary<br />development<br />SPARQL<br />Linked Data<br />HTML<br />Generation<br />of the RDF Data<br />IncludingProvenance<br />Support<br />Publication<br />of the RDF data <br />Pubby<br />Pubby 0.3<br />Data cleansing<br />Linking <br />the RDF data<br />Enable effective <br />discovery<br />Virtuoso 6.1.0<br />
    89. 89. 4. Publication of the RDF Data<br />
    90. 90. 4. Publication of the RDF Data - License<br />License for GeoLinkedData<br />Creative Commons Attribution-ShareAlike 3.0 <br />GNU Free Documentation License<br />Each dataset will have its own specific license, IGN, INE, etc.<br />
    91. 91. 5. Data cleansing<br />Identification<br />of the data sources<br />Lack of documentation of the IGN datasets<br />Broken links: Spain, IGN resources<br />Lack of documentation of theontology<br />Missingenglish and spanishlabels<br />Building a spanish ontology and importing some concepts of other ontology (in English):<br />Importing the English ontology. Add annotations like a Spanish label to them.<br />Importing the English ontology, creating new concepts and properties with a Spanish name and map those to the English equivalents.<br />Re-declaring the terms of the English ontology that we need (using the same URI as in the English ontology), and adding a Spanish label.<br />Creating your own class and properties that model the same things as the English ontology. <br />Vocabulary<br />development<br />Generation<br />of the RDF Data<br />Publication<br />of the RDF data <br />Data cleansing<br />Linking <br />the RDF data<br />Enable effective <br />discovery<br />
    92. 92. 5. Data cleansing<br />URIs in Spanish<br />http://geo.linkeddata.es/ontology/Río<br />RDF allows UTF-8 characters for URIs<br />But, Linked Data URIs has to be URLs as well<br />So, non ASCII-US characters have to be %code<br />http://geo.linkeddata.es/ontology/R%C3%ADo<br />
    93. 93. 6. Linking of the RDF Data<br />Identification<br />of the data sources<br />Silk - A Link Discovery Framework for the Web of Data<br />First set of links: Provinces of Spain<br />86% accuracy<br />Vocabulary<br />development<br />Geonames<br />Generation<br />of the RDF Data<br />GeoLinkedData<br />DBPedia<br />Publication<br />of the RDF data <br />Data cleansing<br />Linking <br />the RDF data<br />Enable effective <br />discovery<br />
    94. 94. 6. Linking of the RDF Data<br />http://geo.linkeddata.es/page/Provincia/Granada<br />77<br />Asunción Gómez Pérez<br />
    95. 95. 7. Enable effective discovery<br />Identification<br />of the data sources<br />Vocabulary<br />development<br />Generation<br />of the RDF Data<br />Publication<br />of the RDF data <br />Data cleansing<br />Linking <br />the RDF data<br />Enable effective <br />discovery<br />
    96. 96. DEMO<br />http://geo.linkeddata.es/<br />
    97. 97. Provinces<br />
    98. 98. IndustryProductionIndex – Capital of Province<br />
    99. 99. Rivers<br />
    100. 100. Beaches<br />
    101. 101. Contents<br />IntroductiontoLinked Data<br />Linked Data publication<br />MethodologicalguidelinesforLinked Data publication<br />RDB2RDF tools<br />Technicalaspects of Linked Data publication<br />Linked Data consumption<br />84<br />
    102. 102. Ontology-based Access to DBs<br />1<br />3<br />2<br />4<br />Build a new ontology from 1 DB schema and 1 DB<br />Align the ontology built with approach 1 with a legacy ontology<br />Align an existing DB with a legacy ontology<br /> a) Massive dump (semantic data warehouse)<br /> b) Query-driven<br />Align an ontology network with n DB schemas and other data sources<br /> a) Massive dump (semantic data warehouse)<br /> b) Query-driven<br />new ontology<br />existing ontology<br />
    103. 103. Ontology-based Access to Databases<br />Universidad<br />Profesor<br />Doctorando<br />Ontología<br />?<br />Organización<br />Personal<br />BDR<br />Modelo<br />Relacional<br />Pregunta: Nombre de los profesores de la universidad UPM<br />* Un profesor es una persona cuyo puesto es “docente”<br />* Una universidad es una organización de tipo “3”<br />Procesador<br />Procesado de la consulta de acuerdo a la descripción formal de correspondencia<br />Consulta: valores de la columna nombre de los registros de la tabla Personal para los que el valor de la columna puesto is “docente” que estén relacionados con al menos un registro de la tabla Organización con el valor “3” en la columna tipo y “UPM” en la columna nombre.<br />
    104. 104. Align data sourceswithlegacyontologies<br />Aeropuertos<br />Ontología O2<br />Ontología O1<br />Centro<br />Comunicaciones<br />PuntoGPS<br />Estación<br />Punto Europeo<br />Aeropuerto<br />PuntoAsiatico<br />PuntoEspañol<br />Aeropuerto<br />f (Aeropuertos)<br />=<br />PuntoEuropeo<br />f (Aeropuertos)<br />=<br />RC(O2,M1)<br />RC(O1,M1)<br />Modelo Relacional M1<br />
    105. 105. R2O<br /> is a declarative language to specify mappings between relational data sources and ontologies.<br /><xml><br />R2O Mapping<br /></xml><br />Organization<br />Persons<br />University<br />RDB<br />Professor<br />Student<br />Relational Model<br />Ontology<br />
    106. 106. Example: types of mappingsneeded<br />Attibute Mapping with transformation<br />(Regular Expression)<br />Attibute Direct Mapping<br />Relation Mapping w. Transformation<br />(Regular Expression)<br />Relation Mapping w. Transformation<br />(Keyword search)<br />
    107. 107. Population example (II)<br />Population example (II)<br />The Operation element defines a transformation based on a regular expression to be applied to the database column for extracting property values<br />
    108. 108. For concepts...<br />One or more concepts can be extracted from a single data field (not in 1NF).<br />A view maps exactly one concept in the ontology.<br />For attributes...<br />A column in a database view maps directly an attribute or a relation.<br />A subset of the columns in the view map a concept in the ontology.<br />A subset (selection) of the records of a database view map a concept in the ontology.<br />A column in a database view maps an attribute or a relation after some transformation.<br />A subset of the records of a database view map a concept in the onto. but the selection cannot be made using SQL.<br />A set of columns in a database view map an attribute or a relation.<br />R2O (Relational-to-Ontology) Language<br />
    109. 109. R2O Basic Syntax<br /><conceptmap-defname="Customer"><br /> <identified-by> Table key </identified-by><br /> <uri-as>operation</uri-as><br /> <applies-if>condition</applies-if><br /> <joins-via> expression </joins-via><br /> <documentation>description …</documentation><br /> <described-by>attributes,relations</described-by><br /></conceptmap-def><br /><attributemap-defname="http://esperonto/ff#Title"><br /> <aftertransform><br /> <operationoper-id="constant"><br /> <arg-restrictionon-param="const-val"><br /> <has-column>fsb_ajut.titol</has-column><br /> </arg-restriction><br /> </operation><br /> </aftertransform><br /></attributemap-def><br /><relationmap-defname="http://esperonto/ff#isCandidateFor"><br /> <to-concept name="http://esperonto/ff#FundOpp"><br /> <joins-via><br /> <operationoper-id=“equals"><br /> <arg-restrictionon-param="value1"><br /> <has-column>fsb_ajut.id</has-column><br /> </arg-restriction><br /> <arg-restriction on-param="value2"><br /> <has-column>fsb_candidate.forFund</has-column><br /> </arg-restriction><br /> </operation><br /> </joins-via><br /></relationmap-def><br />
    110. 110. ODEMapster<br /> generates RDF instances from relational instances based on the mapping description expressed in the R2O document <br />
    111. 111. 94<br />Mapping Design<br /><ul><li>3 Mapping Creation Steps
    112. 112. Load Ontology
    113. 113. Load Database(s)
    114. 114. Create mapping
    115. 115. 2 Usage Modes
    116. 116. Online mode (run time query execution)
    117. 117. Offline mode (materialized RDF dump)</li></li></ul><li>Contents<br />IntroductiontoLinked Data<br />Linked Data publication<br />MethodologicalguidelinesforLinked Data publication<br />RDB2RDF tools<br />Technicalaspects of Linked Data publication<br />Linked Data consumption<br />95<br />
    118. 118. Using an RDF repository<br />Itallowsstoring and accessing RDF data<br />Forexample, SESAME (http://www.openrdf.org/)<br />Downloaditfromhttp://www.openrdf.org/download.jsp<br />openrdf-sesame-2.3.0-sdk.zip<br />Deploythe .war in Tomcat (JDK and Tomcatneeded)<br />Create a repository at<br />http://localhost:8080/openrdf-sesame<br />Check: <br />http://localhost:8080/openrdf-sesame/repositories/XXXX<br />http://localhost:8080/openrdf-sesame/repositories/XXX/statements<br />
    119. 119. Linked Data frontend<br />Toexpose data as Linked Data<br />Includingcontentnegotiation, etc.<br />Forexample, Pubby<br />http://www4.wiwiss.fu-berlin.de/pubby/<br />Installation<br />Use pubby-0.3.zip<br />Deploythewebapp folder (and rename)in Tomcat<br />Modify config.n3<br />Restarttomcat<br />Check: http://localhost:8080/XXX/<br />
    120. 120. REST API<br /><ul><li>Forexample, RDF2Go
    121. 121. Java abstractionover RDF repositories</li></ul>http://rdf2go.semweb4j.org/<br />
    122. 122. Add SPARQL explorer<br />Forexample, SNORQL (http://wiki.github.com/kurtjx/SNORQL/)<br />
    123. 123. ValidatingLinked Data URLs<br />http://validator.linkeddata.org/vapour<br />100<br />
    124. 124. Contents<br />IntroductiontoLinked Data<br />Linked Data publication<br />MethodologicalguidelinesforLinked Data publication<br />RDB2RDF tools<br />Technicalaspects of Linked Data publication<br />Linked Data consumption<br />101<br />
    125. 125. RelFinder: finding relations in Linked Data<br />E.g., relations between films<br />“Pulp Fiction”, “Kill Bill” y “Reservoir Dogs”<br />
    126. 126. Exerciseon data.gov.uk<br /><ul><li>Publicschools in London thatcontaintheword “music”</li></li></ul><li>Exercise: find information in DBPedia<br />Image by http://www.flickr.com/photos/bflv/<br />http://dbpedia.org/resource/Darth_Vader)<br /><ul><li>Findficticious serial killers in DBPedia</li></ul>(etc)<br />
    127. 127. Designing URI sets forthePublic Sector (UK)<br />http://www.cabinetoffice.gov.uk/media/301253/puiblic_sector_uri.pdf<br />105<br />
    128. 128. Asociación Española de Linked Data<br />106<br />
    129. 129. IntroductiontoLinked Data<br />Oscar Corcho, Asunción Gómez Pérez ({ocorcho, asun}@fi.upm.es)<br />Universidad Politécnica de Madrid<br />Universidad del Valle, Cali, ColombiaSeptember 10th 2010<br />Credits: Raúl García Castro, Oscar Muñoz, Jose Angel Ramos Gargantilla, María del Carmen Suárez de Figueroa, Boris Villazón, Alex de León, Víctor Saquicela, Luis Vilches, Miguel Angel García, Manuel Salvadores, Guillermo Alvaro, Juan Sequeda, Carlos Ruiz Moreno and manyothers<br />WorkdistributedunderthelicenseCreativeCommonsAttribution-Noncommercial-Share Alike 3.0<br />
    1. A particular slide catching your eye?

      Clipping is a handy way to collect important slides you want to go back to later.

    ×