Statistical Linked Data

                        Boris Villazón-Terrazas
      Facultad de Informática, Universidad Politécnica de Madrid
    Campus de Montegancedo sn, 28660 Boadilla del Monte, Madrid
                       http://www.oeg-upm.net
                         bvillazon@fi.upm.es
             Phone: 34.91.3366605, Fax: 34.91.3524819
      Slides available at: http://www.slideshare.net/boricles/


Acknowledgements: Local Government Management Services
Board - Ireland, Michael Hausenblas, Richard Cyganiak, Oscar
Corcho, Asunción Gómez-Pérez

Work distributed under the license Creative Commons
Attribution-Noncommercial-Share Alike 3.0
Specification – Spreadsheet about statistics

• Service Indicators of Ireland




   • Data for 2009




Specification - URI design

• Base URI

  • http://stats.data-gov.ie


• TBOX URI

  • We use the RDF Data Cube Vocabulary


• ABOX URI

  • http://stats.data-gov.ie/data/{resourceType}/{resource}
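
The ABOX URI template can be illustrated with a small sketch. The helper name `abox_uri` and the choice to percent-encode each path segment are assumptions made for illustration; the slides only give the base URI and the template.

```python
from urllib.parse import quote

BASE_URI = "http://stats.data-gov.ie"  # base URI given on the slide


def abox_uri(resource_type: str, resource: str) -> str:
    """Fill the /data/{resourceType}/{resource} template (hypothetical helper).

    Each path segment is percent-encoded on its own, so slashes inside
    `resource` are kept as path separators.
    """
    segments = [resource_type] + resource.split("/")
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"{BASE_URI}/data/{path}"


print(abox_uri("concept", "f-1-2"))
print(abox_uri("dsd", "f-1-2"))
```

Encoding segment by segment keeps multi-level identifiers such as `f-1-2/2009/county/donegal` readable while still escaping spaces or other reserved characters.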




Vocabularies for modelling statistical data

• SCOVO
   • Michael Hausenblas, Danny Ayers, Lee Feigenbaum, Tom
     Heath, Wolfgang Halb, Yves Raimond, 2008
   • It is deprecated

     http://purl.org/NET/scovo




• SDMX – RDF
   • Richard Cyganiak, Chris Dollin, Dave Reynolds, 2010
   • Provides a general means to publish statistical data in RDF
     (exploiting the SDMX information model).
http://publishing-statistical-data.googlecode.com/svn/trunk/specs/src/main/html/index.html




Vocabularies for modelling statistical data
• RDF Data Cube
     • Richard Cyganiak, Dave Reynolds, Jeni Tennison, 2010

     • SDMX is massive, and many parts of the standard are not
       really relevant for the kind of web-based data publishing that
       RDF excels in. So they identified a core of SDMX that
       seemed most relevant for data publishing, called that core
       “Data Cube”, and published it separately.

     • The W3C Government Linked Data Working Group is going to
       propose RDF Data Cube for modelling government statistical
       data.



http://linked-statistics.org/datacube/



RDF Data Cube Vocabulary

qb: http://purl.org/linked-data/cube#




RDF Data Cube – Main elements




RDF Data Cube - Concepts

Diagram (as triples):

  stats:concept/f        rdf:type        skos:Concept
  stats:concept/f-1      rdf:type        skos:Concept
  stats:concept/f-2      rdf:type        skos:Concept
  stats:concept/f-1-2    rdf:type        skos:Concept
  stats:concept/f-1      skos:broader    stats:concept/f
  stats:concept/f-2      skos:broader    stats:concept/f
  stats:concept/f-1-2    skos:broader    stats:concept/f-1

rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
skos: http://www.w3.org/2004/02/skos/core#
stats: http://stats.data-gov.ie/data/
RDF Data Cube - Properties

Diagram (as triples):

  stats:property/f-1-2    rdf:type              qb:MeasureProperty
  stats:property/f-1-2    rdfs:subPropertyOf    qb:obsValue
  stats:property/f-1-2    qb:concept            stats:concept/f-1-2
  stats:property/f-1-2    rdfs:range            xsd:double
  stats:property/f-1-2    rdfs:label            “Average time …”

rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
qb: http://purl.org/linked-data/cube#
stats: http://stats.data-gov.ie/data/
RDF Data Cube – Data structure definition

Diagram (as triples):

  stats:dsd/f-1-2    rdf:type        qb:DataStructureDefinition
  stats:dsd/f-1-2    qb:component    stats:componet/geoArea
  stats:dsd/f-1-2    qb:component    stats:componet/refPeriod
  stats:dsd/f-1-2    qb:component    stats:componet/f-1-2

rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
qb: http://purl.org/linked-data/cube#
stats: http://stats.data-gov.ie/data/
RDF Data Cube – DataSet

Diagram (as triples):

  stats:data/f-1-2                         rdf:type        qb:DataSet
  stats:data/f-1-2                         qb:structure    stats:dsd/f-1-2
  stats:data/f-1-2/2009/county/donegal     qb:dataSet      stats:data/f-1-2
  stats:data/f-1-2/2009/county/donegal     rdf:type        qb:Observation
  ……
  stats:data/f-1-2/2009/county/cavan       qb:dataSet      stats:data/f-1-2
  stats:data/f-1-2/2009/county/cavan       rdf:type        qb:Observation

rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
qb: http://purl.org/linked-data/cube#
stats: http://stats.data-gov.ie/data/
RDF Data Cube – Observation
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
qb: http://purl.org/linked-data/cube#
stats: http://stats.data-gov.ie/data/
property: http://stats.data-gov.ie/property/
sdmx-dimension: http://purl.org/linked-data/sdmx/2009/dimension#

Diagram (as triples), subject stats:data/f-1-2/2009/county/donegal:

  rdf:type                    qb:Observation
  qb:dataSet                  stats:data/f-1-2
  sdmx-dimension:refPeriod    http://reference.data.gov.uk/id/year/2009
  property:geoArea            http://geo.data-gov.ie/county/donegal
  property:f-1-2              5.29
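
As a rough illustration of the observation slide, the following sketch serialises one observation as N-Triples using only the Python standard library. The function names are hypothetical, the prefixes are expanded exactly as listed on the slide, and typing the value as `xsd:double` is an assumption taken from the `rdfs:range` shown on the Properties slide.

```python
# Prefixes as listed on the Observation slide.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "qb": "http://purl.org/linked-data/cube#",
    "stats": "http://stats.data-gov.ie/data/",
    "property": "http://stats.data-gov.ie/property/",
    "sdmx-dimension": "http://purl.org/linked-data/sdmx/2009/dimension#",
}
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"  # assumed datatype


def expand(prefix, local):
    """Expand prefix:local into an N-Triples IRI term."""
    return f"<{PREFIXES[prefix]}{local}>"


def observation_ntriples(indicator, year, county, value):
    """Return the five triples of one observation (hypothetical helper)."""
    obs = expand("stats", f"data/{indicator}/{year}/county/{county}")
    pairs = [
        (expand("rdf", "type"), expand("qb", "Observation")),
        (expand("qb", "dataSet"), expand("stats", f"data/{indicator}")),
        (expand("sdmx-dimension", "refPeriod"),
         f"<http://reference.data.gov.uk/id/year/{year}>"),
        (expand("property", "geoArea"),
         f"<http://geo.data-gov.ie/county/{county}>"),
        (expand("property", indicator), f'"{value}"^^<{XSD_DOUBLE}>'),
    ]
    return [f"{obs} {predicate} {obj} ." for predicate, obj in pairs]


for line in observation_ntriples("f-1-2", 2009, "donegal", 5.29):
    print(line)
```

One line per triple keeps the output easy to load into a triple store or to compare with the diagram edge by edge.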
Fuseki endpoint

• http://data-gov.ie/sparql




Fuseki endpoint
• Now you can play a bit with SPARQL … ;)

• What is the value of the Service Indicator F-2-1
  (Percentage of cases in respect of fire in which first
  attendance is at the scene within 10 minutes) in the
  County of Donegal?
   PREFIX data: <http://stats.data-gov.ie/data/>
   PREFIX qb: <http://purl.org/linked-data/cube#>
   PREFIX sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#>
   PREFIX property: <http://stats.data-gov.ie/property/>
   SELECT ?obs ?ds ?val
   WHERE {
     ?obs a qb:Observation .
     ?obs sdmx-dimension:refPeriod <http://reference.data.gov.uk/id/year/2009> .
     ?obs property:geoArea <http://geo.data-gov.ie/county/donegal> .
     ?obs qb:dataSet ?ds .
     ?obs property:f-2-1 ?val .
   }
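
A query like the one above could be sent to the endpoint over HTTP. The sketch below only builds the request URL, following the SPARQL Protocol convention of passing the query in a `query` parameter; the helper names are hypothetical, and whether the endpoint is still reachable is not something the slides guarantee.

```python
from urllib.parse import urlencode

ENDPOINT = "http://data-gov.ie/sparql"  # endpoint named on the slide

# Doubled braces are literal braces for str.format.
QUERY_TEMPLATE = """\
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#>
PREFIX property: <http://stats.data-gov.ie/property/>
SELECT ?obs ?ds ?val
WHERE {{
  ?obs a qb:Observation .
  ?obs sdmx-dimension:refPeriod <http://reference.data.gov.uk/id/year/{year}> .
  ?obs property:geoArea <http://geo.data-gov.ie/county/{county}> .
  ?obs qb:dataSet ?ds .
  ?obs property:{indicator} ?val .
}}
"""


def build_query(indicator, year, county):
    """Fill in the indicator, year and county (hypothetical helper)."""
    return QUERY_TEMPLATE.format(indicator=indicator, year=year, county=county)


def request_url(query, endpoint=ENDPOINT):
    """Build a GET URL with the query percent-encoded in `query=`."""
    return f"{endpoint}?{urlencode({'query': query})}"


print(request_url(build_query("f-2-1", 2009, "donegal")))
```

The resulting URL can be fetched with any HTTP client; asking for `application/sparql-results+json` in the `Accept` header is a common way to get machine-readable results.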
Metadata publication – VoID
   • VoID description
        • void.ttl




http://www.w3.org/TR/void/
sitemap4rdf


• Simple command line tool
• Sends a SPARQL query to list all URIs
• Generates sitemap

  sitemap4rdf http://yoursite/sparql http://yoursite/resource/

  Example:

  sitemap4rdf http://geo.linkeddata.es/sparql http://geo.linkeddata.es/
  sitemap4rdf http://data-gov.ie/sparql http://stats.data-gov.ie


• Run sitemap4rdf, specifying the SPARQL endpoint
  and the prefix of the URLs to include in the sitemap
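
The idea behind the tool can be sketched in a few lines. This is not the sitemap4rdf implementation: the listing query, the function name, and the plain sitemaps.org output format are all assumptions made for illustration, and the URIs are passed in as a list instead of being fetched from an endpoint.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Assumed query for listing subject URIs; the real tool may use another one.
LIST_URIS_QUERY = "SELECT DISTINCT ?s WHERE { ?s ?p ?o . FILTER(isIRI(?s)) }"


def build_sitemap(uris, prefix):
    """Return sitemap XML for the URIs that start with `prefix`."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for uri in sorted(set(uris)):  # de-duplicate, stable order
        if uri.startswith(prefix):
            url = ET.SubElement(urlset, "url")
            ET.SubElement(url, "loc").text = uri
    return ET.tostring(urlset, encoding="unicode")


example_uris = [
    "http://stats.data-gov.ie/data/concept/f",
    "http://stats.data-gov.ie/data/concept/f-1",
    "http://geo.data-gov.ie/county/donegal",  # filtered out by the prefix
]
print(build_sitemap(example_uris, "http://stats.data-gov.ie"))
```

Filtering by prefix mirrors the second command-line argument: only resources under the published namespace end up in the sitemap.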


Exploitation

http://county-rank.data-gov.ie/




Information about the County




Service Indicators - I




Service Indicators - II




Service Indicators - III



