Digital Enterprise Research Institute                                                                www.deri.ie




                 Self-service Linked Government Data

                      Fadi Maali, Richard Cyganiak, Vassilios Peristeras
                                                          firstname.lastname@deri.org




 Copyright 2011 Digital Enterprise Research Institute. All rights reserved.




                                                                               Enabling networked knowledge
data.gov.uk




data.gov








                             4997 datasets



                             2590 in CSV




                             272 in RDF




Why Linked Government Data (LGD)?




                                                    Web accessible

                                                    Interlinkable

                                                    Decentralised publishing of
                                                     data

                                                    Standardised




LGD




                                  We need government
                                  data as Linked Data, not
                                  just Raw Data
                                   ... aha, and of good
                                   quality!


LGD is Costly




                                We want governments to
                                provide Linked Data not
                                just Raw Data… and of
                                good quality



    http://code.google.com/p/google-refine/

Self-service Approach




                                                   DIY
          Provide tools, models and algorithms that enable the self-service approach (a
          publishing pipeline)



Publishing pipeline requirements

         Interactive approach

         Graphical user interface

         Reproducibility and traceability

         Flexibility

         Decentralisation

         Results sharing




Google Refine

         Powerful data editing, transformation and enriching capabilities

         Import capabilities, e.g. JSON, Excel, CSV, TSV, XML

         Persistent undo/redo history

         Popular in open data community

         Extensible and under active development

         Free and open source



    http://code.google.com/p/google-refine/

DIY Recipe (1000 feet view)




     1. Publishers provide RDF representation of their catalogues
     2. Tool support to select datasets of interest and put them into RDF
     3. User shares the RDF data




DIY Recipe (100 feet view)




     1. Publishers provide RDF representation of their catalogues
          dcat
     2. Tool support to select datasets of interest and put them into RDF
          Google Refine
          + RDF export extension
          + RDF reconciliation extension
     3. User shares the RDF data
          Share RDF data publicly (on CKAN.net) along with a sufficient provenance description




A Walk-through (1/5 to 5/5)

Data on CKAN.net




Data Provenance (simplified)




     :dataset         --:wasExportedBy-->  :export-process
     :dataset         --dct:source-->      :csv-ds
     :export-process  --:operations-->     :json-history
     :export-process  --:usedData-->       :csv-ds



DIY Recipe (10 feet view)




    Dcat
         An RDF vocabulary to describe government catalogues


         Current status: First Public Working Draft by the W3C GLD Working
          Group
          http://www.w3.org/TR/vocab-dcat/


         Used on data.gov.uk (RDFa) and CKAN-based catalogues

     “Enabling Interoperability of Government Data Catalogues.”
     EGOV 2010


DIY Recipe (10 feet view)


    RDF Mapping




More on RDF Mapping

         RDF-centric mapping

         Multiple tree structure

         Expression language for
          custom expressions

         Vocabularies/ontologies
          support
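
The idea of an RDF-centric mapping with custom expressions can be sketched as below. This is an illustration under stated assumptions, not the RDF export extension itself: the mapping structure, the function `rows_to_ntriples`, the sample CSV and the `example.org` vocabulary are all hypothetical. Plain Python lambdas stand in for the expression language.

```python
import csv
import io

# Hypothetical sample data; not taken from the slides.
SAMPLE_CSV = "id,name,population\na1,Alpha,100\na2,Beta,250\n"

BASE = "http://example.org/"
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"

# The mapping starts from the desired RDF shape: one subject node per row,
# plus (predicate, expression) pairs that compute each object from the row.
MAPPING = {
    "subject": lambda row: f"<{BASE}area/{row['id']}>",
    "rules": [
        ("<http://www.w3.org/2000/01/rdf-schema#label>",
         lambda row: '"%s"' % row["name"].strip()),
        (f"<{BASE}vocab/population>",
         lambda row: '"%d"^^<%s>' % (int(row["population"]), XSD_INT)),
    ],
}


def rows_to_ntriples(csv_text, mapping):
    """Apply the mapping to every CSV row and return N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = mapping["subject"](row)
        for predicate, expression in mapping["rules"]:
            triples.append(f"{subject} {predicate} {expression(row)} .")
    return triples


if __name__ == "__main__":
    for line in rows_to_ntriples(SAMPLE_CSV, MAPPING):
        print(line)
```

Keeping the mapping as data, separate from the rows, is what makes it possible to re-run the same export on an updated copy of the file.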




DIY Recipe (10 feet view)




    Interlinking


         Diagram components: Google Refine, RDF Reconcile Extension, Crafted RDF,
          Silk Server, Silk LSL, SPARQL endpoint, SPARQL endpoint with fulltext
          extension



More on Interlinking

         Interlinking as a pre-RDF-creation step: fewer unnecessary
          owl:sameAs links


         Focus on the interface


         Semi-automatic process with good user support




   “Re-using Cool URIs: Entity Reconciliation Against LOD Hubs.”
   LDOW 2011
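
A semi-automatic reconciliation step might look like the sketch below. It swaps in plain string similarity from Python's `difflib` for whatever matching the reconciliation extension actually performs. The candidate list, the `reconcile` function and the 0.9 threshold are assumptions chosen for illustration. A real setup would fetch candidates from a SPARQL endpoint instead of a hard-coded dictionary.

```python
from difflib import SequenceMatcher

# Illustrative candidates (URI -> label); normally the result of a query.
CANDIDATES = {
    "http://dbpedia.org/resource/Dublin": "Dublin",
    "http://dbpedia.org/resource/Fingal": "Fingal",
    "http://dbpedia.org/resource/Galway": "Galway",
}


def score(a, b):
    """Case-insensitive similarity between two labels, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def reconcile(value, candidates, auto_threshold=0.9, limit=3):
    """Rank candidate URIs for a cell value.

    A match is accepted automatically only above `auto_threshold`;
    otherwise `match` is None and the ranked list is left for the user.
    """
    ranked = sorted(
        ((score(value, label), uri) for uri, label in candidates.items()),
        reverse=True,
    )[:limit]
    match = None
    if ranked and ranked[0][0] >= auto_threshold:
        match = ranked[0][1]
    return {"value": value, "match": match, "candidates": ranked}


if __name__ == "__main__":
    print(reconcile("fingal ", CANDIDATES))
    print(reconcile("Cork", CANDIDATES))
```

Because the cell is resolved to an existing URI before any RDF is written, the export can use that URI directly rather than minting a new one and linking the two afterwards.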


DIY Recipe (10 feet view)



    Sharing
         Capture the operations applied to the data


         Represent them according to the Open Provenance Model
          Vocabulary (OPMV)


         Share the data and its provenance on CKAN.net


   CKAN Extension for Google Refine
   http://lab.linkeddata.deri.ie/2011/grefine-ckan/
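
The provenance description could be produced along these lines. This is a hedged sketch rather than the CKAN extension's output: the property names `:wasExportedBy`, `:usedData` and `:operations` follow the simplified provenance slide, but their namespace, the function names and the shape of the operation records are assumptions. `opmv:Artifact` and `opmv:Process` are OPMV classes.

```python
import json

OPMV = "http://purl.org/net/opmv/ns#"
DCT = "http://purl.org/dc/terms/"
EX = "http://example.org/prov#"  # hypothetical namespace for the ':' terms


def record_operations(operations):
    """Serialise the applied operations so the export can be replayed."""
    return json.dumps(operations, indent=2, sort_keys=True)


def provenance_turtle(dataset_uri, csv_uri, process_uri, history_uri):
    """Describe how an RDF dataset was exported from a source CSV file."""
    return "\n".join([
        f"@prefix opmv: <{OPMV}> .",
        f"@prefix dct: <{DCT}> .",
        f"@prefix : <{EX}> .",
        "",
        f"<{dataset_uri}> a opmv:Artifact ;",
        f"    dct:source <{csv_uri}> ;",
        f"    :wasExportedBy <{process_uri}> .",
        f"<{process_uri}> a opmv:Process ;",
        f"    :usedData <{csv_uri}> ;",
        f"    :operations <{history_uri}> .",
    ])


if __name__ == "__main__":
    # Made-up operation records; not Google Refine's actual history format.
    ops = [{"op": "rename-column", "from": "nm", "to": "name"}]
    print(record_operations(ops))
    print(provenance_turtle(
        "http://example.org/ds/1.rdf",
        "http://example.org/ds/1.csv",
        "http://example.org/process/1",
        "http://example.org/ds/1-history.json",
    ))
```

Publishing the operation history next to the data lets someone else check, repeat or adjust the conversion.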


Case study - Fingal Catalogue




         Number of datasets:   74 (68 available in CSV and 56 in XML)
         Top publishers:       Fingal County Council (41), Central Statistics Office (17), Department of Education and Science (4)
         Top domains:          Demographics (18), Citizen Participation (18), Education (9)



          http://data.fingal.ie




Case study - Fingal Catalogue

         The catalogue was represented in Dcat

         60 datasets were converted to RDF using the publishing
          pipeline (~300K triples)

         Data Cube was used for statistical data

         URIs were used consistently and shared among datasets, so
          the data was interlinked

         Externally linked to DBpedia




Open Issues

         Evaluating/Refining the crowd-sourcing aspects of the RDF
          creation process


         RDF Modeling: Can we assist RDF modeling by examining the
          raw data?




Lessons Learned

         Interactive approach

         Focus on plumbing tools together but don’t enforce a rigid
          process

         Make it easy to adopt best practices and good recipes




AAAI 2012 at Standord
 

Recently uploaded

Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 

Recently uploaded (20)

Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 

Self-service Linked Government Data

  • 1. Digital Enterprise Research Institute www.deri.ie Self-service Linked Government Data Fadi Maali, Richard Cyganiak, Vassilios Peristeras firstname.lastname@deri.org Copyright 2011 Digital Enterprise Research Institute. All rights reserved. Enabling networked knowledge
  • 2. data.gov.uk
  • 3. data.gov.uk
  • 4. data.gov
  • 5. data.gov: 4997 datasets; 2590 in CSV; 272 in RDF
  • 6. Why Linked Government Data (LGD)? Web accessible; interlinkable; decentralised publishing of data; standardised
  • 7. LGD: We need government data as Linked Data, not just raw data ... aha, and of good quality!
  • 8. LGD is Costly: We want governments to provide Linked Data, not just raw data, and of good quality. http://code.google.com/p/google-refine/
  • 9. Self-service Approach: DIY
  • 10. Self-service Approach: DIY. Provide tools, models and algorithms that enable the self-service approach (a publishing pipeline)
  • 11. Publishing pipeline requirements: interactive approach; graphical user interface; reproducibility and traceability; flexibility; decentralisation; results sharing
  • 12. Publishing pipeline requirements: interactive approach; graphical user interface; reproducibility and traceability; flexibility; decentralisation; results sharing
  • 13. Google Refine: powerful data editing, transformation and enriching capabilities; import capabilities, e.g. JSON, Excel, CSV, TSV, XML; persistent undo/redo history; popular in the open data community; extensible and under active development; free and open source. http://code.google.com/p/google-refine/
  • 14. DIY Recipe (1000-foot view): (1) Publishers provide an RDF representation of their catalogues; (2) tool support to select datasets of interest and put them into RDF; (3) user shares the RDF data
  • 15. DIY Recipe (100-foot view): (1) Publishers provide an RDF representation of their catalogues: dcat; (2) tool support to select datasets of interest and put them into RDF; (3) user shares the RDF data
  • 16. DIY Recipe (100-foot view): (1) Publishers provide an RDF representation of their catalogues: dcat; (2) tool support to select datasets of interest and put them into RDF: Google Refine + RDF export extension + RDF reconciliation extension; (3) user shares the RDF data
  • 17. DIY Recipe (100-foot view): (1) Publishers provide an RDF representation of their catalogues: dcat; (2) tool support to select datasets of interest and put them into RDF: Google Refine + RDF export extension + RDF reconciliation extension; (3) user shares the RDF data: share RDF data publicly (on CKAN.net) along with a sufficient provenance description
  • 18. A Walk-through (1/5)
  • 19. A Walk-through (2/5)
  • 20. A Walk-through (3/5)
  • 21. A Walk-through (4/5)
  • 22. A Walk-through (5/5)
  • 23. Data on CKAN.net
  • 24. Data Provenance (simplified): [diagram; labels: :dataset, dct:source, :wasExportedBy, :json-history, :export-process, :csv-ds, :operations, :usedData]
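
One way to read the labels on slide 24 is as a small provenance graph linking an exported dataset to its CSV source, the export process, and the recorded operation history. The sketch below is only a guess at that shape: the labels come from the slide, but which node each edge connects is an assumption, as is the `example.org` namespace standing in for the `:` prefix. It uses plain tuples rather than an RDF library.

```python
# Sketch only: the labels come from slide 24, but which node each edge
# connects is an ASSUMPTION, as is the example.org namespace for ':'.
EX = "http://example.org/prov/"
DCT = "http://purl.org/dc/terms/"

TRIPLES = [
    (EX + "dataset", DCT + "source", EX + "csv-ds"),
    (EX + "dataset", EX + "wasExportedBy", EX + "export-process"),
    (EX + "export-process", EX + "usedData", EX + "csv-ds"),
    (EX + "export-process", EX + "operations", EX + "json-history"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def to_ntriples(triples=TRIPLES):
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def trace(dataset, triples=TRIPLES):
    """Walk back from an exported dataset to its source and operation history."""
    process = objects(dataset, EX + "wasExportedBy", triples)[0]
    return {
        "source": objects(dataset, DCT + "source", triples)[0],
        "process": process,
        "history": objects(process, EX + "operations", triples)[0],
    }
```

Calling `trace(EX + "dataset")` returns the assumed source, process and history URIs, which is the kind of lookup a consumer would need in order to reproduce or audit a conversion.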
  • 25. DIY Recipe (10-foot view): Dcat. An RDF vocabulary to describe government catalogues; current status: First Public Working Draft by the W3C GLD Working Group, http://www.w3.org/TR/vocab-dcat/; used on data.gov.uk (RDFa) and CKAN-based catalogues. “Enabling Interoperability of Government Data Catalogues.” EGOV 2010
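
To make the catalogue description on slide 25 concrete, here is a minimal sketch that emits a Turtle description of a catalogue with its datasets and CSV distributions. The class and property names (`dcat:Catalog`, `dcat:Dataset`, `dcat:Distribution`, `dcat:dataset`, `dcat:distribution`, `dcat:accessURL`) are from the DCAT namespace as later standardised, so the 2011 working draft may differ in detail. All URIs and titles are invented placeholders, and the function assumes a non-empty dataset list and titles that contain no quotes.

```python
# Sketch only: URIs and titles are placeholders, not real catalogue entries.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def catalogue_turtle(catalogue_uri, title, datasets):
    """Build a Turtle document for one catalogue.

    `datasets` is a non-empty list of dicts with keys 'uri', 'title', 'csv'.
    Titles are assumed to need no escaping.
    """
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    links = " , ".join(f"<{d['uri']}>" for d in datasets)
    lines += [
        "",
        f"<{catalogue_uri}> a dcat:Catalog ;",
        f'    dct:title "{title}" ;',
        f"    dcat:dataset {links} .",
    ]
    for d in datasets:
        dist_uri = d["uri"] + "#csv"  # hypothetical URI for the distribution
        ds_title = d["title"]
        lines += [
            "",
            f"<{d['uri']}> a dcat:Dataset ;",
            f'    dct:title "{ds_title}" ;',
            f"    dcat:distribution <{dist_uri}> .",
            "",
            f"<{dist_uri}> a dcat:Distribution ;",
            f"    dcat:accessURL <{d['csv']}> .",
        ]
    return "\n".join(lines)


example = catalogue_turtle(
    "http://example.org/catalogue",
    "Example catalogue",
    [{
        "uri": "http://example.org/dataset/1",
        "title": "Example dataset",
        "csv": "http://example.org/files/1.csv",
    }],
)
```

A tool can read such a description to list the datasets available in CSV without scraping the catalogue's HTML pages.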
  • 26. DIY Recipe (10-foot view): RDF Mapping
  • 27. More on RDF Mapping: RDF-centric mapping; multiple tree structure; expression language for custom expressions; vocabularies/ontologies support
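
The mapping ideas on slide 27 can be illustrated with a small sketch: a template made of one or more trees of nodes, where each node's URI and each literal value is computed from the current row by an expression. This is not the RDF export extension's actual template format; the dictionary layout, the `slug` helper, the `example.org` vocabulary and the sample row are all hypothetical, and Python lambdas stand in for the expression language.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/"  # placeholder vocabulary and URI space


def slug(value):
    """Turn a cell value into a URI-friendly token (hypothetical helper)."""
    return value.strip().lower().replace(" ", "-")


# A list of root nodes, i.e. possibly several trees per row.
MAPPING = [
    {
        "uri": lambda row: EX + "school/" + slug(row["school"]),
        "type": EX + "School",
        "links": [
            (RDFS_LABEL, lambda row: row["school"]),
            (EX + "pupils", lambda row: int(row["pupils"])),
            (EX + "locatedIn", {  # nested node: a child in the tree
                "uri": lambda row: EX + "town/" + slug(row["town"]),
                "type": EX + "Town",
                "links": [(RDFS_LABEL, lambda row: row["town"])],
            }),
        ],
    },
]


def apply_node(node, row, out):
    """Emit triples for one template node and return its subject URI."""
    subject = node["uri"](row)
    out.append((subject, RDF_TYPE, node["type"]))
    for predicate, target in node["links"]:
        if isinstance(target, dict):
            obj = apply_node(target, row, out)  # recurse into the child node
        else:
            obj = target(row)  # evaluate the cell expression
        out.append((subject, predicate, obj))
    return subject


def map_rows(csv_text, mapping=MAPPING):
    """Apply every root node of the mapping to every CSV row."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for root in mapping:
            apply_node(root, row, triples)
    return triples
```

URIs and literals are both plain Python values here, so the sketch does not distinguish them on output; a real exporter would keep them distinct when serialising.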
  • 28. DIY Recipe (10-foot view): Interlinking. [diagram; labels: Silk LSL, Silk Server, RDF Reconcile Extension, Google Refine, crafted RDF, SPARQL endpoint, SPARQL endpoint with fulltext extension]
  • 29. More on Interlinking: interlinking as a pre-RDF-creation step, giving fewer unnecessary owl:sameAs links; focus on the interface; semi-automatic process with good user support. “Re-using Cool URIs: Entity Reconciliation Against LOD Hubs.” LDOW 2011
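
The semi-automatic reconciliation described on slide 29 can be sketched as follows: each cell value is scored against candidate labels, a confident match is accepted automatically, and otherwise a ranked shortlist is left for the user to choose from. This is an illustration under stated assumptions, not the RDF reconciliation extension's algorithm: the candidate table, the `example.org` URIs, the 0.9 threshold and the use of `difflib` string similarity are all choices made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical label -> URI table standing in for a SPARQL endpoint lookup.
CANDIDATES = {
    "Dublin": "http://example.org/place/dublin",
    "Fingal": "http://example.org/place/fingal",
    "Swords, Dublin": "http://example.org/place/swords",
}


def score(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def reconcile(value, candidates=CANDIDATES, auto_threshold=0.9, limit=3):
    """Rank candidates for a cell value; auto-match only above the threshold."""
    ranked = sorted(
        ((score(value, label), uri) for label, uri in candidates.items()),
        reverse=True,
    )[:limit]
    match = ranked[0][1] if ranked and ranked[0][0] >= auto_threshold else None
    return {
        "match": match,  # URI to reuse directly, or None if the user must pick
        "candidates": [(uri, s) for s, uri in ranked],
    }
```

Because the matched URI is chosen before any RDF is generated, it can be used directly as the resource's identifier instead of minting a new URI and adding an owl:sameAs link afterwards.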
  • 30. DIY Recipe (10-foot view): Sharing. Captures the operations applied to the data; represents them according to the Open Provenance Model Vocabulary (OPMV); shares the data and its provenance on CKAN.net. CKAN Extension for Google Refine: http://lab.linkeddata.deri.ie/2011/grefine-ckan/
  • 31. Case study - Fingal Catalogue: Number of datasets: 74 (68 available in CSV and 56 in XML). Top publishers: Fingal County Council (41), Central Statistics Office (17), Department of Education and Science (4). Top domains: Demographics (18), Citizen Participation (18), Education (9). http://data.fingal.ie
  • 32. Case study - Fingal Catalogue: the catalogue was represented in Dcat; 60 datasets were converted to RDF using the publishing pipeline (~300K triples); Data Cube was used for statistical data; URIs were used consistently and shared among datasets, so the data was interlinked; externally linked to DBpedia
  • 33. Open Issues: evaluating/refining the crowd-sourcing aspects of the RDF creation process; RDF modeling: can we assist RDF modeling by examining the raw data?
  • 34. Lessons Learned: interactive approach; focus on plumbing tools together but don’t enforce a rigid process; make it easy to adopt best practices and good recipes