Digital Enterprise Research Institute                                                                www.deri.ie




                 Self-service Linked Government Data

                      Fadi Maali, Richard Cyganiak, Vassilios Peristeras
                                                          firstname.lastname@deri.org




 Copyright 2011 Digital Enterprise Research Institute. All rights reserved.




                                                                               Enabling networked knowledge
data.gov.uk
data.gov




                             4997 datasets



                             2590 in CSV




                             272 in RDF
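
To put the three counts above in proportion, the shares can be worked out directly. This is a small sketch that uses only the figures quoted on the slide; the variable and function names are illustrative.

```python
# Dataset counts quoted on the slide for data.gov.
total_datasets = 4997
csv_datasets = 2590
rdf_datasets = 272


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


csv_share = share(csv_datasets, total_datasets)
rdf_share = share(rdf_datasets, total_datasets)
print(f"CSV: {csv_share}% of datasets, RDF: {rdf_share}% of datasets")
```

Roughly half of the catalogue is available as CSV, while only about one dataset in twenty is available as RDF.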




Why Linked Government Data (LGD)?




                                                    Web accessible

                                                    Interlinkable

                                                    Decentralised publishing of
                                                     data

                                                    Standardised




LGD




                                  We need government
                                  data as Linked Data not
                                  just Raw Data
                                   … aha, and of good
                                   quality!


LGD is Costly




                                We want governments to
                                provide Linked Data not
                                just Raw Data… and of
                                good quality



    http://code.google.com/p/google-refine/

Self-service Approach




                                                   DIY
          Provide tools, models and algorithms that enable the self-service approach (a
          publishing pipeline)



Publishing pipeline requirements

         Interactive approach

         Graphical user interface

         Reproducibility and traceability

         Flexibility

         Decentralisation

         Results sharing




Google Refine

         Powerful data editing, transformation and enriching capabilities

         Import capabilities for many formats, e.g. JSON, Excel, CSV, TSV, XML

         Persistent undo/redo history

         Popular in the open data community

         Extensible and under active development

         Free and open source



    http://code.google.com/p/google-refine/

DIY Recipe (1000 feet view)




 Publishers provide RDF representation of their catalogues
 Tool support to select datasets of interest and put them into RDF
 User shares the RDF data




DIY Recipe (100 feet view)



 Publishers provide RDF representation of their catalogues: dcat
 Tool support to select datasets of interest and put them into RDF: Google Refine + RDF export extension + RDF reconciliation extension
 User shares the RDF data: share RDF data publicly (on CKAN.net) along with a sufficient provenance description




A Walk-through (1/5)




A Walk-through (2/5)




A Walk-through (3/5)




A Walk-through (4/5)




A Walk-through (5/5)




Data on CKAN.net




Data Provenance (simplified)




                                                         :dataset



                                                                         dct:source
                                                  :wasExportedBy




   :json-history                                     :export-process                   :csv-ds
                                    :operations                        :usedData
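
The simplified provenance graph above can be written down as a handful of triples. The sketch below is one possible reading of the diagram: the edge directions, the `http://example.org/provenance#` namespace and the helper names are assumptions made for illustration, and only `dct:source` comes from a real vocabulary (Dublin Core Terms).

```python
# Hypothetical namespace for the resources and the simplified properties on the slide.
EX = "http://example.org/provenance#"
# Dublin Core Terms namespace (real vocabulary).
DCT = "http://purl.org/dc/terms/"

# (subject, predicate, object) edges as read off the simplified diagram.
# The direction of each edge is an assumption.
triples = [
    (EX + "dataset", EX + "wasExportedBy", EX + "export-process"),
    (EX + "dataset", DCT + "source", EX + "csv-ds"),
    (EX + "export-process", EX + "operations", EX + "json-history"),
    (EX + "export-process", EX + "usedData", EX + "csv-ds"),
]


def to_ntriples(rows):
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in rows)


def sources_of(dataset_uri, rows):
    """Follow dct:source edges to find what a dataset was derived from."""
    return [o for s, p, o in rows if s == dataset_uri and p == DCT + "source"]


print(to_ntriples(triples))
print(sources_of(EX + "dataset", triples))
```

Anyone who finds the RDF dataset can follow these edges back to the original CSV file and to the recorded operation history that produced it.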



DIY Recipe (10 feet view)




    Dcat
         An RDF vocabulary to describe government catalogues


         Current status: First Public Working Draft by the W3C GLD Working
          Group
          http://www.w3.org/TR/vocab-dcat/


         Used on data.gov.uk (RDFa) and CKAN-based catalogues

     “Enabling Interoperability of Government Data Catalogues.”
     EGOV 2010
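
As an illustration of what a dcat catalogue description can look like, the sketch below generates a small Turtle document with one `dcat:Catalog` and its `dcat:Dataset` entries. The catalogue URI, dataset URIs, titles and download URLs are made up for the example, and the function names are hypothetical; only the `dcat:` and `dct:` terms come from the published vocabularies.

```python
def turtle_literal(text: str) -> str:
    """Quote a string as a Turtle literal, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def describe_catalog(catalog_uri, title, datasets):
    """Build a Turtle description of a catalogue.

    `datasets` is a list of (dataset_uri, dataset_title, download_url) tuples.
    Each statement is written out in full to keep the punctuation simple.
    """
    lines = [
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .",
        "@prefix dct: <http://purl.org/dc/terms/> .",
        "",
        f"<{catalog_uri}> a dcat:Catalog .",
        f"<{catalog_uri}> dct:title {turtle_literal(title)} .",
    ]
    for dataset_uri, dataset_title, download_url in datasets:
        lines += [
            f"<{catalog_uri}> dcat:dataset <{dataset_uri}> .",
            f"<{dataset_uri}> a dcat:Dataset .",
            f"<{dataset_uri}> dct:title {turtle_literal(dataset_title)} .",
            f"<{dataset_uri}> dcat:distribution "
            f"[ a dcat:Distribution ; dcat:accessURL <{download_url}> ] .",
        ]
    return "\n".join(lines)


# Illustrative input only: none of these URIs refer to real resources.
turtle = describe_catalog(
    "http://example.org/catalogue",
    "Example government catalogue",
    [
        ("http://example.org/dataset/1", "Dataset one", "http://example.org/files/1.csv"),
        ("http://example.org/dataset/2", "Dataset two", "http://example.org/files/2.csv"),
    ],
)
print(turtle)
```

A tool that understands dcat can read such a description to list the datasets in a catalogue and locate their downloadable files.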


DIY Recipe (10 feet view)


    RDF Mapping




More on RDF Mapping

         RDF-centric mapping

         Multiple tree structure

         Expression language for
          custom expressions

         Vocabularies/ontologies
          support
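
The idea of an RDF-centric mapping with custom expressions can be sketched as a template in which the subject and every property value are computed from a table row. This is not the format used by the RDF export extension: the mapping structure, the base URI, the `population` property and the sample rows are all assumptions made for illustration.

```python
import csv
import io
from urllib.parse import quote

# Made-up sample table; the values carry no real-world meaning.
SAMPLE_CSV = "name,population\nTown A,1200\nTown B,340\n"

BASE = "http://example.org/area/"                      # hypothetical base URI
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX_POPULATION = "http://example.org/def/population"    # hypothetical property

# The mapping is described from the RDF side: one subject node per row, plus
# a list of (predicate, expression) pairs. Expressions are plain functions here.
MAPPING = {
    "subject": lambda row: BASE + quote(row["name"].strip().lower().replace(" ", "-")),
    "properties": [
        (RDFS_LABEL, lambda row: row["name"].strip()),
        (EX_POPULATION, lambda row: row["population"].strip()),
    ],
}


def escape(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def apply_mapping(csv_text, mapping):
    """Apply the mapping to every row and return N-Triples lines."""
    result = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = mapping["subject"](row)
        for predicate, expression in mapping["properties"]:
            result.append(f'<{subject}> <{predicate}> "{escape(expression(row))}" .')
    return result


mapped = apply_mapping(SAMPLE_CSV, MAPPING)
print("\n".join(mapped))
```

Keeping the mapping as data, separate from the code that applies it, means the same mapping can be reused on another table with the same columns.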




DIY Recipe (10 feet view)




    Interlinking


     [Diagram labels: Google Refine; RDF Reconcile Extension; Crafted RDF; Silk Server; Silk LSL; SPARQL endpoint; SPARQL endpoint with fulltext extension]



More on Interlinking

         Interlinking as a pre-RDF-creation step → fewer unnecessary
          owl:sameAs links


         Focus on the interface


         Semi-automatic process with good user support
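
A semi-automatic reconciliation step of the kind listed above could look roughly like the sketch below: candidates are fetched by label, ranked by string similarity, and accepted automatically only when the best score is high enough, otherwise left for the user to review. This is not the algorithm of the RDF reconciliation extension; the similarity measure (`difflib`), the 0.9 threshold, the SPARQL 1.1 query shape and the example URIs are all assumptions.

```python
from difflib import SequenceMatcher


def candidate_query(label: str, limit: int = 10) -> str:
    """Build a SPARQL 1.1 query for resources whose rdfs:label contains `label`."""
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?resource ?label WHERE {\n"
        "  ?resource rdfs:label ?label .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?label)), LCASE("{escaped}")))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def rank_candidates(cell_value, candidates):
    """Score (uri, label) candidates by string similarity, best first."""
    scored = [
        (SequenceMatcher(None, cell_value.lower(), label.lower()).ratio(), uri, label)
        for uri, label in candidates
    ]
    return sorted(scored, reverse=True)


def reconcile(cell_value, candidates, auto_threshold=0.9):
    """Accept the best candidate automatically, or flag the cell for user review."""
    ranked = rank_candidates(cell_value, candidates)
    if ranked and ranked[0][0] >= auto_threshold:
        return {"match": ranked[0][1], "needs_review": False, "candidates": ranked}
    return {"match": None, "needs_review": True, "candidates": ranked}


# Hand-written candidates standing in for the results of the query above.
CANDIDATES = [
    ("http://example.org/id/fingal", "Fingal"),
    ("http://example.org/id/finglas", "Finglas"),
]

print(candidate_query("Fingal"))
print(reconcile("fingal", CANDIDATES)["match"])
print(reconcile("Fingal Co.", CANDIDATES)["needs_review"])
```

Because the match is made before any RDF is generated, the exported data can use the existing URI directly instead of minting a new one and linking it with owl:sameAs afterwards.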




   “Re-using Cool URIs: Entity Reconciliation Against LOD Hubs.”
   LDOW 2011


DIY Recipe (10 feet view)



    Sharing
         Captures the operations applied to the data


         Represents them according to the Open Provenance Model
          Vocabulary (OPMV)


         Shares the data and its provenance on CKAN.net


   CKAN Extension for Google Refine
   http://lab.linkeddata.deri.ie/2011/grefine-ckan/
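
Capturing the applied operations starts from the operation history that Google Refine can export as JSON. The sketch below reads such a history and turns it into an ordered list of steps that could accompany a shared dataset. The sample history is hand-written, the field names (`op`, `description`) are assumed to follow Refine's export, and the sketch does not produce OPMV terms or talk to CKAN.

```python
import json
from collections import Counter

# Hand-written sample, assumed to follow the shape of a Refine history export.
HISTORY_JSON = """
[
  {"op": "core/column-rename", "description": "Rename column Area to area_name"},
  {"op": "core/text-transform", "description": "Text transform on cells in column area_name"},
  {"op": "core/text-transform", "description": "Text transform on cells in column population"}
]
"""


def summarise_history(history_text):
    """Return (operation counts, numbered step descriptions) for a JSON history."""
    operations = json.loads(history_text)
    counts = Counter(entry["op"] for entry in operations)
    steps = [
        f"{number}. {entry['description']}"
        for number, entry in enumerate(operations, start=1)
    ]
    return counts, steps


counts, steps = summarise_history(HISTORY_JSON)
print(dict(counts))
print("\n".join(steps))
```

Publishing the ordered steps next to the data lets someone else see how the RDF was derived and, in principle, replay the same operations on the source file.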


Case study - Fingal Catalogue




         Number of datasets:   74 (68 available in CSV and 56 in XML)
         Top publishers:       Fingal County Council (41), Central Statistics Office (17), Department of Education and Science (4)
         Top domains:          Demographics (18), Citizen Participation (18), Education (9)



          http://data.fingal.ie





         The catalogue was represented in Dcat

         60 datasets were converted to RDF using the publishing
          pipeline (~300K triples)

         Data Cube was used for statistical data

         URIs were used consistently and shared among datasets →
          the data was interlinked

         Externally linked to DBpedia




Open Issues

         Evaluating/Refining the crowd-sourcing aspects of the RDF
          creation process


         RDF Modeling: Can we assist RDF modeling by examining the
          raw data?




Lessons Learned

         Interactive approach

         Focus on plumbing tools together but don’t enforce a rigid
          process

         Make it easy to adopt best practices and good recipes




                                                      Enabling networked knowledge
                                                                                 34

More Related Content

What's hot

Towards Expertise Modelling for Routing Data Cleaning Tasks within a Communit...
Towards Expertise Modelling for Routing Data Cleaning Tasks within a Communit...Towards Expertise Modelling for Routing Data Cleaning Tasks within a Communit...
Towards Expertise Modelling for Routing Data Cleaning Tasks within a Communit...Umair ul Hassan
 
Enterprise Energy Management using a Linked Dataspace for Energy Intelligence
Enterprise Energy Management using a Linked Dataspace for Energy IntelligenceEnterprise Energy Management using a Linked Dataspace for Energy Intelligence
Enterprise Energy Management using a Linked Dataspace for Energy IntelligenceEdward Curry
 
Using Linked Data and the Internet of Things for Energy Management
Using Linked Data and the Internet of Things for Energy ManagementUsing Linked Data and the Internet of Things for Energy Management
Using Linked Data and the Internet of Things for Energy ManagementEdward Curry
 
An Environmental Chargeback for Data Center and Cloud Computing Consumers
An Environmental Chargeback for Data Center and Cloud Computing ConsumersAn Environmental Chargeback for Data Center and Cloud Computing Consumers
An Environmental Chargeback for Data Center and Cloud Computing ConsumersEdward Curry
 
Federating Distributed Social Data to Build an Interlinked Online Information...
Federating Distributed Social Data to Build an Interlinked Online Information...Federating Distributed Social Data to Build an Interlinked Online Information...
Federating Distributed Social Data to Build an Interlinked Online Information...Alexandre Passant
 
Externalization Trend
Externalization TrendExternalization Trend
Externalization TrendNigel Green
 
WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes
WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and OutcomesWikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes
WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomesjodischneider
 
Envisioning a discussion dashboard for collective intelligence of web convers...
Envisioning a discussion dashboard for collective intelligence of web convers...Envisioning a discussion dashboard for collective intelligence of web convers...
Envisioning a discussion dashboard for collective intelligence of web convers...jodischneider
 
Applied Linked Open Data: A Mobile Solution for Galway Volvo Ocean Race
Applied Linked Open Data: A Mobile Solution for Galway Volvo Ocean RaceApplied Linked Open Data: A Mobile Solution for Galway Volvo Ocean Race
Applied Linked Open Data: A Mobile Solution for Galway Volvo Ocean RaceDerilinx
 
System of Systems Information Interoperability using a Linked Dataspace
System of Systems Information Interoperability using a Linked DataspaceSystem of Systems Information Interoperability using a Linked Dataspace
System of Systems Information Interoperability using a Linked DataspaceEdward Curry
 
Knowledge management on the desktop
Knowledge management on the desktopKnowledge management on the desktop
Knowledge management on the desktopLaura Dragan
 
Transitioning web application frameworks towards the Semantic Web (master the...
Transitioning web application frameworks towards the Semantic Web (master the...Transitioning web application frameworks towards the Semantic Web (master the...
Transitioning web application frameworks towards the Semantic Web (master the...Benjamin Heitmann
 
Big Data and Content Management. SkyDox and the European Court of Human Righ...
Big Data and Content Management.  SkyDox and the European Court of Human Righ...Big Data and Content Management.  SkyDox and the European Court of Human Righ...
Big Data and Content Management. SkyDox and the European Court of Human Righ...SkyDox LTD
 
Introduction to Open Data
Introduction to Open DataIntroduction to Open Data
Introduction to Open DataDerilinx
 
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...Benjamin Heitmann
 
Stefan Decker Keynote at CSHALS
Stefan Decker Keynote at CSHALSStefan Decker Keynote at CSHALS
Stefan Decker Keynote at CSHALSStefan Decker
 
AiLibrary Garage.com application review - by Gordon Kraft
AiLibrary Garage.com   application review - by Gordon Kraft AiLibrary Garage.com   application review - by Gordon Kraft
AiLibrary Garage.com application review - by Gordon Kraft Gordon Kraft
 
Dcat - Machine Accessible Data Catalogues
Dcat - Machine Accessible Data CataloguesDcat - Machine Accessible Data Catalogues
Dcat - Machine Accessible Data CataloguesFadi Maali
 
What your hairstyle says about your political preferences, and why you should...
What your hairstyle says about your political preferences, and why you should...What your hairstyle says about your political preferences, and why you should...
What your hairstyle says about your political preferences, and why you should...Benjamin Heitmann
 
Semantic Enterprise 2.0 - Enabling Semantic Web technologies in Enterprise 2...
Semantic Enterprise 2.0 - Enabling Semantic Web technologies in Enterprise 2...Semantic Enterprise 2.0 - Enabling Semantic Web technologies in Enterprise 2...
Semantic Enterprise 2.0 - Enabling Semantic Web technologies in Enterprise 2...Alexandre Passant
 

What's hot (20)

Towards Expertise Modelling for Routing Data Cleaning Tasks within a Communit...
Towards Expertise Modelling for Routing Data Cleaning Tasks within a Communit...Towards Expertise Modelling for Routing Data Cleaning Tasks within a Communit...
Towards Expertise Modelling for Routing Data Cleaning Tasks within a Communit...
 
Enterprise Energy Management using a Linked Dataspace for Energy Intelligence
Enterprise Energy Management using a Linked Dataspace for Energy IntelligenceEnterprise Energy Management using a Linked Dataspace for Energy Intelligence
Enterprise Energy Management using a Linked Dataspace for Energy Intelligence
 
Using Linked Data and the Internet of Things for Energy Management
Using Linked Data and the Internet of Things for Energy ManagementUsing Linked Data and the Internet of Things for Energy Management
Using Linked Data and the Internet of Things for Energy Management
 
An Environmental Chargeback for Data Center and Cloud Computing Consumers
An Environmental Chargeback for Data Center and Cloud Computing ConsumersAn Environmental Chargeback for Data Center and Cloud Computing Consumers
An Environmental Chargeback for Data Center and Cloud Computing Consumers
 
Federating Distributed Social Data to Build an Interlinked Online Information...
Federating Distributed Social Data to Build an Interlinked Online Information...Federating Distributed Social Data to Build an Interlinked Online Information...
Federating Distributed Social Data to Build an Interlinked Online Information...
 
Externalization Trend
Externalization TrendExternalization Trend
Externalization Trend
 
WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes
WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and OutcomesWikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes
WikiSym2012 Deletion Discussions in Wikipedia: Decision Factors and Outcomes
 
Envisioning a discussion dashboard for collective intelligence of web convers...
Envisioning a discussion dashboard for collective intelligence of web convers...Envisioning a discussion dashboard for collective intelligence of web convers...
Envisioning a discussion dashboard for collective intelligence of web convers...
 
Applied Linked Open Data: A Mobile Solution for Galway Volvo Ocean Race
Applied Linked Open Data: A Mobile Solution for Galway Volvo Ocean RaceApplied Linked Open Data: A Mobile Solution for Galway Volvo Ocean Race
Applied Linked Open Data: A Mobile Solution for Galway Volvo Ocean Race
 
System of Systems Information Interoperability using a Linked Dataspace
System of Systems Information Interoperability using a Linked DataspaceSystem of Systems Information Interoperability using a Linked Dataspace
System of Systems Information Interoperability using a Linked Dataspace
 
Knowledge management on the desktop
Knowledge management on the desktopKnowledge management on the desktop
Knowledge management on the desktop
 
Transitioning web application frameworks towards the Semantic Web (master the...
Transitioning web application frameworks towards the Semantic Web (master the...Transitioning web application frameworks towards the Semantic Web (master the...
Transitioning web application frameworks towards the Semantic Web (master the...
 
Big Data and Content Management. SkyDox and the European Court of Human Righ...
Big Data and Content Management.  SkyDox and the European Court of Human Righ...Big Data and Content Management.  SkyDox and the European Court of Human Righ...
Big Data and Content Management. SkyDox and the European Court of Human Righ...
 
Introduction to Open Data
Introduction to Open DataIntroduction to Open Data
Introduction to Open Data
 
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...
 
Stefan Decker Keynote at CSHALS
Stefan Decker Keynote at CSHALSStefan Decker Keynote at CSHALS
Stefan Decker Keynote at CSHALS
 
AiLibrary Garage.com application review - by Gordon Kraft
AiLibrary Garage.com   application review - by Gordon Kraft AiLibrary Garage.com   application review - by Gordon Kraft
AiLibrary Garage.com application review - by Gordon Kraft
 
Dcat - Machine Accessible Data Catalogues
Dcat - Machine Accessible Data CataloguesDcat - Machine Accessible Data Catalogues
Dcat - Machine Accessible Data Catalogues
 
What your hairstyle says about your political preferences, and why you should...
What your hairstyle says about your political preferences, and why you should...What your hairstyle says about your political preferences, and why you should...
What your hairstyle says about your political preferences, and why you should...
 
Semantic Enterprise 2.0 - Enabling Semantic Web technologies in Enterprise 2...
Semantic Enterprise 2.0 - Enabling Semantic Web technologies in Enterprise 2...Semantic Enterprise 2.0 - Enabling Semantic Web technologies in Enterprise 2...
Semantic Enterprise 2.0 - Enabling Semantic Web technologies in Enterprise 2...
 

Similar to Self-service Linked Government Data

Slims arindam presentaion
Slims arindam presentaionSlims arindam presentaion
Slims arindam presentaionArindam Halder
 
Manfred Linking the Real World
Manfred Linking the Real WorldManfred Linking the Real World
Manfred Linking the Real Worldsssw2012
 
Linked Open Government Data
Linked Open Government DataLinked Open Government Data
Linked Open Government DataDerilinx
 
Building Optimisation using Scenario Modeling and Linked Data
Building Optimisation using Scenario Modeling and Linked DataBuilding Optimisation using Scenario Modeling and Linked Data
Building Optimisation using Scenario Modeling and Linked DataEdward Curry
 
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic WebMulti-Source Provenance-Aware User Interest Profiling on the Social Semantic Web
Multi-Source Provenance-Aware User Interest Profiling on the Social Semantic WebFabrizio Orlandi
 
Aggregated, Interoperable and Multi-Domain User Profiles for the Social Web
Aggregated, Interoperable and Multi-Domain User Profiles for the Social WebAggregated, Interoperable and Multi-Domain User Profiles for the Social Web
Aggregated, Interoperable and Multi-Domain User Profiles for the Social WebFabrizio Orlandi
 
Linked Data: opportunities and challenges
Linked Data: opportunities and challengesLinked Data: opportunities and challenges
Linked Data: opportunities and challengesMichael Hausenblas
 
Linked Data lifecycle
Linked Data lifecycleLinked Data lifecycle
Linked Data lifecycleFadi Maali
 
VoID: Metadata for RDF Datasets
VoID: Metadata for RDF DatasetsVoID: Metadata for RDF Datasets
VoID: Metadata for RDF DatasetsRichard Cyganiak
 
Wikipedia (DBpedia): Crowdsourced Data Curation
Wikipedia (DBpedia): Crowdsourced Data CurationWikipedia (DBpedia): Crowdsourced Data Curation
Wikipedia (DBpedia): Crowdsourced Data CurationEdward Curry
 
Data Curation at the New York Times
Data Curation at the New York TimesData Curation at the New York Times
Data Curation at the New York TimesEdward Curry
 
Open Data - Where can it take us?
Open Data - Where can it take us? Open Data - Where can it take us?
Open Data - Where can it take us? Derilinx
 
Hello Open World - Semtech 2009
Hello Open World - Semtech 2009Hello Open World - Semtech 2009
Hello Open World - Semtech 2009Alexandre Passant
 
Annotating Microblog Posts with Sensor Data for Emergency Reporting Applications
Annotating Microblog Posts with Sensor Data for Emergency Reporting ApplicationsAnnotating Microblog Posts with Sensor Data for Emergency Reporting Applications
Annotating Microblog Posts with Sensor Data for Emergency Reporting ApplicationsDavid Crowley
 
DERI Overview - March 2011
DERI Overview - March 2011DERI Overview - March 2011
DERI Overview - March 2011mellotte
 
Interlinking Personal Semantic Data on the Semantic Desktop and the Web of Data
Interlinking Personal Semantic Data on the Semantic Desktop and the Web of DataInterlinking Personal Semantic Data on the Semantic Desktop and the Web of Data
Interlinking Personal Semantic Data on the Semantic Desktop and the Web of DataLaura Dragan
 
AAAI 2012 at Standord
AAAI 2012 at StandordAAAI 2012 at Standord
AAAI 2012 at StandordTed Vickey
 

Self-service Linked Government Data

  • 1. Digital Enterprise Research Institute www.deri.ie Self-service Linked Government Data Fadi Maali, Richard Cyganiak, Vassilios Peristeras firstname.lastname@deri.org Copyright 2011 Digital Enterprise Research Institute. All rights reserved. Enabling networked knowledge
  • 2. data.gov.uk
  • 3. data.gov.uk
  • 4. data.gov
  • 5. data.gov: 4997 datasets; 2590 in CSV; 272 in RDF
  • 6. Why Linked Government Data (LGD)? Web accessible; Interlinkable; Decentralised publishing of data; Standardised
  • 7. LGD: We need government data as Linked Data, not just raw data... and of good quality!
  • 8. LGD is Costly: We want governments to provide Linked Data, not just raw data... and of good quality. http://code.google.com/p/google-refine/
  • 9. Self-service Approach: DIY
  • 10. Self-service Approach: DIY. Provide tools, models and algorithms that enable the self-service approach (a publishing pipeline)
  • 11. Publishing pipeline requirements: Interactive approach; Graphical user interface; Reproducibility and traceability; Flexibility; Decentralisation; Results sharing
  • 13. Google Refine: Powerful data editing, transformation and enriching capabilities; Import capabilities, e.g. JSON, Excel, CSV, TSV, XML; Persistent undo/redo history; Popular in the open data community; Extensible and under active development; Free and open source. http://code.google.com/p/google-refine/
  • 14. DIY Recipe (1000 feet view): (1) Publishers provide RDF representation of their catalogues; (2) Tool support to select datasets of interest and put them into RDF; (3) User shares the RDF data
  • 15. DIY Recipe (100 feet view): (1) Publishers provide RDF representation of their catalogues: dcat; (2) Tool support to select datasets of interest and put them into RDF; (3) User shares the RDF data
  • 16. DIY Recipe (100 feet view): (1) Publishers provide RDF representation of their catalogues: dcat; (2) Tool support to select datasets of interest and put them into RDF: Google Refine + RDF export extension + RDF reconciliation extension; (3) User shares the RDF data
  • 17. DIY Recipe (100 feet view): (1) Publishers provide RDF representation of their catalogues: dcat; (2) Tool support to select datasets of interest and put them into RDF: Google Refine + RDF export extension + RDF reconciliation extension; (3) User shares the RDF data: share RDF data publicly (on CKAN.net) along with a sufficient provenance description
  • 18. A Walk-through (1/5)
  • 19. A Walk-through (2/5)
  • 20. A Walk-through (3/5)
  • 21. A Walk-through (4/5)
  • 22. A Walk-through (5/5)
  • 23. Data on CKAN.net
  • 24. Data Provenance (simplified): [diagram; nodes :dataset, :csv-ds, :export-process, :json-history; properties dct:source, :wasExportedBy, :operations, :usedData]
  • 25. DIY Recipe (10 feet view): Dcat. An RDF vocabulary to describe government catalogues; Current status: First Public Working Draft by the W3C GLD Working Group, http://www.w3.org/TR/vocab-dcat/; Used on data.gov.uk (RDFa) and CKAN-based catalogues. Reference: “Enabling Interoperability of Government Data Catalogues.” EGOV 2010
  • 26. DIY Recipe (10 feet view): RDF Mapping
  • 27. More on RDF Mapping: RDF-centric mapping; Multiple tree structure; Expression language for custom expressions; Vocabularies/ontologies support
  • 28. DIY Recipe (10 feet view): Interlinking. [diagram; components: Google Refine, RDF Reconcile Extension, Silk Server, Silk LSL, Crafted RDF, SPARQL endpoint, SPARQL endpoint with fulltext extension]
  • 29. More on Interlinking: Interlinking as a pre-RDF-creation step, so fewer unnecessary owl:sameAs links; Focus on the interface; Semi-automatic process with good user support. Reference: “Re-using Cool URIs: Entity Reconciliation Against LOD Hubs.” LDOW 2011
  • 30. DIY Recipe (10 feet view): Sharing. Captures the operations applied to the data; Represents them according to the Open Provenance Model Vocabulary (OPMV); Shares the data and its provenance on CKAN.net. CKAN Extension for Google Refine: http://lab.linkeddata.deri.ie/2011/grefine-ckan/
  • 31. Case study - Fingal Catalogue: Number of datasets: 74 (68 available in CSV and 56 in XML); Top publishers: Fingal County Council (41), Central Statistics Office (17), Department of Education and Science (4); Top domains: Demographics (18), Citizen Participation (18), Education (9). http://data.fingal.ie
  • 32. Case study - Fingal Catalogue: The catalogue was represented in Dcat; 60 datasets were converted to RDF using the publishing pipeline (~300K triples); Data Cube was used for statistical data; URIs were used consistently and shared among datasets, so the data was interlinked; Externally linked to DBpedia
  • 33. Open Issues: Evaluating/refining the crowd-sourcing aspects of the RDF creation process; RDF modeling: can we assist RDF modeling by examining the raw data?
  • 34. Lessons Learned: Interactive approach; Focus on plumbing tools together but don’t enforce a rigid process; Make it easy to adopt best practices and good recipes
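
The core step of the pipeline described in the slides, turning a raw CSV dataset into RDF and recording where it came from, can be illustrated with a small sketch. This is not the Google Refine RDF extension or the authors' implementation; it is a minimal standard-library Python example under stated assumptions. The `http://example.org/demo/` base URI, the sample CSV, the `item/` URI pattern and the function names are all hypothetical. Only the dcat and Dublin Core Terms namespace URIs are real, and linking the dataset to its source file with `dct:source` is an assumed reading of the provenance slide.

```python
import csv
import io

# Hypothetical base URI and sample data, for illustration only.
BASE = "http://example.org/demo/"
SAMPLE_CSV = "id,name\n1,Alpha\n2,Beta\n"

# Real vocabulary terms (RDF, RDFS, dcat, Dublin Core Terms).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
DCT_SOURCE = "http://purl.org/dc/terms/source"


def uri(value):
    """Format an absolute URI as an N-Triples term."""
    return f"<{value}>"


def literal(value):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def csv_to_ntriples(csv_text, dataset_uri, source_uri):
    """Map each CSV row to one labelled resource and describe the dataset.

    The first two triples describe the dataset itself: its type
    (dcat:Dataset) and the raw file it was derived from (dct:source).
    """
    lines = [
        f"{uri(dataset_uri)} {uri(RDF_TYPE)} {uri(DCAT_DATASET)} .",
        f"{uri(dataset_uri)} {uri(DCT_SOURCE)} {uri(source_uri)} .",
    ]
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Hypothetical URI pattern: one resource per row, keyed by "id".
        subject = uri(f"{BASE}item/{row['id']}")
        lines.append(f"{subject} {uri(RDFS_LABEL)} {literal(row['name'])} .")
    return lines


if __name__ == "__main__":
    for triple in csv_to_ntriples(SAMPLE_CSV, BASE + "dataset", BASE + "source.csv"):
        print(triple)
```

Minting row URIs from a stable key column, rather than from row position, is one way to keep URIs consistent when the same entity appears in several datasets, which is the property the case study relies on for interlinking.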