The necessity of metadata for linked open data and
        its contribution to policy analyses

     Anneke Zuiderwijk*, Keith Jeffery**, Marijn Janssen*

    *Delft University of Technology, The Netherlands
    **Science and Technology Facilities Council, United Kingdom




                         CEDEM 2012, May 3-4
Open governmental data

0 "We are sending a strong signal to administrations today. Your
  data is worth more if you give it away. So start releasing it
  now." (December 12, 2011)

(European Commission Vice President Neelie Kroes, Digital Agenda:
Turning government data into gold)

0 One of many examples showing that open governmental
  data have gained considerable attention recently




The ENGAGE project

0 ENGAGE (FP7): An Infrastructure for Open, Linked
  Governmental Data Provision towards Research Communities
  and Citizens (http://www.engage-project.eu)

0 Main goal: the development and use of a data infrastructure,
  incorporating distributed and diverse public sector information
  (PSI) resources.

0 The ENGAGE platform will enable researchers and citizens to:
    0 Discover and browse datasets across diverse and dispersed public
      sector information resources (local, national and European) in their
      own language
    0 Download the datasets
    0 Perform geospatial search of datasets
    0 Visualize properly structured datasets in data tables, maps and charts
Open governmental data

0 Open governmental data can be defined as “all stored data of
  the public sector which could be made accessible by
  government in the public interest without any restrictions on
  usage and distribution” (Geiger & Von Lucke, 2011, p. 185).

0 For example, public sector data can be:
    0   Geographic data (e.g. cadastral information)
    0   Legal data (e.g. court decisions, legislation)
    0   Meteorological data (e.g. climate data, weather forecasts)
    0   Social data (e.g. population, public administration)
    0   Transport data (e.g. traffic congestion, work on roads)
    0   Business data (e.g. chamber of commerce, patents) (MEPSIR study,
        Dekkers et al., 2006)



Linked open data (LOD)

0 Focus on turning public sector data into LOD

1. Public body produces data (and metadata)
2. Data become available on the Web of Data / Semantic Web
3. Open data can be reused
4. Open data can be linked to other data to show relationships
5. Data are both open and linked: Linked Open Data (LOD)

Figure 1: Process for creating Linked Open Data
(stages shown: public sector (policy) data and metadata; publication on the
Semantic Web; reusing open data; linking data; linked open data)
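
Steps 4 and 5 of the process can be illustrated with a minimal sketch. It assumes each dataset is published as subject-predicate-object statements and that two datasets refer to the same entity with the same identifier. The `ex:` identifiers, predicate names and values are invented placeholders, not taken from any real dataset.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Two hypothetical open datasets from different public bodies.
# All identifiers and values are invented placeholders.
population_data: List[Triple] = [
    ("ex:municipality/A", "ex:population", "1000"),
    ("ex:municipality/B", "ex:population", "2000"),
]
transport_data: List[Triple] = [
    ("ex:municipality/A", "ex:roadWorks", "3"),
]


def link(*datasets: List[Triple]) -> Dict[str, Dict[str, str]]:
    """Group statements from several datasets by their shared subject identifier."""
    linked: Dict[str, Dict[str, str]] = {}
    for dataset in datasets:
        for subject, predicate, obj in dataset:
            linked.setdefault(subject, {})[predicate] = obj
    return linked


linked = link(population_data, transport_data)
print(linked["ex:municipality/A"])
```

Because both datasets use the same identifier for municipality A, its statements end up together. Municipality B appears in one dataset only, so nothing is linked to it.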




Metadata

0 Metadata are part of the LOD-process
0 Metadata are needed to make sense of the open data (Berners-
  Lee, 2009)

0 Metadata are defined as “structured information that
  describes, explains, locates, or otherwise makes it easier to
  retrieve, use, or manage an information resource.” (National
  Information Standards Organization, 2004, p. 1).

0 Metadata provision in the ideal situation:
    0 Discovery metadata, e.g. identifier, title, creator, keywords.
    0 Contextual metadata, e.g. organizations, projects, funding.
    0 Detailed metadata, e.g. quality and domain specific parameters.


Why metadata are necessary in analyzing LOD

0 Metadata for LOD can be useful in the following situations.
Metadata:
0   create order within datasets;
0   improve the storage and preservation of LOD;
0   make LOD easier to find;
0   improve the accessibility of LOD;
0   may make it possible to assess and rank the quality of LOD;
0   make it easier to analyze, compare and reproduce LOD, and therefore
    to find inconsistencies in LOD;
0   improve the chances of a correct interpretation of LOD;
0   improve the possibilities to find patterns in LOD to generate new
    hypotheses;
0   may improve the visualization of LOD;
0   make it easier to link data;
0   avoid unnecessary duplication of LOD.

Problem statement

0 Discrepancies between the benefits that are described in
  literature and the benefits that are obtained in reality

0 Current situation is a long way from the ideal situation:
    0 there are usually few, and insufficient, ways of managing metadata
      and interpreting LOD (for instance Hernández-Pérez et al., 2009;
      Schuurman et al., 2008; Xiong et al., 2011);
    0 adding metadata is often viewed as an additional activity that only
      consumes resources.


0 Statements:
   0 Merely linking data is not enough to make use of open data
   0 Metadata are key enablers for the effective use of LOD in
      policy-making

Requirements for a metadata architecture

0 The metadata should:
   0 be easily discovered;
   0 interconvert common metadata formats used in PSI;
   0 provide a LOD representation of the metadata for browsing
      or query;
   0 maintain the capabilities of conventional information
      systems with structured query including convenient
      primitive operations.




Outline architecture
0 The requirements lead to the following architecture:



[Figure 2 components: portal server (PORTAL); METADATA; application server
(RUNNING SOFTWARE APPLICATION); PSI dataset servers (three PSI DATASETs).]

Figure 2: An architecture of a portal server for the provision of metadata.




Metadata
0 Metadata should be used to implement this architecture

A 3-layer structure for metadata is used:
a) discovery (flat) metadata; for example:
    0   Dublin Core (DC);
    0   e-Government Metadata Standard (e-GMS);
    0   Comprehensive Knowledge Archive Network (CKAN);
    0   or similar ‘flat’ metadata
b) contextual metadata; uses the Common European Research
   Information Format (CERIF);
c) detailed metadata.
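
As a rough illustration of the 3-layer structure, the sketch below models one dataset description with a discovery, a contextual and a detailed layer. The field names follow the examples given on the earlier Metadata slide (identifier, title, creator, keywords; organizations, projects, funding; quality and domain-specific parameters). The class names, the `matches_keyword` helper and all values are hypothetical, and the contextual layer is a simplified stand-in rather than the actual CERIF data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DiscoveryMetadata:
    """Layer a): flat metadata used to find a dataset."""
    identifier: str
    title: str
    creator: str
    keywords: List[str] = field(default_factory=list)


@dataclass
class ContextualMetadata:
    """Layer b): simplified stand-in for the contextual layer (not real CERIF)."""
    organizations: List[str] = field(default_factory=list)
    projects: List[str] = field(default_factory=list)
    funding: List[str] = field(default_factory=list)


@dataclass
class DatasetMetadata:
    """All three layers for one dataset; layer c) is a free-form mapping."""
    discovery: DiscoveryMetadata
    context: ContextualMetadata
    detail: Dict[str, str] = field(default_factory=dict)


def matches_keyword(record: DatasetMetadata, keyword: str) -> bool:
    """Case-insensitive keyword match that uses only the discovery layer."""
    return keyword.lower() in (k.lower() for k in record.discovery.keywords)


# Placeholder values for illustration only.
record = DatasetMetadata(
    discovery=DiscoveryMetadata(
        identifier="dataset-001",
        title="Example cadastral dataset",
        creator="Example Agency",
        keywords=["cadastre", "geographic"],
    ),
    context=ContextualMetadata(organizations=["Example Agency"]),
    detail={"quality": "unverified", "coordinate_system": "placeholder"},
)

print(matches_keyword(record, "Cadastre"))
```

Keeping the layers separate means a search needs only the small discovery layer, while interpretation can draw on the contextual and detailed layers.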




The Vision: Metadata for Data Model


[Figure: three metadata layers: DISCOVERY (DC, eGMS…), CONTEXT (CERIF) and
DETAIL (subject or topic specific). Side labels: "Linked open data" and
"Formal Information Systems". Arrow labels: "Generate" and "Point to".]
Design
The presented structure offers the following improvements:

0 CERIF provides much richer metadata than the standards
  commonly used with PSI datasets.

0 The representation of contextual metadata (CERIF) allows rich
  semantics to be represented thus making the PSI datasets
  understandable to the end user (or software) through the
  metadata.

0 The Structured Query Language (SQL) has a simpler structure
  than SPARQL and includes convenient primitive operations for
  simple statistical calculations such as sum, count, average.
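
The primitive operations mentioned above can be sketched with Python's built-in `sqlite3` module. The `observations` table, its columns and its values are invented for illustration and do not come from any PSI dataset.

```python
import sqlite3

# In-memory database holding a hypothetical table of dataset values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (region TEXT, value REAL)")
conn.executemany(
    "INSERT INTO observations (region, value) VALUES (?, ?)",
    [("north", 10.0), ("north", 20.0), ("south", 30.0)],
)

# Sum, count and average in a single structured query.
total, count, average = conn.execute(
    "SELECT SUM(value), COUNT(*), AVG(value) FROM observations"
).fetchone()

# The same primitives per group.
per_region = conn.execute(
    "SELECT region, SUM(value), COUNT(*), AVG(value) "
    "FROM observations GROUP BY region ORDER BY region"
).fetchall()

print(total, count, average)
print(per_region)
conn.close()
```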



Benefits of architecture

0 Because CERIF combines expressive semantics with a formal
  syntax, we can:
   0 generate discovery metadata from CERIF;
   0 interconvert common metadata formats used in PSI using CERIF as the
     superset exchange mechanism;
   0 provide a semantic web / LOD representation of the metadata for
     browsing or query using SPARQL;
   0 while maintaining a conventional information systems capability with
     structured query including convenient primitive operations.
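
The first two benefits can be sketched as follows. A richer record acts as the superset; a flat format is generated by projecting it through a field mapping, and one flat format is converted to another by going through the superset. The keys of `rich_record` are illustrative only and are not the actual CERIF schema. The target field names are loosely modeled on Dublin Core elements and on catalogue-style fields, and the helper functions are hypothetical.

```python
from typing import Any, Dict

# Hypothetical rich record standing in for the superset (not real CERIF).
rich_record: Dict[str, Any] = {
    "id": "dataset-001",
    "title": "Example dataset",
    "responsible_organization": "Example Agency",
    "project": "Example Project",
    "funding": "Example Programme",
    "keywords": ["example"],
}

# Each mapping: flat-format field name -> key in the superset record.
TO_DC = {
    "identifier": "id",
    "title": "title",
    "creator": "responsible_organization",
    "subject": "keywords",
}
TO_CATALOG = {
    "name": "id",
    "title": "title",
    "author": "responsible_organization",
    "tags": "keywords",
}


def generate_flat(rich: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Project the superset record onto a flat discovery format."""
    return {name: rich[key] for name, key in mapping.items() if key in rich}


def lift(flat: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Map a flat record back to superset keys (contextual fields stay absent)."""
    return {key: flat[name] for name, key in mapping.items() if name in flat}


def interconvert(flat: Dict[str, Any], source: Dict[str, str],
                 target: Dict[str, str]) -> Dict[str, Any]:
    """Convert between two flat formats via the superset representation."""
    return generate_flat(lift(flat, source), target)


dc_record = generate_flat(rich_record, TO_DC)
catalog_record = interconvert(dc_record, TO_DC, TO_CATALOG)
print(dc_record)
print(catalog_record)
```

Generation in this sketch is lossy in one direction only: the flat records drop the project and funding context, which is why the richer record is the natural exchange point.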




Models for an infrastructure


0 The data model with its metadata described is only one
  relevant model

0 The other models are:
    0 User model
    0 Processing model
    0 Resource model
The Vision: The Models

[Figure: the User Model, Processing Model, Data Model and Resource Model sit
between the complete cohort of users and the complete ICT environment for PSI.]
Models – User model


0 User Model: controls the way in which the end-user interacts
  with the e-infrastructure.
    0 User profile, security certification, privacy;
    0 Device and interaction mode preferences (keyboard/mouse through
      voice and gesture to brain-connected), language preference;
    0 Resource preferences (including contacts) with directories;


0 METADATA
Models – Processing model


0 Processing Model controls the way processes are
  constructed and executed in the e-infrastructure
   0 Services
      0 Described for discovery, described for functional and non-functional
        (security, privacy, performance) properties
      0 Mobile (deployed in distributed / parallel execution environments)
      0 Open source where possible
   0 Service composition
      0 Dynamically (re-) composable during execution
0 METADATA
Models – Data model


0 Data Model controls data representation and data (re-)use
   0 Formal syntax (structure)
      0 Even for text, images, streamed video
   0 Declared semantics (meaning)


0 METADATA
Models – Resource model


0 Resource Model catalogs the available computing
  resources in the e-infrastructure
   0 This allows virtualisation so the user neither knows nor cares from
     where the data comes, or where the processing is done, as long as
     quality of service is maintained;
   0 Requires updating by resource owners – together with conditions of
     use
0 METADATA
Conclusions (1)

0 Metadata are needed to make sense of the open data
0 Merely linking data is not enough to make optimal use of open
  data
0 Metadata are key enablers for policy-making
0 Adding metadata can yield considerable benefits, including:
    0 creating order in datasets
    0 improving findability, accessibility, storage and preservation of LOD
    0 making it easier to analyze, compare and reproduce LOD and to find
      inconsistencies
    0 correct interpretation and visualization of LOD
    0 finding patterns in LOD to generate new hypotheses
    0 making linking of data easier
    0 assessing and ranking the quality of LOD and avoiding unnecessary
      duplication of LOD


Conclusions (2)

0 Architecture for metadata:
   0 discovery metadata can be generated from CERIF
   0 common metadata formats can use CERIF as the superset exchange
     mechanism
   0 a LOD representation of the metadata for browsing or query can be
     made allowing the use of SPARQL
   0 while a conventional information systems capability with structured
     query including convenient primitive operations can be maintained
0 We recommend further implementation of the proposed metadata
  architecture





Arihant handbook biology for class 11 .pdfchloefrazer622
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 

Recently uploaded (20)

Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 

The necessity of metadata for linked open data and its contribution to policy analyses #CeDEM12

  • 1. The necessity of metadata for linked open data and its contribution to policy analyses Anneke Zuiderwijk*, Keith Jeffery**, Marijn Janssen* *Delft University of Technology, The Netherlands **Science and Technology Facilities Council, United Kingdom CEDEM 2012, May 3-4
  • 2. Open governmental data
      - "We are sending a strong signal to administrations today. Your data is worth more if you give it away. So start releasing it now." (December 12, 2011; European Commission Vice President Neelie Kroes, Digital Agenda: Turning government data into gold)
      - One of many examples showing that open governmental data have gained considerable attention recently
  • 3. The ENGAGE project
      - ENGAGE (FP7): An Infrastructure for Open, Linked Governmental Data Provision towards Research Communities and Citizens (http://www.engage-project.eu)
      - Main goal: the development and use of a data infrastructure, incorporating distributed and diverse public sector information (PSI) resources.
      - The ENGAGE platform will enable researchers and citizens to:
          - Discover and browse datasets across diverse and dispersed public sector information resources (local, national and European) in their own language
          - Download the datasets
          - Perform geospatial search of datasets
          - Visualize properly structured datasets in data tables, maps and charts
  • 4. Open governmental data
      - Open governmental data can be defined as “all stored data of the public sector which could be made accessible by government in the public interest without any restrictions on usage and distribution” (Geiger & Von Lucke, 2011, p. 185).
      - For example, public sector data can be (MEPSIR study, Dekkers et al., 2006):
          - Geographic data (e.g. cadastral information)
          - Legal data (e.g. court decisions, legislation)
          - Meteorological data (e.g. climate data, weather forecasts)
          - Social data (e.g. population, public administration)
          - Transport data (e.g. traffic congestion, work on roads)
          - Business data (e.g. chamber of commerce, patents)
  • 5. Linked open data (LOD)
      - Focus on turning public sector data into LOD:
          1. Public body produces data (and metadata)
          2. Data become available on the Web of Data / Semantic Web
          3. Open data can be reused
          4. Open data can be linked to other data → show relationships
          5. Data are both open and linked → Linked Open Data (LOD)
      - [Figure 1: Process for creating Linked Open Data. Flow: public sector (policy) → (1) data and metadata → (2) publication on the Semantic Web → (3) reusing open data → (4) linking data → (5) linked open data]
  • 6. Metadata
      - Metadata are part of the LOD process
      - Metadata are needed to make sense of the open data (Berners-Lee, 2009)
      - Metadata are defined as “structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.” (National Information Standards Organization, 2004, p. 1).
      - Metadata provision in the ideal situation:
          - Discovery metadata, e.g. identifier, title, creator, keywords.
          - Contextual metadata, e.g. organizations, projects, funding.
          - Detailed metadata, e.g. quality and domain-specific parameters.
  • 7. Why metadata are necessary in analyzing LOD
      - Metadata for LOD can be useful in the following situations. Metadata:
          - create order within datasets;
          - improve the storage and preservation of LOD;
          - make LOD easier to find;
          - improve the accessibility of LOD;
          - may make it possible to assess and rank the quality of LOD;
          - make it easier to analyze, compare and reproduce LOD, and therefore to find inconsistencies in LOD;
          - improve the chances of a correct interpretation of LOD;
          - improve the possibilities of finding patterns in LOD to generate new hypotheses;
          - may improve the visualization of LOD;
          - make it easier to link data;
          - avoid unnecessary duplication of LOD.
  • 8. Problem statement
      - Discrepancies exist between the benefits described in the literature and the benefits obtained in reality
      - The current situation is a long way from the ideal situation:
          - there are usually few and insufficient ways of managing metadata and interpreting LOD (for instance Hernández-Pérez et al., 2009; Schuurman et al., 2008; Xiong et al., 2011);
          - adding metadata is often viewed as an additional activity that only consumes resources.
      - Statements:
          - Merely linking data is not enough to make use of open data
          - Metadata are key enablers for the effective use of LOD in policy-making
  • 9. Requirements for a metadata architecture
      - The metadata should:
          - be easily discovered;
          - interconvert common metadata formats used in PSI;
          - provide a LOD representation of the metadata for browsing or query;
          - maintain the capabilities of conventional information systems with structured query including convenient primitive operations.
  • 10. Outline architecture
      - The requirements lead to the following architecture:
      - [Figure 2: An architecture of a portal server for the provision of metadata. Diagram labels: portal server (portal, metadata); application server (running software application); PSI dataset servers (PSI datasets)]
  • 11. Metadata
      - Metadata should be used to implement this architecture. A 3-layer structure for metadata is used:
          a) discovery (flat) metadata, for example:
              - Dublin Core (DC);
              - e-Government Metadata Standard (e-GMS);
              - Comprehensive Knowledge Archive Network (CKAN);
              - or similar ‘flat’ metadata;
          b) contextual metadata, which uses the Common European Research Information Format (CERIF);
          c) detailed metadata.
  • 12. The Vision: Metadata for Data Model
      - [Diagram labels: DISCOVERY (DC, eGMS…), linked open data; CONTEXT (CERIF), formal information systems; DETAIL (subject or topic specific); arrows labelled "Generate" and "Point to"]
  • 13. Design
      - The presented structure provides the following improved facilities:
          - CERIF provides much richer metadata than the standards commonly used with PSI datasets.
          - The representation of contextual metadata (CERIF) allows rich semantics to be represented, thus making the PSI datasets understandable to the end user (or software) through the metadata.
          - The Structured Query Language (SQL) has a simpler structure than SPARQL and includes convenient primitive operations for simple statistical calculations such as sum, count and average.
  • 14. Benefits of architecture
      - Because of the powerful expressive semantics over formal syntax of CERIF we can:
          - Generate discovery metadata from CERIF;
          - Interconvert common metadata formats used in PSI using CERIF as the superset exchange mechanism;
          - Provide a semantic web / LOD representation of the metadata for browsing or query using SPARQL;
          - While maintaining a conventional information systems capability with structured query including convenient primitive operations.
  • 15. Models for an infrastructure
      - The data model with its metadata described is only one relevant model
      - The other models are:
          - User model
          - Processing model
          - Resource model
  • 16. The Vision: The Models
      - [Diagram labels: User Model, Processing Model, Data Model, Resource Model; complete cohort of users; complete ICT environment for PSI]
  • 17. Models – User model
      - User Model: controls the way in which the end-user interacts with the e-infrastructure.
          - User profile, security certification, privacy;
          - Device and interaction mode preferences (keyboard/mouse through voice and gesture to brain-connected), language preference;
          - Resource preferences (including contacts) with directories;
          - METADATA
  • 18. Models – Processing model
      - Processing Model: controls the way processes are constructed and executed in the e-infrastructure
          - Services
              - Described for discovery, described for functional and non-functional (security, privacy, performance) properties
              - Mobile (deployed in distributed / parallel execution environments)
              - Open source where possible
          - Service composition
              - Dynamically (re-)composable during execution
          - METADATA
  • 19. Models – Data model
      - Data Model: controls data representation and data (re-)use
          - Formal syntax (structure)
              - Even for text, images, streamed video
          - Declared semantics (meaning)
          - METADATA
  • 20. Models – Resource model
      - Resource Model: catalogs the available computing resources in the e-infrastructure
          - This allows virtualisation, so the user neither knows nor cares where the data come from or where the processing is done, as long as quality of service is maintained;
          - Requires updating by resource owners, together with conditions of use
          - METADATA
  • 21. Conclusions (1)
      - Metadata are needed to make sense of the open data
      - Merely linking data is not enough to make optimal use of open data
      - Metadata are key enablers for policy-making
      - Adding metadata can yield considerable benefits, including:
          - creating order in datasets
          - improving findability, accessibility, storage and preservation of LOD
          - making it easier to analyze, compare and reproduce LOD and to find inconsistencies
          - correct interpretation and visualization of LOD
          - finding patterns in LOD to generate new hypotheses
          - making linking of data easier
          - assessing and ranking the quality of LOD and avoiding unnecessary duplication of LOD
  • 22. Conclusions (2)
      - Architecture for metadata:
          - discovery metadata can be generated from CERIF
          - common metadata formats can use CERIF as the superset exchange mechanism
          - a LOD representation of the metadata for browsing or query can be made, allowing the use of SPARQL
          - while a conventional information systems capability with structured query including convenient primitive operations can be maintained
      - We recommend further implementation of the proposed metadata architecture
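
The three-layer metadata structure and the idea of generating flat discovery metadata from the richer contextual layer can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the field names (taken loosely from the examples on the slides) and the sample values are hypothetical, and the contextual record is only CERIF-like, not the actual CERIF schema. The output keys follow Dublin Core element names (identifier, title, creator, subject).

```python
from dataclasses import dataclass, field


@dataclass
class DetailedMetadata:
    """Detailed layer: quality and domain-specific parameters (hypothetical fields)."""
    quality: dict = field(default_factory=dict)
    domain_parameters: dict = field(default_factory=dict)


@dataclass
class ContextualMetadata:
    """Contextual layer: a CERIF-like record, NOT the real CERIF schema."""
    identifier: str
    title: str
    creator: str
    keywords: list
    organizations: list
    projects: list
    funding: list
    detail: DetailedMetadata  # pointer to the detailed layer


def to_discovery(ctx: ContextualMetadata) -> dict:
    """Derive a flat, Dublin Core-style discovery record from the contextual layer."""
    return {
        "identifier": ctx.identifier,
        "title": ctx.title,
        "creator": ctx.creator,
        "subject": "; ".join(ctx.keywords),
    }


# Hypothetical sample record, invented for this sketch.
record = ContextualMetadata(
    identifier="example-dataset-001",
    title="Example crime statistics by postal code",
    creator="Example Municipality",
    keywords=["crime", "statistics"],
    organizations=["Example Municipality"],
    projects=["Example project"],
    funding=["Example funding programme"],
    detail=DetailedMetadata(quality={"accuracy": "unknown"}),
)

discovery = to_discovery(record)
print(discovery)
```

The design point is that the flat record is a projection of the richer one: information flows from the contextual layer to the discovery layer, while the detailed layer is only reachable through the pointer held by the contextual record.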

Editor's Notes

  1. Start with a citation of Neelie Kroes (December 12, 2011). This example shows that open data have gained considerable attention recently. Another example is the ENGAGE project.
  2. Framework Programme 7 shows the attention of the European Commission for open data. ENGAGE is part of FP7. Main goal. The paper that we present here stems from the ENGAGE project.
  3. What are open governmental data? Mention the definition of Geiger & Von Lucke. We adopt this definition as it excludes the publication of data which must remain confidential, are private or contain industrial secrets. Examples of open governmental data.
  4. Linking data provides us with the benefits of open data: value is obtained by linking and showing relationships. How are LOD created? A public body produces anonymised (non-personally identifiable) data during the course of its ordinary business. The produced data become freely available to everyone on the Web of Data, also referred to as the Semantic Web. The public sector data are then referred to as open data and can be used, reused and redistributed by everyone, without restrictions from copyright, patents or other mechanisms of control. One way of reusing open data is to link them to other data to show relationships with these other data. The Linked Data that are the outcome of this linking are defined as “a collection of interrelated datasets on the Web”. Data that are both open and linked, referred to as LOD, are data that meet the requirements of open data and that also show relationships among the open data, thus providing information, which may be defined as structured data in context. After PSI is converted into LOD, this creates interesting possibilities for analyzing the policies of public bodies. For example, take two datasets, one with demographic data and one with crime data: linking them on the basis of postal codes will show relationships between the demographic data and the crime data.
  5. We saw that publishing metadata is part of the LOD process. Metadata are needed to make sense of the open data. Metadata are data about the data. We define metadata as “structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.” In the ideal situation for LOD, different types of metadata are provided: discovery (flat) metadata (which are descriptive and navigational), contextual metadata (which are descriptive, restrictive and navigational) and detailed metadata (which cover schema metadata plus additional metadata to assure quality). These types of metadata describe, among other things, the following information about the LOD. Discovery (flat) metadata: identifier, title, creator, publisher, country, source, type, format, language, sector, subjects, keywords, relative information system, validity date (from – to), audience, legal framework, status, relevant resources and linked data sets. Contextual metadata: organizations, persons, projects, funding, facilities, equipment, services and pointers to detailed metadata. Detailed metadata: quality parameters (accuracy, precision, calibration and other parameters; Charalabidis, Ntanos, & Lampathaki, 2011) and domain- or dataset-specific parameters that are used by software accessing and processing the dataset.
  6. Benefits of the metadata according to the literature overview.
  7. There are discrepancies between the benefits that are described in the literature and the benefits that are obtained in reality. The current situation is insufficient. Statements.
  8. Based on the literature overview and two use cases we found that the basic capabilities that are created by adding metadata are as follows. The metadata should be easily discovered. The metadata should interconvert common metadata formats used in PSI. The metadata should provide a LOD representation of the metadata for browsing or query. The metadata should maintain the capabilities of conventional information systems with structured query including convenient primitive operations. To accomplish these capabilities we need discovery, contextual and detailed metadata.
  9. The challenge is to design an architecture to allow (a) end-user ‘citizen’ and ‘researcher’ access via a portal supported by metadata to PSI datasets for download; (b) access - utilising metadata – to those same datasets via a service from a running program on another system to utilise the information in another context. This leads naturally to an architecture sketched in Figure 2.
  10. Design:
      - CERIF provides much richer metadata than the standards commonly used with PSI datasets and so greatly improves the experience of the end user (or the software) in processing the PSI datasets described by the enhanced metadata.
      - The representation of contextual metadata (CERIF) allows rich semantics to be represented simply over a formal syntax, thus making the PSI datasets understandable to the end user (or software) through the enhanced metadata.
      - The Structured Query Language (SQL), usually presented to the end user through an easy-to-use Query By Example (QBE) interface, has a simpler structure than SPARQL and includes convenient primitive operations for simple statistical calculations such as sum, count and average.
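
The "convenient primitive operations" of SQL mentioned in the last note (sum, count, average) can be illustrated with a small, self-contained sketch that also links two datasets on a postal code, in the spirit of the demographic and crime example from the notes. It uses Python's standard `sqlite3` module with an in-memory database. The table names, column names and all values are hypothetical and invented for illustration; this is not the ENGAGE implementation.

```python
import sqlite3

# In-memory database with two hypothetical datasets that share a postal code.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demographics (postal_code TEXT PRIMARY KEY, population INTEGER);
    CREATE TABLE crime (postal_code TEXT, incidents INTEGER);
    INSERT INTO demographics VALUES ('1000', 5000), ('2000', 8000);
    INSERT INTO crime VALUES ('1000', 10), ('1000', 20), ('2000', 40);
""")

# Link the datasets on postal_code and apply SQL's built-in aggregates.
query = """
    SELECT d.postal_code,
           d.population,
           COUNT(c.incidents) AS reports,
           SUM(c.incidents)   AS total_incidents,
           AVG(c.incidents)   AS avg_incidents
    FROM demographics AS d
    JOIN crime AS c ON c.postal_code = d.postal_code
    GROUP BY d.postal_code, d.population
    ORDER BY d.postal_code
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)

conn.close()
```

Each result row combines a postal code's population with the number of crime reports, the total incidents and the average incidents per report, which is the kind of simple statistical calculation the note refers to.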