Using Web Data Provenance for Quality Assessment
Olaf Hartig (Humboldt-Universität zu Berlin)
Jun Zhao (University of Oxford)
Information Quality (IQ)
 ●   Common definition: fitness for use of information
 ●   Multidimensional concept
     Category*            Criteria / Dimensions
     Intrinsic            Accuracy, Believability, Objectivity, ...
     Contextual           Completeness, Relevance, Timeliness, ...
     Representational     Conciseness, Understandability, ...
     Accessibility        Availability, Security, ...
     *Classification by Wang and Strong, 1996

 ●   IQ criteria not independent of each other
 ●   Relevancy of criteria determined by task and preferences

Olaf Hartig - Using Web Data Provenance for Quality Assessment                                              2
IQ Assessment

 ●   Assigning numerical values (IQ scores) to IQ criteria
 ●   It is difficult!
     ●   Precision vs. Practicality



 ●   Manual methods
     ●   Questionnaires
 ●   Semi-automatic methods
     ●   Rating-based
     ●   Reputation-based



Automated IQ Assessment
 ●   Literature only outlines ideas for automatic methods
 ●   Content analysis
     ●   Comparison (e.g. outlier detection)
     ●   Application of information retrieval methods
     ●   Analysis of results from data cleansing
     ●   Sampling techniques
 ●   Context analysis
     ●   Analysis of metadata
     ●   Utilization of domain knowledge



Our Goal
 ●   Methods to automatically assess IQ criteria of Web data
 ●   Primary means: provenance of the assessed data




Outline



           1. Web Data Provenance

           2. General Assessment Approach

           3. Development of Assessment Methods




Existing Provenance Research
 ●   Main research areas: (scientific) workflows, DBMSs
 ●   General focus: data creation




Provenance of Web Data



 ●   Web data provenance comprises two dimensions: data creation and data access


Model of Web Data Provenance
 ●   Provenance graph describes provenance of a data item
     ●   Nodes: provenance elements – pieces of provenance info
     ●   Edges: relate provenance elements to each other
     ●   Subgraphs for related data items possible
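
A minimal Python sketch of such a graph, assuming a plain list of labeled edges. The class and method names (`ProvElement`, `ProvGraph`, `relate`, `related`) are illustrative and not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProvElement:
    """A node: one piece of provenance information."""
    name: str
    type: str  # e.g. "Data Item" or "Data Creation"


@dataclass
class ProvGraph:
    """Provenance graph of a data item: elements plus labeled edges."""
    elements: dict = field(default_factory=dict)  # name -> ProvElement
    edges: list = field(default_factory=list)     # (source, label, target)

    def add(self, name, type_):
        self.elements[name] = ProvElement(name, type_)
        return self.elements[name]

    def relate(self, source, label, target):
        # Both endpoints must already be part of the graph.
        if source not in self.elements or target not in self.elements:
            raise KeyError("unknown provenance element")
        self.edges.append((source, label, target))

    def related(self, source, label):
        """Names of all elements reachable from `source` via `label`."""
        return [t for s, l, t in self.edges if s == source and l == label]


g = ProvGraph()
g.add("item", "Data Item")
g.add("creation", "Data Creation")
g.relate("item", "created by", "creation")
print(g.related("item", "created by"))  # ['creation']
```

Keeping edges as labeled triples makes it easy to merge subgraphs for related data items, since merging is just a union of elements and edges.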




 ●   Provenance model defines:
     ●   Types of provenance elements: Actors, Executions, Artifacts
     ●   Relationships




Data Access Dimension
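 [Figure: provenance element types and relationships of the data access dimension]
 ●   Element types: Data Item, Document, Data Access (with Execution Time), Data Accessor (Non-Human), Data Providing Service (Non-Human), Service Provider, Data Publisher (Human), Relation to the provided Information Resource
 ●   Relationship labels: contains, retrieved by, performs, accessed, controls, uses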




Data Access Dimension cont.

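 [Figure: integrity verification in the data access dimension]
 ●   Element types: (Verified) Artifact, Integrity Verification (with Verification Result), Signature Verification, Signer, Relation to the signed Data, Signature Method
 ●   Constraint shown in the figure: {incomplete}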




Data Creation Dimension
 [Figure: provenance element types and relationships of the data creation dimension]
 ●   Element types: Data Item, (Encompassing) Data Item, Source Data, Data Creation (with Execution Time), Creation Guidelines, Data Creator (Human or Non-human), Provenance Information, Relation to the created Data
 ●   Kinds of Data Creator {complete,disjoint}: Data Creating Device (e.g. Sensor), Data Creating Service (e.g. Software Agent), Data Creating Entity (e.g. Person, Group, Orga.)
 ●   Relationship labels: part of, responsible for
A General Approach

 ●   Blueprint for actual assessment methods that
     ●   Address a specific scenario
     ●   Focus on a specific IQ criterion
 ●   Provenance elements have an influence on IQ
 ●   Impact values represent these influences
 ●   Assessment relies on knowing these influences
 ●   The IQ score is calculated with an assessment function that combines all impact values



General Assessment Procedure




 Step 1 – Generate a provenance graph for the data item

 Step 2 – Annotate the provenance graph with impact values

 Step 3 – Execute the assessment function
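
The three steps can be read as a small pipeline. The sketch below is only an illustration under stated assumptions: the function name `assess_iq`, the type aliases, and the toy stand-ins (including the averaging function) are hypothetical and not part of the slides.

```python
from typing import Any, Callable, Dict

Graph = Dict[str, Any]      # hypothetical: any provenance graph representation
Annotated = Dict[str, Any]  # the same graph plus impact values


def assess_iq(
    data_item: str,
    generate_graph: Callable[[str], Graph],
    annotate: Callable[[Graph], Annotated],
    assessment_function: Callable[[Annotated], float],
) -> float:
    graph = generate_graph(data_item)      # Step 1: provenance graph
    annotated = annotate(graph)            # Step 2: impact values
    return assessment_function(annotated)  # Step 3: IQ score


# Toy stand-ins for the three steps; averaging is an arbitrary choice
# made only to show how the pieces plug together.
score = assess_iq(
    "msr",
    generate_graph=lambda item: {"item": item},
    annotate=lambda g: {**g, "impact": [0.5, 1.0]},
    assessment_function=lambda a: sum(a["impact"]) / len(a["impact"]),
)
print(score)  # 0.75
```

Passing the steps in as functions mirrors the idea of a blueprint: a concrete method for a given scenario and IQ criterion supplies its own three callables.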




Designing Assessment Methods
 ●   Developing the general approach into an actual method
 ●   Fundamental design question:

     For which IQ criterion do we want to apply the method?





 ●   Timeliness: degree to which the data item is up-to-date with respect to the task at hand
 ●   Representation* as an absolute measure in [0,1]
     ●   1 – meeting the strictest timeliness standards
     ●   0 – unacceptable

 *Following Ballou et al., 1998
1 Generate the Provenance Graph

 What types of provenance elements are necessary?
     What level of detail (i.e. granularity) is necessary?



 Where and how do we get provenance information?
 ●   Two complementary options:
     ●   Recording
     ●   Analyzing metadata



 Example:
 ●   Sensors (e.g. sensor1) take a measurement (e.g. msr) every hour
 ●   All measurements are stored in a Web-accessible storage device (store)
 ●   Our system (sys) accesses them for further processing
 ●   sys assesses the timeliness of every msr




 Provenance graph for msr:
 ●   msr (type: Data Item) --created by--> cExc (type: Data Creation; Execution Time: 10:00)
 ●   cExc --performed by--> sensor1 (type: Data Creator)
 ●   msr --contained by--> doc (type: Document)
 ●   doc --retrieved by--> aExc (type: Data Access; Execution Time: 10:13)
 ●   aExc --performed by--> sys (type: Data Accessor)
 ●   aExc --accessed--> store (type: Data Providing Service)
2 Annotation with Impact Values

 How might each provenance element influence the IQ criterion?
 ●   Systematically analyze each type of provenance element

 What kind of impact values are necessary? How do we represent the influences by impact values?
 ●   Impact values not necessarily numerical
 ●   Depends on the assessment function in step 3

 How do we determine impact values?

Determining Impact Values
 ●   From the provenance information
 ●   From user input
     ●   Configuration options
     ●   Rating-based, Reputation-based
 ●   By content analysis
     ●   Comparison (e.g. outlier detection)
     ●   Adoption of information retrieval methods
     ●   Adoption of data cleansing techniques
 ●   By context analysis
     ●   Further metadata
     ●   Domain knowledge
2 Annotation with Impact Values





 Data Creation Dimension:

      Prov. Element Type          Impact Values
      Data Creation               creation time, weights
      Creation Guidelines         -
      (Source) Data Item          expiry time
      Data Creator                -
 Example: provenance graph annotated with impact values
 ●   msr (type: Data Item; expiry time: 11:00) --created by--> cExc
 ●   cExc (type: Data Creation; Execution Time: 10:00; creation time: 10:00) --performed by--> sensor1 (type: Data Creator)
 ●   msr --contained by--> doc (type: Document)
 ●   doc --retrieved by--> aExc (type: Data Access; Execution Time: 10:13)
 ●   aExc --performed by--> sys (type: Data Accessor)
 ●   aExc --accessed--> store (type: Data Providing Service)
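
A minimal sketch of the annotation step for the example graph, assuming the graph is held as a plain dictionary. The `annotate` function and the `expiry_times` parameter are hypothetical; the slides do not say where the expiry time comes from, so it is passed in as configuration here. The weights listed for Data Creation are left out.

```python
from datetime import time

# Provenance elements of the example, keyed by name.
graph = {
    "msr":     {"type": "Data Item"},
    "cExc":    {"type": "Data Creation", "execution_time": time(10, 0)},
    "sensor1": {"type": "Data Creator"},
    "doc":     {"type": "Document"},
    "aExc":    {"type": "Data Access", "execution_time": time(10, 13)},
    "sys":     {"type": "Data Accessor"},
    "store":   {"type": "Data Providing Service"},
}


def annotate(graph, expiry_times):
    """Return name -> impact values, following the table of element types."""
    impact = {}
    for name, element in graph.items():
        if element["type"] == "Data Creation":
            # Creation time is read from the provenance information itself.
            impact[name] = {"creation_time": element["execution_time"]}
        elif element["type"] == "Data Item" and name in expiry_times:
            # Assumption: expiry times are supplied by the user.
            impact[name] = {"expiry_time": expiry_times[name]}
    return impact


impact_values = annotate(graph, {"msr": time(11, 0)})
print(impact_values)
```

Element types without an entry in the table (here Data Creator, Document and the access-related elements) simply receive no impact values.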
3 Assessment Function

     How do we represent the IQ criterion by an IQ score?


                 What does the assessment function look like?
 ●   Develop the function together with the impact values
 ●   Take incompleteness into consideration
     ●   Provenance graphs could be fragmentary
     ●   Annotations could be missing
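
One possible convention for incompleteness, sketched under assumptions: report "unknown" instead of a score whenever a required impact value is missing. The helper `assess` and its return shape are hypothetical and are not the authors' answer to this question.

```python
from typing import Callable, Dict, List, Optional, Tuple


def assess(
    impact_values: Dict[str, Optional[float]],
    required: List[str],
    function: Callable[..., float],
) -> Tuple[Optional[float], List[str]]:
    """Apply `function` only if every required impact value is present.

    Returns (score, missing): score is None when something is missing,
    and `missing` names the absent impact values.
    """
    missing = [name for name in required if impact_values.get(name) is None]
    if missing:
        return None, missing
    args = {name: impact_values[name] for name in required}
    return function(**args), []


def product(a: float, b: float) -> float:
    # Arbitrary toy assessment function for the demonstration.
    return a * b


complete = assess({"a": 0.5, "b": 0.5}, ["a", "b"], product)
partial = assess({"a": 0.5}, ["a", "b"], product)
print(complete)  # (0.25, [])
print(partial)   # (None, ['b'])
```

Returning the list of missing values lets a caller decide whether to fall back to a default, ask the user, or skip the data item.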




Step 3 – Assessment Function




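 t(msr) = 1 - (10:15 - 10:00) / (11:00 - 10:00)
        = 1 - 0.25 h / 1 h
        = 0.75

 Here 10:00 is the creation time and 11:00 the expiry time of msr. The time 10:15 does not appear in the provenance graph (the access aExc ran at 10:13) and is presumably the time of the assessment.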
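
The calculation above can be reproduced in a few lines of Python. This is a sketch, not the authors' implementation: the function name `timeliness` is hypothetical, the date is arbitrary, and clamping the result to [0,1] is an added assumption so that expired items score 0.

```python
from datetime import datetime


def timeliness(creation, expiry, now):
    """1 - age / lifetime, clamped to [0, 1]."""
    age = now - creation
    lifetime = expiry - creation
    return max(0.0, min(1.0, 1 - age / lifetime))


day = dict(year=2009, month=1, day=1)  # arbitrary date; only the times matter
t = timeliness(
    creation=datetime(hour=10, minute=0, **day),
    expiry=datetime(hour=11, minute=0, **day),
    now=datetime(hour=10, minute=15, **day),
)
print(t)  # 0.75
```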

Conclusion
 ●   Web Data Provenance (data creation + data access)
 ●   General approach for provenance-based IQ assessment
     ●   Impact values: influence of provenance elements on IQ
 ●   Design decisions for actual assessment methods
 ●   Application to timeliness (more in the paper)



 ●   Future work:
     ●   How do we deal with incompleteness?
     ●   Application of the approach to other IQ criteria


These slides have been created by
                                            Olaf Hartig
                                                http://olafhartig.de

                              This work is licensed under a
                Creative Commons Attribution-Share Alike 3.0 License
                    (http://creativecommons.org/licenses/by-sa/3.0/)




                             Attribution:
                             ●   http://www.flickr.com/photos/rrrrred/3809362767/
                             ●   http://www.hasslefreeclipart.com




Olaf Hartig - Using Web Data Provenance for Quality Assessment                      36

More Related Content

What's hot

Augmented Analytics and Automation in the Age of the Data Scientist
Augmented Analytics and Automation in the Age of the Data ScientistAugmented Analytics and Automation in the Age of the Data Scientist
Augmented Analytics and Automation in the Age of the Data Scientist
WhereScape
 
From Information to Insight: Data Storytelling for Organizations
From Information to Insight: Data Storytelling for OrganizationsFrom Information to Insight: Data Storytelling for Organizations
From Information to Insight: Data Storytelling for Organizations
Thinking Machines
 
INSTAGRAM_DATA_ANALYSIS.PPT.pptx
INSTAGRAM_DATA_ANALYSIS.PPT.pptxINSTAGRAM_DATA_ANALYSIS.PPT.pptx
INSTAGRAM_DATA_ANALYSIS.PPT.pptx
surendrapushpupadhya
 
Building a Data-Driven Culture
Building a Data-Driven CultureBuilding a Data-Driven Culture
Building a Data-Driven Culture
Lucas Neo
 
Data Engineering and the Data Science Lifecycle
Data Engineering and the Data Science LifecycleData Engineering and the Data Science Lifecycle
Data Engineering and the Data Science Lifecycle
Adam Doyle
 
How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model
DATUM LLC
 
Impuls-Vortrag Data Strategy
Impuls-Vortrag Data StrategyImpuls-Vortrag Data Strategy
Impuls-Vortrag Data Strategy
Marco Geuer
 
Cornerstone of Future Growth - Ecosystems
Cornerstone of Future Growth - EcosystemsCornerstone of Future Growth - Ecosystems
Cornerstone of Future Growth - Ecosystems
Accenture Insurance
 
Presentation on Business Intelligence (BI)
Presentation on Business Intelligence (BI)Presentation on Business Intelligence (BI)
Presentation on Business Intelligence (BI)
AkashBorse2
 
Big Data Management: What's New, What's Different, and What You Need To Know
Big Data Management: What's New, What's Different, and What You Need To KnowBig Data Management: What's New, What's Different, and What You Need To Know
Big Data Management: What's New, What's Different, and What You Need To Know
SnapLogic
 
Master Data Management – Aligning Data, Process, and Governance
Master Data Management – Aligning Data, Process, and GovernanceMaster Data Management – Aligning Data, Process, and Governance
Master Data Management – Aligning Data, Process, and Governance
DATAVERSITY
 
Data Modeling is Data Governance
Data Modeling is Data GovernanceData Modeling is Data Governance
Data Modeling is Data Governance
DATAVERSITY
 
State of Data Governance in 2021
State of Data Governance in 2021State of Data Governance in 2021
State of Data Governance in 2021
DATAVERSITY
 
Data modelling 101
Data modelling 101Data modelling 101
Data modelling 101
Christopher Bradley
 
Developing a Data Strategy
Developing a Data StrategyDeveloping a Data Strategy
Developing a Data Strategy
Martha Horler
 
Data Science presentation for elementary school students
Data Science presentation for elementary school studentsData Science presentation for elementary school students
Data Science presentation for elementary school students
Melanie Manning, CFA
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
Knoldus Inc.
 
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...
DATAVERSITY
 
Modern Data Architecture
Modern Data Architecture Modern Data Architecture
Modern Data Architecture
Mark Hewitt
 
Edelman's Social Intelligence Command Center (SICC)
Edelman's Social Intelligence Command Center (SICC)Edelman's Social Intelligence Command Center (SICC)
Edelman's Social Intelligence Command Center (SICC)
Edelman Digital
 

What's hot (20)

Augmented Analytics and Automation in the Age of the Data Scientist
Augmented Analytics and Automation in the Age of the Data ScientistAugmented Analytics and Automation in the Age of the Data Scientist
Augmented Analytics and Automation in the Age of the Data Scientist
 
From Information to Insight: Data Storytelling for Organizations
From Information to Insight: Data Storytelling for OrganizationsFrom Information to Insight: Data Storytelling for Organizations
From Information to Insight: Data Storytelling for Organizations
 
INSTAGRAM_DATA_ANALYSIS.PPT.pptx
INSTAGRAM_DATA_ANALYSIS.PPT.pptxINSTAGRAM_DATA_ANALYSIS.PPT.pptx
INSTAGRAM_DATA_ANALYSIS.PPT.pptx
 
Building a Data-Driven Culture
Building a Data-Driven CultureBuilding a Data-Driven Culture
Building a Data-Driven Culture
 
Data Engineering and the Data Science Lifecycle
Data Engineering and the Data Science LifecycleData Engineering and the Data Science Lifecycle
Data Engineering and the Data Science Lifecycle
 
How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model How to Build & Sustain a Data Governance Operating Model
How to Build & Sustain a Data Governance Operating Model
 
Impuls-Vortrag Data Strategy
Impuls-Vortrag Data StrategyImpuls-Vortrag Data Strategy
Impuls-Vortrag Data Strategy
 
Cornerstone of Future Growth - Ecosystems
Cornerstone of Future Growth - EcosystemsCornerstone of Future Growth - Ecosystems
Cornerstone of Future Growth - Ecosystems
 
Presentation on Business Intelligence (BI)
Presentation on Business Intelligence (BI)Presentation on Business Intelligence (BI)
Presentation on Business Intelligence (BI)
 
Big Data Management: What's New, What's Different, and What You Need To Know
Big Data Management: What's New, What's Different, and What You Need To KnowBig Data Management: What's New, What's Different, and What You Need To Know
Big Data Management: What's New, What's Different, and What You Need To Know
 
Master Data Management – Aligning Data, Process, and Governance
Master Data Management – Aligning Data, Process, and GovernanceMaster Data Management – Aligning Data, Process, and Governance
Master Data Management – Aligning Data, Process, and Governance
 
Data Modeling is Data Governance
Data Modeling is Data GovernanceData Modeling is Data Governance
Data Modeling is Data Governance
 
State of Data Governance in 2021
State of Data Governance in 2021State of Data Governance in 2021
State of Data Governance in 2021
 
Data modelling 101
Data modelling 101Data modelling 101
Data modelling 101
 
Developing a Data Strategy
Developing a Data StrategyDeveloping a Data Strategy
Developing a Data Strategy
 
Data Science presentation for elementary school students
Data Science presentation for elementary school studentsData Science presentation for elementary school students
Data Science presentation for elementary school students
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy — Practical Steps for Aligning with Busi...
 
Modern Data Architecture
Modern Data Architecture Modern Data Architecture
Modern Data Architecture
 
Edelman's Social Intelligence Command Center (SICC)
Edelman's Social Intelligence Command Center (SICC)Edelman's Social Intelligence Command Center (SICC)
Edelman's Social Intelligence Command Center (SICC)
 

Using Web Data Provenance for Quality Assessment

  • 1. Using Web Data Provenance for Quality Assessment Olaf Hartig* Jun Zhao˚ *Humboldt-Universität zu Berlin ˚University of Oxford
  • 2. Information Quality (IQ)
      ● Common definition: fitness for use of information
      ● Multidimensional concept
          Category*          Criteria / Dimensions
          Intrinsic          Accuracy, Believability, Objectivity, ...
          Contextual         Completeness, Relevance, Timeliness, ...
          Representational   Conciseness, Understandability, ...
          Accessibility      Availability, Security, ...
          *Classification by Wang and Strong, 1996
      ● IQ criteria not independent of each other
      ● Relevancy of criteria determined by task and preferences
      Olaf Hartig - Using Web Data Provenance for Quality Assessment
  • 3. IQ Assessment
      ● Assigning numerical values (IQ scores) to IQ criteria
      ● It is difficult!
          ● Precision vs. Practicality
      Manual methods
          ● Questionnaires
      Semi-automatic methods
          ● Rating-based
          ● Reputation-based
  • 4. Automated IQ Assessment
      ● Literature only outlines ideas for automatic methods
      ● Content analysis
          ● Comparison (e.g. outlier detection)
          ● Application of information retrieval methods
          ● Analysis of results from data cleansing
          ● Sampling techniques
      ● Context analysis
          ● Analysis of metadata
          ● Utilization of domain knowledge
  • 5. Our Goal: Methods to automatically assess IQ criteria of Web data
      Primary means: Provenance of assessed data
  • 6. Outline
      1. Web Data Provenance
      2. General Assessment Approach
      3. Development of Assessment Methods
  • 7. Existing Provenance Research
      ● Main research areas: (scientific) workflows, DBMSs
      ● General focus: data creation
  • 8. Provenance of Web Data
  • 9. Provenance of Web Data
      Web data provenance comprises two dimensions:
          ● Data Creation
          ● Data Access
  • 10. Model of Web Data Provenance
      ● Provenance graph describes provenance of a data item
      ● Nodes: provenance elements (pieces of provenance info)
      ● Edges: relate provenance elements to each other
      ● Subgraphs for related data items possible
  • 11. Model of Web Data Provenance
      ● Provenance model defines:
          ● Types of provenance elements: Actors, Executions, Artifacts
          ● Relationships
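The graph model described on slides 10 and 11 can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the class names `ProvElement` and `ProvGraph`, the `kind` and `elem_type` fields, and the method names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProvElement:
    """One node of a provenance graph (hypothetical class name)."""
    name: str       # identifier of the element, e.g. "creator"
    kind: str       # "actor", "execution", or "artifact"
    elem_type: str  # e.g. "Data Creator"


@dataclass
class ProvGraph:
    """Provenance elements plus labeled, directed edges between them."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add(self, element: ProvElement) -> ProvElement:
        self.nodes[element.name] = element
        return element

    def relate(self, source: str, label: str, target: str) -> None:
        # Only elements that are already part of the graph may be related.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before relating them")
        self.edges.append((source, label, target))

    def related(self, source: str, label: str) -> list:
        """Names of all elements reachable from `source` via `label`."""
        return [t for s, l, t in self.edges if s == source and l == label]


graph = ProvGraph()
graph.add(ProvElement("item", "artifact", "Data Item"))
graph.add(ProvElement("creation", "execution", "Data Creation"))
graph.add(ProvElement("creator", "actor", "Data Creator"))
graph.relate("item", "created by", "creation")
graph.relate("creation", "performed by", "creator")
```

Keeping edges as plain `(source, label, target)` triples makes it easy to attach subgraphs for related data items later, since a subgraph is just another set of triples over the same node names.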
  • 12. Data Access Dimension
      [Diagram] Provenance element types: Data Item; Document; Data Access (with Execution Time); Data Accessor (Non-Human); Data Providing Service (Non-Human); Service Provider; Data Publisher (Human); Relation to the provided Information Resource
      Relationship labels: contains, retrieved by, performs, accessed, controls, uses
  • 13. Data Access Dimension cont.
      [Diagram] Provenance element types: (Verified) Artifact; Integrity Verification; Verification Result; Signature Verification; Signer; Signature Method; Relation to the signed Data
      Constraint shown in the diagram: {incomplete}
  • 14. Data Creation Dimension
      [Diagram] Provenance element types: Data Item; (Encompassing) Data Item; Source Data; Data Creation (with Execution Time); Creation Guidelines; Data Creator (Human or Non-human) {complete,disjoint}; Data Creating Device (e.g. Sensor); Data Creating Service (e.g. Software Agent); Data Creating Entity (e.g. Person, Group, Orga.); Provenance Information; Relation to the created Data
      Relationship labels: part of, responsible for
  • 15. Outline
      1. Web Data Provenance
      2. General Assessment Approach
      3. Development of Assessment Methods
  • 16. A General Approach
      ● Blueprint for actual assessment methods that
          ● Address specific scenario
          ● Focus on specific IQ criterion
      ● Provenance elements have an influence on IQ
      ● Impact values represent these influences
      ● Assessment is affected by knowing about the influences
      ● Calculation of the IQ score with an assessment function that combines all impact values
  • 17. General Assessment Procedure
      Step 1: Generate a provenance graph for the data item
      Step 2: Annotate the provenance graph with impact values
      Step 3: Execute the assessment function
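The three steps can be read as a small pipeline in which each step is a pluggable function. The sketch below only illustrates that structure: the function name `assess`, the dictionary-based graph, the toy impact values, and the use of `min` as an assessment function are all assumptions for the example and are not taken from the slides.

```python
from typing import Any, Callable, Dict

Graph = Dict[str, Any]


def assess(
    data_item: str,
    generate_graph: Callable[[str], Graph],
    annotate: Callable[[Graph], Graph],
    assessment_function: Callable[[Graph], float],
) -> float:
    """Run the three steps in order and return the IQ score."""
    graph = generate_graph(data_item)      # Step 1: provenance graph
    annotated = annotate(graph)            # Step 2: impact values
    return assessment_function(annotated)  # Step 3: IQ score


# Toy stand-ins for the three steps; an actual method replaces each of them.
def toy_generate(data_item: str) -> Graph:
    return {"item": data_item, "elements": ["creator", "publisher"]}


def toy_annotate(graph: Graph) -> Graph:
    # Hypothetical impact values in [0, 1], one per provenance element.
    return {**graph, "impact": {"creator": 0.8, "publisher": 0.6}}


def toy_score(graph: Graph) -> float:
    # Hypothetical assessment function: the weakest element decides.
    return min(graph["impact"].values())


score = assess("msr", toy_generate, toy_annotate, toy_score)
```

Passing the steps in as functions mirrors the idea of a blueprint: a concrete method for a given scenario and IQ criterion supplies its own three functions.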
  • 18. Outline
      1. Web Data Provenance
      2. General Assessment Approach
      3. Development of Assessment Methods
  • 19. Designing Assessment Methods
      ● Developing the general approach into an actual method
      ● Fundamental design question: For which IQ criterion do we want to apply the method?
  • 20. Designing Assessment Methods
      ● Developing the general approach into an actual method
      ● Fundamental design question: For which IQ criterion do we want to apply the method?
          ● Timeliness: degree to which the data item is up-to-date with respect to the task at hand
          ● Representation* as an absolute measure in [0,1]
              ● 1: meeting the strictest timeliness standards
              ● 0: unacceptable
          *Following Ballou et al., 1998
  • 21. Step 1: Generate the Provenance Graph
      What types of provenance elements are necessary?
      What level of detail (i.e. granularity) is necessary?
      Where and how do we get provenance information?
      ● Two complementary options:
          ● Recording
          ● Analyzing metadata
  • 22. Step 1: Generate the Provenance Graph
      Example:
      ● Sensors (e.g. sensor1) take a measurement (e.g. msr) every hour
      ● All measurements are stored in a Web-accessible storage device (store)
      ● Our system (sys) accesses them for further processing
      ● sys assesses the timeliness of every msr
  • 23. Step 1: Generate the Provenance Graph
      Provenance graph for the example:
          msr (type: Data Item) --created by--> cExc (type: Data Creation, Execution Time: 10:00)
          cExc --performed by--> sensor1 (type: Data Creator)
          msr --contained by--> doc (type: Document)
          doc --retrieved by--> aExc (type: Data Access, Execution Time: 10:13)
          aExc --performed by--> sys (type: Data Accessor)
          aExc --accessed--> store (type: Data Providing Service)
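The example graph can be written down as plain data. The sketch below is one possible encoding: the node names, types, and execution times come from the slide, while the edge directions are a reading of the flattened diagram, and the dictionary layout and the helper `follow` are assumptions for illustration.

```python
# Node attributes of the example graph (names and times as on the slide).
nodes = {
    "msr":     {"type": "Data Item"},
    "cExc":    {"type": "Data Creation", "execution_time": "10:00"},
    "sensor1": {"type": "Data Creator"},
    "doc":     {"type": "Document"},
    "aExc":    {"type": "Data Access", "execution_time": "10:13"},
    "sys":     {"type": "Data Accessor"},
    "store":   {"type": "Data Providing Service"},
}

# Labeled edges as (source, label, target) triples.
edges = [
    ("msr", "created by", "cExc"),
    ("cExc", "performed by", "sensor1"),
    ("msr", "contained by", "doc"),
    ("doc", "retrieved by", "aExc"),
    ("aExc", "performed by", "sys"),
    ("aExc", "accessed", "store"),
]


def follow(start, *labels):
    """Walk from `start` along the given edge labels.

    Returns the name of the node reached, or None if a step has no edge.
    """
    current = start
    for label in labels:
        targets = [t for s, l, t in edges if s == current and l == label]
        if not targets:
            return None
        current = targets[0]
    return current


creator = follow("msr", "created by", "performed by")
accessor = follow("msr", "contained by", "retrieved by", "performed by")
service = follow("msr", "contained by", "retrieved by", "accessed")
```

Here `creator` is the sensor that produced the measurement, while `accessor` and `service` belong to the data access dimension of the same data item.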
  • 24. Step 2: Annotation with Impact Values
      How might each provenance element influence the IQ criterion?
      ● Systematically analyze each type of provenance element
      What kind of impact values are necessary?
      How do we represent the influences by impact values?
      ● Impact values not necessarily numerical
      ● Depends on the assessment function in step 3
      How do we determine impact values?
  • 25. Determining Impact Values
      ● From the provenance information
      ● From user input
          ● Configuration options
          ● Rating-based, Reputation-based
      ● By content analysis
          ● Comparison (e.g. outlier detection)
          ● Adoption of information retrieval methods
          ● Adoption of data cleansing techniques
      ● By context analysis
          ● Further metadata
          ● Domain knowledge
  • 26. Step 2: Annotation with Impact Values
      How might each provenance element influence the IQ criterion?
      Data Creation Dimension:
          Prov. Element Type     Impact Values
          Data Creation          creation time, weights
          Creation Guidelines    -
          (Source) Data Item     expiry time
          Data Creator           -
  • 27. Step 2: Annotation with Impact Values
      Example provenance graph of msr, shown next to the impact values for the data creation dimension:
          Prov. Element Type     Impact Values
          Data Creation          creation time, weights
          Creation Guidelines    -
          (Source) Data Item     expiry time
          Data Creator           -
  • 28. Step 2: Annotation with Impact Values
      ● cExc (type: Data Creation) annotated with creation time: 10:00
  • 29. Step 2: Annotation with Impact Values
      ● cExc (type: Data Creation) annotated with creation time: 10:00
      ● msr (type: Data Item) annotated with expiry time: 11:00
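The annotation step for the example could look roughly like the sketch below. The creation time is read from the execution time of the data creation, as on the slides. How the expiry time of 11:00 is obtained is not stated there; deriving it from a one-hour validity period (`VALIDITY`) is an assumption made here because the sensors measure hourly. The calendar date is arbitrary, since the slides give times of day only, and the function name `annotate` is hypothetical.

```python
from datetime import datetime, timedelta

# Relevant part of the example graph; the date 2000-01-01 is arbitrary.
nodes = {
    "msr":  {"type": "Data Item"},
    "cExc": {"type": "Data Creation",
             "execution_time": datetime(2000, 1, 1, 10, 0)},
}
edges = [("msr", "created by", "cExc")]

# Assumed configuration option: a measurement expires after one hour.
VALIDITY = timedelta(hours=1)


def annotate(nodes, edges, validity):
    """Return impact values per provenance element, leaving the graph as is."""
    impact = {}
    for source, label, target in edges:
        if label != "created by":
            continue
        created = nodes[target].get("execution_time")
        if created is None:
            continue  # fragmentary provenance: no annotation possible
        impact[target] = {"creation_time": created}
        impact[source] = {"expiry_time": created + validity}
    return impact


impact = annotate(nodes, edges, VALIDITY)
```

Storing the impact values in a separate dictionary keeps the provenance graph itself unchanged, so the same graph can be annotated differently for another IQ criterion.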
  • 30. Step 3: Assessment Function
      How do we represent the IQ criterion by an IQ score?
      What does the assessment function look like?
      ● Develop the function together with the impact values
      ● Take incompleteness into consideration
          ● Provenance graphs could be fragmentary
          ● Annotations could be missing
  • 31. Step 3: Assessment Function
  • 32. Step 3: Assessment Function
      Annotated provenance graph of msr (creation time: 10:00, expiry time: 11:00)
  • 33. Step 3: Assessment Function
      Annotated provenance graph of msr (creation time: 10:00, expiry time: 11:00)
  • 34. Step 3: Assessment Function
      t(msr) = 1 - (10:15 - 10:00) / (11:00 - 10:00) = 1 - 0.25h / 1h = 0.75
      Annotated provenance graph of msr (creation time: 10:00, expiry time: 11:00; data access at 10:13)
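The calculation on slide 34 can be reproduced with a few lines of code. The sketch assumes that 10:15 is the time at which the assessment takes place, which the slide does not state explicitly, and it clamps the result to [0, 1] to match the value range given for timeliness earlier in the talk. The function name `timeliness` and the arbitrary calendar date are illustrative choices.

```python
from datetime import datetime


def timeliness(now, creation_time, expiry_time):
    """Compute 1 - age / lifetime, clamped to the interval [0, 1]."""
    lifetime = expiry_time - creation_time
    if lifetime.total_seconds() <= 0:
        raise ValueError("expiry time must be later than creation time")
    age = now - creation_time
    score = 1.0 - age / lifetime  # timedelta / timedelta yields a float
    return max(0.0, min(1.0, score))


day = (2000, 1, 1)  # arbitrary date; the slides give times of day only
created = datetime(*day, 10, 0)
expires = datetime(*day, 11, 0)

t_msr = timeliness(datetime(*day, 10, 15), created, expires)      # 0.75
t_expired = timeliness(datetime(*day, 11, 30), created, expires)  # 0.0
```

With the values from the example, a quarter of the one-hour lifetime has passed, which gives the score of 0.75 shown on the slide. Once the expiry time has passed, the clamped score stays at 0.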
  • 35. Conclusion
      ● Web Data Provenance (data creation + data access)
      ● General approach for provenance-based IQ assessment
          ● Impact values: influence of provenance elements on IQ
      ● Design decisions for actual assessment methods
          ● Application to timeliness (more in the paper)
      ● Future work:
          ● How do we deal with incompleteness?
          ● Application of the approach to other IQ criteria
  • 36. These slides have been created by Olaf Hartig http://olafhartig.de
      This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License (http://creativecommons.org/licenses/by-sa/3.0/)
      Attribution:
      ● http://www.flickr.com/photos/rrrrred/3809362767/
      ● http://www.hasslefreeclipart.com