A Role for Provenance in Quality Assessment


Published in: Technology
  • In this talk I will explain why quality assessment is needed, describe how quality is perceived, outline our approach to quality assessment, provide an example scenario, and outline our future work.
  • We don’t know whether information is accurate, so we need to assess it. The Web has evolved into an open platform. The Web is big, so we need a smaller platform for evaluation.
  • Consider mobile phones providing passenger information regarding the location of buses. Sometimes we get lucky and observations land right on the bus route. However, there are many different sources of low-quality data: inaccurate GPS readings; malicious users (someone playing with the app while at home); and people who make mistakes (someone perhaps on the wrong bus).
  • Animate this: ObservationValue -> [motivate SSN here] Observation + foi -> disruption report
  • DataRequirement1 -> wasAttributedTo -> Agent

    1. A Role for Provenance in Quality Assessment. Chris Baillie, Pete Edwards, and Edoardo Pignotti. c.baillie@abdn.ac.uk
    2. Overview: Motivation; Evaluating Data Quality; A Role for Provenance; Future Work.
    3. Motivation. “we don’t know whether the information we find [on the Web] is accurate or not. So we have to teach people how to assess what they’ve found” (Vint Cerf, 2010). The Web of Documents has become the Web of documents, services, data, and people. Anyone can publish anything, so we need a way to evaluate quality. We are investigating these issues within the Internet of Things: sensors are now at the centre of many applications.
    4. Example Scenario
    5. Evaluating Data Quality. Quality is a multi-dimensional construct. To evaluate quality, we must examine the context around data. The WIQA Framework examines data content, context, and external ratings (Bizer et al. 2009). Fürber and Hepp (2011) use rules to identify quality problems. [Diagram: an Entity (and its context) and the Data Requirements are combined as F(E, R) = Q to produce Quality Scores for Accuracy, Timeliness, and Relevance.]
    6. Representing Sensor Observations. Linked Data: “recommended best practice for exposing, sharing, and connecting pieces of data using URIs and RDF”
    7. Performing Quality Assessment. Relevance rule: R_relevance = 1 - (E_distanceFromRoute / 100). As a SPARQL rule: CONSTRUCT { _:b0 a QualityScore . _:b0 score ?qs . _:b0 dqm:ruleViolation _:b1 . _:b1 a DataRequirementViolation . _:b1 dqm:affectedInstance ?instance . } WHERE { ?instance a Observation . ?instance distanceFromRoute ?distance . LET (?qs := (1 - (?distance / 100))) . }
    8. Quality Assessment Results
    9. Observation Provenance. Provenance is a critical part of observation context. It describes the entities, agents, and activities involved in data creation: How was the observation value measured? Who controlled the sensing process? How has the observation been transformed since it was created? The W3C PROV-O model provides a linked data representation of provenance.
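    10. Observation Provenance. [Provenance graph: Entity "Observation 2" wasGeneratedBy Activity "Map matching"; Activity "Map matching" used Entity "Observation 1"; Entity "Observation 1" wasGeneratedBy Activity "Sensing Process"; Activity "Sensing Process" used Entity "iPhoneSensor"; a wasAssociatedWith edge links to Agent "Chris".]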
    11. Quality Score Provenance
    12. Work To Date. Developed a Quality Assessment Framework that enables: linked data representation of sensor observations; definition of quality requirements using SPARQL rules; generation of quality scores via reasoning. Future Work. Implementation of quality rules that examine provenance; investigation of quality score re-use.
    13. Any questions? Come and see the IRP demo (D9) to see quality assessment in action.
    14. Implementation. [Architecture diagram; labels: Quality Rules (Relevance Rule, Timeliness Rule, Accuracy Rule, Availability Rule); Reasoner (SPIN); Observation Triple Store; Apache Tomcat; Observation Service; Quality Service.]
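
The relevance rule on slide 7 can be read as plain arithmetic over one observation. The following is a minimal Python sketch of that arithmetic only, not the authors' SPIN implementation. The `Observation` and `QualityScore` classes, the `relevance_score` function, and the `urn:example:` identifiers are hypothetical names introduced for illustration; the constant 100 comes from the slide, whose distance unit is not stated.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical stand-in for an observation; holds only what the rule reads."""
    uri: str
    distance_from_route: float  # same (unstated) unit as the slide's constant 100


@dataclass
class QualityScore:
    """Loosely mirrors the CONSTRUCT template: a score tied to an instance."""
    dimension: str
    score: float
    affected_instance: str


def relevance_score(obs: Observation, scale: float = 100.0) -> QualityScore:
    """Compute relevance as 1 - (distance / scale), as in the slide's LET clause."""
    qs = 1 - (obs.distance_from_route / scale)
    return QualityScore("relevance", qs, obs.uri)


if __name__ == "__main__":
    # Illustrative observations only; these are not data from the talk.
    for obs in (
        Observation("urn:example:obs-on-route", 0.0),
        Observation("urn:example:obs-nearby", 25.0),
    ):
        print(relevance_score(obs))
```

As written on the slide, the formula does not bound the score: any observation further than 100 units from the route receives a negative relevance, and the sketch keeps that behaviour rather than clamping.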