Customer needs for Data quality by Irene Polikoff

https://2016.semantics.cc/satellite-events/data-quality-tutorial

  1. © Copyright 2016 TopQuadrant Inc. Slide 1: Customer Needs for Data Quality
     Irene Polikoff, CEO; Ralph Hodgson, CTO; TopQuadrant
  2. Slide 2: TopBraid Enterprise Solutions (diagram labels)
     Your Enterprise Solutions; Customize/Configure Your Own Solutions and Platform IDE; TopBraid Platform Solution Engine; Search / Content Enrichment through the use of Taxonomies and Ontologies; Data Governance: Reference Data Management / Metadata Management / Data Lineage; Data Layer
  3. Slide 4: What is Data Quality?
     • The five C’s:
       – Consistency
       – Completeness
       – Correctness
       – Conformance
       – Comprehensibility
     • Plus:
       – Precision
       – Temporality
  4. Slide 5: Examples of where TopQuadrant has met the needs for Data Quality
     • Consumer Products – Clearance in different markets
     • Production Reporting – Oil & Gas
     • Asset Management – V-CON project
     • Regulatory Compliance – Finance Sector
  5. Slide 6: Common Issues with ‘self-created’ RDF Data
     • Careless URIs, e.g., skos:label
     • Incorrect use of predicates, e.g., skos:broader with a text value
     • Missing rdf:type statements
     • Inconsistent literals, e.g., text versus integer
     • Malformed strings
     • Conflated values
     • Inconsistent Units of Measure
  6. Slide 7: After initial load, data quality is about enforcing “required practices”
     • Each organization will have its own
     • Common themes are:
       – Requiring some fields
       – Capitalizing names
       – Enforcing certain patterns (what characters are allowed)
       – Enforcing “permissible” values
       – Complex rules with dependencies between fields
       – Totally “closed world”
  7. Slide 8: Quality-enabling tool support
     • Form generation based on:
       – class definition
       – SHACL constraints
     • Auto-completion of entries
     • Cardinality enforcement
     • Data type enforcement – SHACL + QUDT
  8. Slide 9: As an example – definition of a class
  9. Slide 10: As an example – resulting ‘instance’ form
  10. Slide 11: In some cases, enforcement is “soft”
  11. Data validation happens in real time, but also “after the fact”. For information governed by EDG, e.g., reference data, glossary terms, etc.
  12. It is an ongoing process summarized in dashboards and metrics
  13. Slide 14: We have been transitioning to using SHACL for class definitions and UI customizations
     • Users can now create not only classes and properties, but also SHACL constraints
  14. Ask. Thank You.
     • Ralph Hodgson – E-mail: rhodgson@topquadrant.com, Twitter: @ralphtq, @topquadrant
     • Irene Polikoff – E-mail: irene@topquadrant.com
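
Three of the issues listed on Slide 6 (careless URIs, a predicate such as skos:broader given a text value, and missing rdf:type statements) can be detected mechanically. The following is a minimal sketch, not TopQuadrant's implementation. It assumes a simplified, hypothetical triple model of `(subject, predicate, object)` tuples in which each object is tagged as `"uri"` or `"literal"`, and the names `SKOS_TERMS`, `OBJECT_PROPERTIES`, and `find_issues` are illustrative only.

```python
# Hypothetical, simplified triple model: (subject, predicate, object),
# where object is a ("uri", value) or ("literal", value) pair.
# A small, deliberately incomplete subset of real SKOS terms.
SKOS_TERMS = {"skos:Concept", "skos:prefLabel", "skos:altLabel",
              "skos:broader", "skos:narrower"}
# Predicates whose values should be resources, not text.
OBJECT_PROPERTIES = {"skos:broader", "skos:narrower"}


def find_issues(triples):
    """Return (subject, predicate, message) tuples for suspicious triples."""
    issues = []
    typed = {s for s, p, _ in triples if p == "rdf:type"}
    for s, p, (kind, _value) in triples:
        # Careless URI: a term that looks like SKOS but is not in the vocabulary.
        if p.startswith("skos:") and p not in SKOS_TERMS:
            issues.append((s, p, "unknown SKOS term"))
        # Incorrect predicate use: text where a resource is expected.
        if p in OBJECT_PROPERTIES and kind == "literal":
            issues.append((s, p, "literal used where a resource is expected"))
    # Missing rdf:type statements.
    for s in sorted({s for s, _, _ in triples} - typed):
        issues.append((s, "rdf:type", "missing rdf:type statement"))
    return issues


triples = [
    ("ex:Cat", "rdf:type", ("uri", "skos:Concept")),
    ("ex:Cat", "skos:label", ("literal", "Cat")),
    ("ex:Cat", "skos:broader", ("literal", "Animal")),
    ("ex:Dog", "skos:prefLabel", ("literal", "Dog")),
]
issues = find_issues(triples)
for issue in issues:
    print(issue)
```

A real pipeline would parse the data with an RDF library and check terms against the full vocabulary; the tagged tuples here only keep the sketch self-contained.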

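The “required practices” themes on Slide 7 can be illustrated with a small record validator. This is a sketch under stated assumptions: the field names, the `CODE_PATTERN` format, the permissible status values, and the rule tying `end_date` to a retired status are all hypothetical examples, since each organization defines its own.

```python
import re

# All rules below are hypothetical examples of organization-specific practices.
ALLOWED_FIELDS = {"name", "code", "status", "end_date"}  # "closed world"
REQUIRED_FIELDS = ("name", "code", "status")
ALLOWED_STATUS = {"active", "retired"}                   # permissible values
CODE_PATTERN = re.compile(r"[A-Z]{2}-\d{4}")             # allowed characters


def validate(record):
    """Return a list of human-readable rule violations for one record."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            errors.append(f"{field}: required")
    name = record.get("name", "")
    if name and not name[0].isupper():
        errors.append("name: must be capitalized")
    code = record.get("code", "")
    if code and not CODE_PATTERN.fullmatch(code):
        errors.append("code: does not match pattern")
    status = record.get("status", "")
    if status and status not in ALLOWED_STATUS:
        errors.append("status: not a permissible value")
    # Dependency between fields: retired records need an end date.
    if status == "retired" and not record.get("end_date"):
        errors.append("end_date: required when status is retired")
    # Closed world: anything not explicitly allowed is rejected.
    for field in sorted(set(record) - ALLOWED_FIELDS):
        errors.append(f"{field}: unknown field")
    return errors


good = {"name": "Pump", "code": "AB-1234", "status": "active"}
bad = {"name": "pump", "code": "ab-12", "status": "retired", "colour": "red"}
print(validate(good))
print(validate(bad))
```

Returning a list of messages rather than raising on the first failure lets the same function serve both strict and “soft” enforcement, where violations are reported but the record is still accepted.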
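The cardinality and data type enforcement mentioned on Slide 8 can be sketched in plain Python. The `SHAPE` dictionary below is a hypothetical stand-in, loosely analogous to SHACL's `sh:minCount`, `sh:maxCount`, and `sh:datatype`; it is not SHACL syntax, the property names are invented for illustration, and unit checking of the kind QUDT supports is not covered.

```python
# Hypothetical shape: property name -> expected Python type and cardinality.
SHAPE = {
    "label": {"datatype": str, "min_count": 1, "max_count": 1},
    "depth_m": {"datatype": float, "min_count": 0, "max_count": 1},
}


def check(instance, shape=SHAPE):
    """Return violations for an instance whose property values are lists."""
    violations = []
    for prop, rule in shape.items():
        values = instance.get(prop, [])
        # Cardinality enforcement.
        if len(values) < rule["min_count"]:
            violations.append(f"{prop}: fewer than {rule['min_count']} value(s)")
        if len(values) > rule["max_count"]:
            violations.append(f"{prop}: more than {rule['max_count']} value(s)")
        # Data type enforcement.
        for value in values:
            if not isinstance(value, rule["datatype"]):
                violations.append(
                    f"{prop}: {value!r} is not {rule['datatype'].__name__}")
    return violations


ok = {"label": ["Well A"], "depth_m": [1200.5]}
broken = {"depth_m": ["deep", 3.0]}
print(check(ok))
print(check(broken))
```

The same shape dictionary could also drive form generation, for example by rendering a single required text field for `label`, which is the idea behind generating forms from class definitions and constraints.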