The Great Data Debate (3) ISO8000: Systemic and systematic data quality, T.King
This presentation was given at a joint BCS/DAMA event on 20 June 2013 that discussed different aspects of assessing data quality and the role that data quality dimensions can play. The presenter, Tim King of LSC Group, gave an overview of ISO 8000 and the standards perspective on assessing data quality.
The video of this presentation is available at https://www.youtube.com/watch?v=kftnEO_A49c

Presentation Transcript

  • The Great Data Debate – Do data quality dimensions have a place in assessing data quality? DAMA UK/ BCS Data Management Specialist Group – 20th June 2013
  • ISO 8000: Systemic and systematic data quality 03 Tim King, LSC Group
  • ISO 8000: Systemic & systematic data quality
    • Dr. Timothy M. KING CEng CITP FIMechE FBCS DIC ACGI IKM
    • Principal Consultant, LSC Group
    • Convenor, ISO/TC184/SC4/WG13
    • DAMA / BCS DMSG: Do data quality dimensions have a place in assessing data quality?
    • 2013-06-20
  • The context
    • ISO/TC184/SC4 – "Industrial data"
      – sub-committee of ISO/TC184 – "Automation systems & integration"
      – founded July 1984
    • standards for exchange, sharing & archiving of industrial data
      – ISO 10303 – Product data representation & exchange
      – ISO 13584 – Parts library
      – ISO 15531 – Industrial manufacturing management data
      – ISO 15926 – Integration of life-cycle data for process plants
      – ISO 16739 – Data sharing in the construction & facility management industries
      – ISO 17506 – 3D visualization of industrial data
      – ISO 18629 – Process specification language
      – ISO 18876 – Integration of industrial data for exchange, access & sharing
      – ISO 22745 – Open technical dictionaries & their application to master data
      – ISO 29002 – Exchange of characteristic data
    • ISO/TC184/SC4/WG13 "Industrial data quality": developing ISO 8000 "Data quality" since 2006
  • ISO/TC184/SC4/WG13
    • "Industrial data quality"
    • founded 2006
    • three face-to-face meetings per year
      – two in parallel with parent committee ISO/TC184/SC4
    • teleconference calls using Webex
      – provided by ISO with free dial-in capability for all participants
    • e-mail distribution list
      – 150+ experts (including academics, engineers, scientists, consultants)
      – 20+ countries
      – manufacturing, logistics, mining, health, finance
    • typical attendance at meetings of 15 to 20 individuals
  • What is data quality?
  • What is data quality?
    • The Mars Climate Orbiter ... lost upon entry into orbit around Mars
    • the Executive Summary from the Mishap Investigation Board identified that the primary cause of the accident was a data quality issue …
      – "thruster performance data in English units was used … the data … was required to be in metric units per existing software interface documentation"
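
The unit mismatch quoted above can be illustrated with a value that carries its unit explicitly. This is a minimal sketch, not a description of the actual flight or ground software: the slide names only "English" and "metric" units, so the choice of impulse in pound-force seconds versus newton seconds, the `Impulse` class, the unit labels and the `accept` function are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Exact by definition: 1 lbf = 4.4482216152605 N, so 1 lbf*s = 4.4482216152605 N*s.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """A value that carries its unit, so the unit is never left implicit."""
    value: float
    unit: str  # hypothetical unit labels: "N*s" or "lbf*s"

    def in_newton_seconds(self) -> float:
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unknown unit: {self.unit!r}")


def accept(reading: Impulse, required_unit: str = "N*s") -> float:
    """Reject data whose unit differs from the one the interface requires."""
    if reading.unit != required_unit:
        raise ValueError(
            f"interface requires {required_unit}, got {reading.unit}"
        )
    return reading.value


reading = Impulse(10.0, "lbf*s")
try:
    accept(reading)
except ValueError as err:
    print("rejected:", err)

# An explicit conversion makes the data conform to the stated requirement.
converted = Impulse(reading.in_newton_seconds(), "N*s")
print(accept(converted))
```

The point of the design is that the requirement ("metric units") is checked at the interface rather than assumed by the consumer.
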
  • What is data quality?
    • spare part in warehouse but not recorded in computer
      – number in stock = 0
    • data has no sensible interpretation
      – length of bolt = "green"
      – self-intersecting curve in CAD file
  • What is data quality?
    • ISO/IEC 25012 (Software engineering data quality model)
    • ISO/IEC 15288 (Systems engineering)
    • Accenture
    • US Defense Logistics Information Service
    • Butler Group
    • Korean Database Promotion Centre
    • Shell
    • UK MOD Acquisition Management System
    • DGIQ (German Data & Information Quality Association)
    • IAIDQ (International Association for Information & Data Quality)
  • What is data quality?
    accessibility, accessibility / security, accuracy, appropriate amount of data, authenticity, availability, believability, changeability, clarity, compatibility, complete, completeness, compliance, concise representation, conciseness, confidential, confidentiality, conformance with business rules, congruity, consistency, consistent representation, correctness, cost / benefit, credibility, currency, current, currentness, ease of manipulation, efficiency, flexibility, free of error, inaccurate, integrity, interpretability, legible, liability, necessity, objectivity, outdated, portability, precision, protection, recoverability, redundancy, redundant, referential integrity, relevance, relevancy, relevant, reputation, retrievability, safety, security, sufficiency, timeliness, timeliness / timely, traceability, unanimity, understandability, usability, utility, utilization, validity, validity of data content, validity of format, value added, verifiable
  • ISO/IEC 25012 (Software engineering data quality model): same list of terms as on the previous slide
  • IAIDQ (International Association for Information & Data Quality): same list of terms as on the previous slide
  • What is data quality? ISO/IEC 25012 (Software engineering data quality model) and IAIDQ (International Association for Information & Data Quality), each shown with the same list of terms
  • What is data quality?
  • The fundamentals of quality
    • ISO 9000:2005: a process-based quality management system
    • [Diagram labels: continual improvement of the quality management system; customer; accountability; management responsibility; resource management; measurement, analysis & improvement; requirements; input; product realization; output; product; satisfaction]
  • Information & data quality
    • ISO 9000:2005: a process-based quality management system (same diagram), annotated as follows
      – for data processes, "product" is data
      – product quality is conformance to requirements; data quality is conformance to data requirements
      – a process focus is the basis on which to build in quality
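
If data quality is conformance to data requirements, it can be measured once those requirements are written down explicitly. The following is a minimal sketch under stated assumptions: the field names (`length_mm`, `in_stock`), the two requirements and the helper functions are hypothetical and do not come from ISO 8000 or ISO 9000; the bolt example is borrowed from an earlier slide.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
# A requirement is a human-readable statement paired with a check.
Requirement = Tuple[str, Callable[[Record], bool]]


def length_is_positive_number(record: Record) -> bool:
    value = record.get("length_mm")
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and value > 0
    )


def stock_is_non_negative_integer(record: Record) -> bool:
    value = record.get("in_stock")
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


# Hypothetical data requirements for a bolt master-data record.
REQUIREMENTS: List[Requirement] = [
    ("length_mm is a positive number", length_is_positive_number),
    ("in_stock is a non-negative integer", stock_is_non_negative_integer),
]


def nonconformances(record: Record, requirements: List[Requirement]) -> List[str]:
    """Return the statements of all requirements the record fails."""
    return [name for name, check in requirements if not check(record)]


def conformance_rate(records: List[Record], requirements: List[Requirement]) -> float:
    """Fraction of records that satisfy every requirement."""
    if not records:
        return 1.0
    conforming = sum(1 for r in records if not nonconformances(r, requirements))
    return conforming / len(records)


records = [
    {"part": "bolt-1", "length_mm": 40.0, "in_stock": 12},
    {"part": "bolt-2", "length_mm": "green", "in_stock": 3},
]
for record in records:
    print(record["part"], nonconformances(record, REQUIREMENTS))
print(conformance_rate(records, REQUIREMENTS))
```

Keeping the requirements as data, separate from the code that applies them, is what makes the same measurement repeatable from one dataset to the next.
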
  • The different perspectives on information & data quality
    • business processes
      – the primary, core processes of interest to the user, involving making decisions & achieving outcomes for which the user is responsible
      – examples of these processes include designing an aircraft, recruiting a new member of staff, extinguishing a fire, manufacturing ice cream etc.
  • The different perspectives on information & data quality
    • business processes
    • information management
      – the means by which data are made available to ensure the right person at the right time can make the right decision as part of a particular business process
      – ISO 15288 identifies the following tasks as forming information management: generate, collect, transform, retain, retrieve, disseminate & dispose
    • DAMA-DMBOK Guide
      – data governance
      – data architecture management
      – data development
      – database operations management
      – data security management
      – reference & master data management
      – data warehousing & business intelligence management
      – document & content management
      – meta data management
      – data quality management
  • The different perspectives on information & data quality
    • business processes & information management
      – data enable processes
      – processes create data
    • resources (enable information management)
      – any component by which to achieve the required outcomes of information management
      – these resources include people, software & hardware
  • The different perspectives on information & data quality
    • business processes & information management
      – data enable processes
      – processes create data
    • resources enable information management
    • process focus: quality management & process maturity
      – ISO 9000
      – ISO 15504 (ISO 33000)
    • data focus: quality = conformance of data to requirements
      – three types of quality: syntactic, semantic, pragmatic
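
The slide names three types of quality without defining them. The sketch below uses working definitions that are assumptions for illustration: syntactic (the data conforms to its declared format), semantic (the data corresponds to what it represents) and pragmatic (the data is suitable for a particular purpose). The record layout and function names are hypothetical; the stock example is the one from the earlier slide.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def syntactic_ok(record: Record) -> bool:
    """Assumed definition: the value conforms to its declared format."""
    value = record.get("in_stock")
    return isinstance(value, int) and value >= 0


def semantic_ok(record: Record, counted_on_shelf: int) -> bool:
    """Assumed definition: the value matches the real-world thing it represents."""
    return record.get("in_stock") == counted_on_shelf


def pragmatic_ok(record: Record, needed_fields: List[str]) -> bool:
    """Assumed definition: the record carries everything a given purpose needs."""
    return all(record.get(field) is not None for field in needed_fields)


# The spare part is physically in the warehouse, but the computer says zero.
record = {"part": "spare-7", "in_stock": 0, "bin_location": None}

print(syntactic_ok(record))                               # well-formed value
print(semantic_ok(record, counted_on_shelf=1))            # disagrees with the shelf
print(pragmatic_ok(record, ["in_stock", "bin_location"])) # cannot be picked without a bin
```

A value such as `in_stock = 0` can therefore pass a format check and still be wrong, which is why the three checks are kept separate.
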
  • ISO 8000 – In-scope list
    • The following are within the scope of ISO 8000:
      – principles of data quality;
      – characteristics of data that determine its quality;
      – requirements for achieving data quality;
      – requirements for the representation of data requirements, measurement methods, and inspection results for the purposes of data quality;
      – frameworks for measuring and improving data quality.
  • The parts of ISO 8000
    • General
      – 1 Overview, principles & general requirements
      – 2 Terminology
      – 3 Taxonomy
    • Information & data focus
      – 8 Information quality: Concepts & measuring
      – 9 Information quality: Relationship to other standards
      – 10 Exchange of data: Syntax, semantic encoding & conformance to data specification
      – 20 Exchange of data: Provenance
      – 30 Exchange of data: Accuracy
      – 40 Exchange of data: Completeness
      – 100 Master data: Overview
      – 102 Master data: Terminology
      – 110 Master data: Exchange of characteristic data: Syntax, semantic encoding & conformance to data specification
      – 120 Master data: Provenance
      – 130 Master data: Accuracy
      – 140 Master data: Completeness
      – 311 Usage guide for ISO 10303-59 (Product data quality-shape)
    • Process focus
      – 60 Data quality management: The overview of process assessment
      – 61 Data quality management: Process reference model
      – 62 Data quality management: Process maturity assessment model
      – 63 Data quality management: Measurement framework
      – 150 Master data: Quality management framework
  • Some complications
    • "information" & "data" – definitions from ISO/IEC 2382-1:1993
      – data: "re-interpretable representation of information in a formalized manner suitable for communication, interpretation, or processing"
      – information: "knowledge concerning objects, such as facts, events, things, processes, or ideas, including concepts, that within a certain context has a particular meaning"
    • attributes? dimensions? does data have colour?
      – try reading warning notices in red text when wearing night vision goggles …
      – multiple layers to the issue
    • ISO/IEC 25012: "Software engineering data quality model"
  • Case study: Data quality requirements in master data management
  • ISO 8000 in implementation form (courtesy of PiLog)
    • ISO 8000-120 Master Data Warehouse: portable master data with provenance
    • ERP
    • Load Data
      – capture provenance data
      – map metadata to eOTD
      – convert to ISO 22745-40 data stream
    • ISO 22745 Managed Ontology
      – terminology
      – data requirements
      – classifications
      – description rules
    • Data Integration
    • Master Data Cleansing
      1. Identify reference data
      2. Identify or assign class
      3. Assign data requirement
      4. Map properties (attributes)
      5. Identify & standardize values
      6. Obtain missing data (enrich)
      7. Validate data
    • Create multilingual descriptions
    • Identify potential duplicates
    • ECCMA Managed Ontology
      – Terminology (eOTD)
      – Data requirements (eDRR)
      – Classifications (eCLR)
  • Rigorous statement & exchange of requirements
    • parties: Data requester, Data provider, Sub
    • Data requirement: eOTD-i-xml (ISO 22745-30)
    • Request for data: eOTD-q-xml (ISO 22745-35)
    • Data exchange: eOTD-r-xml (ISO 22745-40)
    • the request for data and the data exchange each appear twice in the diagram
  • Inventory rationalization as a result of ISO 8000
    • Common ERP descriptions
      – 52368965412 – Tire Bridgestone 435/95 R25
      – 56329845 – Tyre BS 435/R25 Standard Purpose E3 2 Star Radial
      – 125435 – Bridge Stone 25inch 435/95
      – 965123465 – Tyre Bridgestone Part Number 12345
    • Standardised Long Description: Tire: Pneumatic, Vehicular
      – Service Type for Which Designed: Loader
      – Tire Rim Nominal Diameter: 25'
      – Tire Width: 445mm
      – Aspect Ratio: 0.95
      – Tire Ply Arrangement: Radial
      – Ply Rating: 2*
      – Tire & Rim Association Number: E3
      – Tread Material: Standard
      – Tire Air Retention Method: Tubeless
      – Tire Load Index and Speed Symbol: NA
      – Tread Pattern: VHB
      – TKPH Rating: 80
    • Standardised Short Description: Tire Pneumatic: Loader 25' 445mm 0.95 2*
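
One way to read the slide above is that a standardised description is generated from property values by a description rule, rather than typed by hand. The sketch below shows that idea only; the rule format, the `SHORT_RULE` list and both functions are hypothetical and are not the ISO 22745 or eOTD mechanism. The property values are copied from the standardised long description on the slide.

```python
from typing import Dict, List

# Property values taken from the standardised long description on the slide.
TIRE: Dict[str, str] = {
    "Service Type for Which Designed": "Loader",
    "Tire Rim Nominal Diameter": "25'",
    "Tire Width": "445mm",
    "Aspect Ratio": "0.95",
    "Tire Ply Arrangement": "Radial",
    "Ply Rating": "2*",
    "Tire & Rim Association Number": "E3",
    "Tread Material": "Standard",
    "Tire Air Retention Method": "Tubeless",
    "Tire Load Index and Speed Symbol": "NA",
    "Tread Pattern": "VHB",
    "TKPH Rating": "80",
}

# Hypothetical description rule: which properties appear in the short form,
# and in which order.
SHORT_RULE: List[str] = [
    "Service Type for Which Designed",
    "Tire Rim Nominal Diameter",
    "Tire Width",
    "Aspect Ratio",
    "Ply Rating",
]


def long_description(class_name: str, values: Dict[str, str]) -> str:
    """Render every property as 'name: value' after the class name."""
    pairs = " ".join(f"{name}: {value}" for name, value in values.items())
    return f"{class_name}: {pairs}"


def short_description(short_name: str, values: Dict[str, str], rule: List[str]) -> str:
    """Render only the values selected by the rule, in rule order."""
    return f"{short_name}: " + " ".join(values[name] for name in rule)


print(short_description("Tire Pneumatic", TIRE, SHORT_RULE))
print(long_description("Tire: Pneumatic, Vehicular", TIRE))
```

Because every description is derived from the same property values, two records for the same item produce the same text, which is what makes duplicates visible.
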
  • The benefits of ISO 8000
    • vague data requirements → explicit, measurable data requirements
    • human-readable requirements → computer-processable requirements
    • requirements differ from project to project → classified, common types of requirement
    • repeated cleansing of same non-conformances → data right, first & every time
    • ad hoc approaches to validation → recommended types of validation
  • Conclusions
    • systematic
      – alignment with ISO 9000 principles of quality
      – driven by explicit, robust data requirements
    • systemic
      – errors in data fields as a symptom of the real problem
      – sustainable quality from the enterprise strategy downwards
  • Useful links
    • ISO – http://www.iso.org/iso/home.html
    • ISO/TC184/SC4/WG13 – http://isotc.iso.org/livelink/livelink?func=ll&objId=8838237&objAction=browse&sort=name
    • BSI AMT/4 "Industrial data & manufacturing interfaces" – http://standardsdevelopment.bsigroup.com/Home/Committee/50001757
    • LSC Group – http://www.lsc.co.uk/