DWH-Ahsan Abdullah
1
Data Warehousing
Lecture-23
Total DQM
Virtual University of Pakistan
Ahsan Abdullah
Assoc. Prof. & Head
Center for Agro-Informatics Research
www.nu.edu.pk/cairindex.asp
National University of Computers & Emerging Sciences, Islamabad
Email: ahsan101@yahoo.com
2
Data Quality Management Process
1. Establish TDQM Environment
2. Scope Data Quality Projects & Develop Implementation Plans
3. Implement Data Quality Projects (Define, Measure, Analyze, Improve)
4. Evaluate Data Quality Management Methods
3
Data Quality Management Process
1. Establish Data Quality Management Environment
• IS project managers
• Development professionals
• Functional users of legacy information systems with domain knowledge
• IS developers know solutions but don't know how and where to modify
4
Data Quality Management Process
2. Scope Data Quality Projects & Develop Implementation Plans
• Task Summary: Project goals, scope, and potential benefits
• Task Description: Describe data quality analysis tasks
• Project Approach: Summarize tasks and tools used to provide a
baseline of existing data quality
• Schedule: Identify task start, completion dates, and project
milestones
• Resources: Include costs connected with tools acquisition, labor
hours (by labor category), training, travel, and other direct and
indirect costs
5
Data Quality Management Process
3. Implement Data Quality Projects (Define, Measure, Analyze, Improve)
• Define: Identify functional user DQ requirements and establish DQ metrics
• Measure: Measure conformance to current business rules and develop exception reports
• Analyze: Verify, validate, and assess the causes of poor DQ. Define improvement opportunities
• Improve: Select and prioritize DQ improvement opportunities, e.g. revising data entry procedures, updating data validation rules, and/or company data standards.
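The Define and Measure steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method from the lecture: the sample records, the field names (`id`, `age`, `country`), and the two rules are hypothetical assumptions standing in for real user DQ requirements.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sample records; the field names are illustrative assumptions.
records = [
    {"id": 1, "age": 34, "country": "PK"},
    {"id": 2, "age": -5, "country": "PK"},
    {"id": 3, "age": 51, "country": ""},
]

@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record conforms

# Define: hypothetical business rules standing in for user DQ requirements.
rules = [
    Rule("age_in_range", lambda r: 0 <= r["age"] <= 120),
    Rule("country_present", lambda r: bool(r["country"])),
]

def measure(records, rules):
    """Measure: return (conformance ratio per rule, exception report)."""
    # Exception report: one entry per (record, rule) violation.
    exceptions = [
        {"id": rec["id"], "rule": rule.name}
        for rec in records
        for rule in rules
        if not rule.check(rec)
    ]
    # DQ metric: fraction of records that conform to each rule.
    conformance = {
        rule.name: sum(rule.check(rec) for rec in records) / len(records)
        for rule in rules
    }
    return conformance, exceptions

conformance, exceptions = measure(records, rules)
```

The exception list is the raw material for the Analyze step: grouping it by rule shows which requirement is violated most often and therefore where an improvement is likely to pay off.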
6
Data Quality Management Process
4. Evaluate Data Quality Management Methods
• Modifying or rejuvenating existing methods of DQ management
• Determining if DQ projects have helped to achieve demonstrable goals and benefits
Evaluate and assess DQ work, as it is not a program but a new way of doing business.
7
The House of Quality Matrix
Figure: the House of Quality, comprising Customer Requirements, the Interrelationship Matrix, the Technical Correlation Matrix, and Technical Design Requirements.
8
How to improve Data Quality?
The four categories of Data Quality Improvement
• Process
• System
• Policy & Procedure
• Data Design
9
Quality Management Maturity Grid
• CMM Level-1: Uncertainty
• CMM Level-2: Awakening
• CMM Level-3: Enlightenment
• CMM Level-4: Wisdom
• CMM Level-5: Certainty
10
Misconceptions on Data Quality
• You Can Fix Data
  - Problem NOT in data, but how it was used.
  - It is NOT a one-time process.
  - Buying a cleansing tool is NOT the solution.
  - Some live with the problem; they can't afford the tool.
• Data Quality is an IT Problem
  - It is the company's problem.
  - Define the metrics of quality.
  - Business has to strike a balance between quality and ROI.
  - Joint business and IT effort.
11
Misconceptions on Data Quality
• (All) Problem is in the Data Sources or Data Entry
  - NOT the only problem.
  - Systems could be responsible, but actually it is the metrics.
  - Two divisions using different codes for the same entity.
  - Need to track, trace, and check data from creation to usage.
• The Data Warehouse will provide a single source of truth
  - In an ideal world it is indeed true.
  - In the real world there may be multiple data warehouses, data marts, and external sources, i.e. silos of data resulting in multiple sources of "truth".
  - Even with a single source of truth, differing transformations and interpretations remain an issue.
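The point about two divisions using different codes for the same entity can be illustrated with a small Python sketch. Everything here is a hypothetical assumption for illustration: the division names (`sales`, `billing`), the codes, and the canonical values are not from the lecture.

```python
# Hypothetical division-specific codes for the same entity (gender here).
# The mapping table itself is an assumption, agreed by business and IT.
CANONICAL = {
    ("sales", "M"): "MALE",
    ("sales", "F"): "FEMALE",
    ("billing", "1"): "MALE",
    ("billing", "2"): "FEMALE",
}

def harmonize(rows):
    """Map (division, code) pairs to one canonical code; collect unmapped rows."""
    clean, unmapped = [], []
    for row in rows:
        key = (row["division"], row["code"])
        if key in CANONICAL:
            # Keep the original fields so the value can be traced to its source.
            clean.append({**row, "canonical": CANONICAL[key]})
        else:
            # Unknown codes are reported rather than silently guessed.
            unmapped.append(row)
    return clean, unmapped

rows = [
    {"division": "sales", "code": "M"},
    {"division": "billing", "code": "1"},
    {"division": "billing", "code": "9"},
]
clean, unmapped = harmonize(rows)
```

Keeping the source division and original code next to the canonical value is one way to support the track-and-trace requirement, and routing unmapped codes to a separate list keeps the transformation from hiding a data problem.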
12
GIGO (Garbage In, Garbage Out)

Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 

Lecture 23

  • 1. DWH-Ahsan Abdullah. Data Warehousing, Lecture-23: Total DQM. Virtual University of Pakistan. Ahsan Abdullah, Assoc. Prof. & Head, Center for Agro-Informatics Research (www.nu.edu.pk/cairindex.asp), National University of Computers & Emerging Sciences, Islamabad. Email: ahsan101@yahoo.com
  • 2. Data Quality Management Process
      1. Establish TDQM Environment
      2. Scope Data Quality Projects & Develop Implementation Plans
      3. Implement Data Quality Projects (Define, Measure, Analyze, Improve)
      4. Evaluate Data Quality Management Methods
  • 3. Data Quality Management Process: 1. Establish Data Quality Management Environment
      • IS project managers
      • Development professionals
      • Functional users of legacy information systems with domain knowledge
      • IS developers know solutions but don’t know how and where to modify
  • 4. Data Quality Management Process: 2. Scope Data Quality Projects & Develop Implementation Plans
      • Task Summary: Project goals, scope, and potential benefits
      • Task Description: Describe data quality analysis tasks
      • Project Approach: Summarize tasks and tools used to provide a baseline of existing data quality
      • Schedule: Identify task start, completion dates, and project milestones
      • Resources: Include costs connected with tools acquisition, labor hours (by labor category), training, travel, and other direct and indirect costs
  • 5. Data Quality Management Process: 3. Implement Data Quality Projects (Define, Measure, Analyze, Improve)
      • Define: Identify functional user DQ requirements and establish DQ metrics
      • Measure: Measure conformance to current business rules and develop exception reports
      • Analyze: Verify, validate, and assess the causes of poor DQ. Define improvement opportunities
      • Improve: Select and prioritize DQ improvement opportunities, e.g. data entry procedures, updated data validation rules, and/or company data standards
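The Measure step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule names, the record fields (`age`, `gender`, `city`) and the sample records are hypothetical and do not come from the lecture. Each business rule is a predicate over a record; the function returns a conformance ratio per rule plus an exception report listing which record broke which rule.

```python
from typing import Callable

# Hypothetical business rules (assumed for illustration, not from the lecture):
# each rule name maps to a predicate that returns True when a record conforms.
RULES: dict[str, Callable[[dict], bool]] = {
    "age_in_range": lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
    "gender_code_valid": lambda r: r.get("gender") in {"M", "F"},
    "city_present": lambda r: bool(r.get("city")),
}


def measure(records: list[dict]) -> tuple[dict[str, float], list[tuple[int, str]]]:
    """Return (conformance ratio per rule, exception report).

    The exception report is a list of (record index, violated rule name).
    """
    failures = {name: 0 for name in RULES}
    exceptions: list[tuple[int, str]] = []
    for idx, rec in enumerate(records):
        for name, rule in RULES.items():
            if not rule(rec):
                failures[name] += 1
                exceptions.append((idx, name))
    total = len(records)
    # With no records there is nothing to violate, so report full conformance.
    conformance = {
        name: (1 - failures[name] / total) if total else 1.0 for name in RULES
    }
    return conformance, exceptions


# Made-up sample data: records 1 and 2 each violate at least one rule.
records = [
    {"age": 34, "gender": "M", "city": "Lahore"},
    {"age": 250, "gender": "F", "city": "Karachi"},
    {"age": 41, "gender": "X", "city": ""},
    {"age": 29, "gender": "F", "city": "Islamabad"},
]
conformance, exceptions = measure(records)
```

The conformance ratios play the role of the DQ metrics established in the Define step, and the exception list is what the Analyze step would examine to find the causes of poor quality.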
  • 6. Data Quality Management Process: 4. Evaluate Data Quality Management Methods
      • Modify or rejuvenate existing methods of DQ management
      • Determine whether DQ projects have helped to achieve demonstrable goals and benefits
      • Keep evaluating and assessing DQ work, because it is not a one-off program but a new way of doing business
  • 7. The House of Quality Matrix
      • Customer Requirements
      • Interrelationship Matrix
      • Technical Correlation Matrix
      • Technical Design Requirements
  • 8. How to improve Data Quality? The four categories of Data Quality Improvement
      • Process
      • System
      • Policy & Procedure
      • Data Design
  • 9. Quality Management Maturity Grid
      • CMM Level-1: Uncertainty
      • CMM Level-2: Awakening
      • CMM Level-3: Enlightenment
      • CMM Level-4: Wisdom
      • CMM Level-5: Certainty
  • 10. Misconceptions on Data Quality
      • "You can fix data"
          • The problem is NOT in the data, but in how it was used.
          • It is NOT a one-time process.
          • Buying a cleansing tool is NOT the solution.
          • Some live with the problem because they can't afford the tool.
      • "Data quality is an IT problem"
          • It is a company problem.
          • Define the metrics of quality.
          • The business has to strike a balance between quality and ROI.
          • It is a joint business and IT effort.
  • 11. Misconceptions on Data Quality
      • "(All) the problem is in the data sources or data entry"
          • These are NOT the only problem.
          • Systems could be responsible, but actually it is the metrics.
          • Two divisions may use different codes for the same entity.
          • Data needs to be tracked, traced, and checked from creation to usage.
      • "The data warehouse will provide a single source of truth"
          • In an ideal world this is indeed true.
          • In the real world there may be multiple data warehouses, data marts, and external sources, i.e. silos of data resulting in multiple sources of “truth”.
          • Even with a single source of truth, differing transformations and interpretations are still an issue.
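The point about two divisions using different codes for the same entity can be illustrated with a short Python sketch. Everything here is assumed for illustration: the division names, the local codes and the standard codes are hypothetical. Each division's local codes are mapped onto one standard code, and anything without a mapping is flagged rather than guessed.

```python
from typing import Optional

# Hypothetical per-division code tables (assumed, not from the lecture):
# each maps a division's local code to one shared standard code.
CODE_MAP: dict[str, dict[str, str]] = {
    "sales": {"1": "M", "2": "F"},
    "hr": {"male": "M", "female": "F"},
}


def standardize(division: str, code: object) -> Optional[str]:
    """Return the standard code, or None when the local code is unmapped."""
    key = str(code).strip().lower()
    return CODE_MAP.get(division, {}).get(key)


# Made-up rows of (division, local code) arriving from two source systems.
rows = [("sales", "1"), ("hr", "Female"), ("hr", "m"), ("sales", 2)]
standardized = [standardize(division, code) for division, code in rows]

# Unmapped codes are reported for follow-up instead of being silently dropped.
unmapped = [row for row, std in zip(rows, standardized) if std is None]
```

Returning `None` for unknown codes keeps the transformation traceable: the unmapped rows can be traced back to the division that created them, in line with the need to track data from creation to usage.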