1. DWH-Ahsan Abdullah
Data Warehousing
Lecture-23
Total DQM
Virtual University of Pakistan
Ahsan Abdullah
Assoc. Prof. & Head
Center for Agro-Informatics Research
www.nu.edu.pk/cairindex.asp
National University of Computers & Emerging Sciences, Islamabad
Email: ahsan101@yahoo.com
2. Data Quality Management Process
1. Establish TDQM Environment
2. Scope Data Quality Projects & Develop Implementation Plans
3. Implement Data Quality Projects (Define, Measure, Analyze, Improve)
4. Evaluate Data Quality Management Methods
3. Data Quality Management Process
1. Establish Data Quality Management Environment
• IS project managers
• Development professionals
• Functional users of legacy information systems with domain knowledge
• IS developers know solutions but don’t know how and where to modify
4. Data Quality Management Process
2. Scope Data Quality Projects & Develop Implementation Plans
• Task Summary: Project goals, scope, and potential benefits
• Task Description: Describe data quality analysis tasks
• Project Approach: Summarize tasks and tools used to provide a baseline of existing data quality
• Schedule: Identify task start, completion dates, and project milestones
• Resources: Include costs connected with tools acquisition, labor hours (by labor category), training, travel, and other direct and indirect costs
5. Data Quality Management Process
3. Implement Data Quality Projects (Define, Measure, Analyze, Improve)
• Define: Identify functional user DQ requirements and establish DQ metrics
• Measure: Measure conformance to current business rules and develop exception reports
• Analyze: Verify, validate, and assess the causes of poor DQ. Define improvement opportunities
• Improve: Select and prioritize DQ improvement opportunities, e.g. improving data entry procedures, updating data validation rules, and/or company data standards
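The Define and Measure steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the lecture's method: the rule names, the record fields (`age`, `gender`) and the sample records are all hypothetical. Each business rule is a named check, conformance is the fraction of records that pass the rule, and every failure becomes one row of the exception report.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record conforms


# Hypothetical business rules for a hypothetical customer table (Define step).
RULES = [
    Rule("age_in_range",
         lambda r: r.get("age") is not None and 0 <= r["age"] <= 120),
    Rule("gender_code_valid",
         lambda r: r.get("gender") in {"M", "F"}),
]


def measure(records, rules):
    """Measure step: return (conformance ratio per rule, exception report)."""
    passed = {rule.name: 0 for rule in rules}
    exceptions = []  # one (record index, rule name, record) row per failure
    for idx, rec in enumerate(records):
        for rule in rules:
            if rule.check(rec):
                passed[rule.name] += 1
            else:
                exceptions.append((idx, rule.name, rec))
    total = len(records)
    conformance = {name: (count / total if total else 1.0)
                   for name, count in passed.items()}
    return conformance, exceptions


# Made-up sample data, for illustration only.
records = [
    {"id": 1, "age": 34, "gender": "M"},
    {"id": 2, "age": 150, "gender": "F"},
    {"id": 3, "age": None, "gender": "X"},
    {"id": 4, "age": 28, "gender": "F"},
]

conformance, exceptions = measure(records, RULES)
print(conformance)
for row in exceptions:
    print(row)
```

The exception report is the input to the Analyze step: grouping its rows by rule name shows which rules fail most often and therefore where to look for causes.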
6. Data Quality Management Process
4. Evaluate Data Quality Management Methods
• Modifying or rejuvenating existing methods of DQ management
• Determining if DQ projects have helped to achieve demonstrable goals and benefits
Keep evaluating and assessing DQ work, as it is not a program but a new way of doing business.
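One way to check whether a DQ project achieved demonstrable goals is to compare each metric before and after the project against a stated target. The sketch below assumes this approach; the metric names, baseline values, current values and targets are invented for illustration and do not come from the lecture.

```python
def evaluate(baseline, current, targets):
    """Compare post-project DQ metrics with the baseline and the targets."""
    report = {}
    for metric, target in targets.items():
        before = baseline[metric]
        after = current[metric]
        report[metric] = {
            "improvement": round(after - before, 4),
            "goal_met": after >= target,
        }
    return report


# Hypothetical figures: fraction of records passing each kind of check.
baseline = {"completeness": 0.82, "validity": 0.90}
current = {"completeness": 0.95, "validity": 0.91}
targets = {"completeness": 0.93, "validity": 0.97}

report = evaluate(baseline, current, targets)
for metric, result in report.items():
    print(metric, result)
```

A metric that improved but missed its target, like `validity` in this made-up data, is a candidate for modifying the existing DQ management method rather than repeating it unchanged.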
7. The House of Quality Matrix
• Customer Requirements
• Interrelationship Matrix
• Technical Correlation Matrix
• Technical Design Requirements
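The slide names only the parts of the matrix. As a rough illustration of how the interrelationship matrix connects customer requirements to technical design requirements, the sketch below uses the scoring commonly seen in quality function deployment: each customer requirement has an importance weight, each cell holds a relationship strength (9 strong, 3 medium, 1 weak, 0 none), and a technical requirement's priority is the weighted column sum. The requirement names, weights and strengths are assumptions for illustration, and the technical correlation matrix (the "roof") is not modelled.

```python
# Hypothetical customer requirements with importance weights (1 to 5).
customer_requirements = {"accurate": 5, "timely": 3, "complete": 4}

# Hypothetical technical design requirements (matrix columns).
technical_requirements = ["validation rules", "load frequency", "mandatory fields"]

# Interrelationship matrix: one row per customer requirement,
# one strength (9, 3, 1 or 0) per technical requirement.
relationship = {
    "accurate": [9, 0, 3],
    "timely":   [1, 9, 0],
    "complete": [3, 0, 9],
}


def technical_priorities(weights, techs, rel):
    """Weighted column sums: sum of (importance * strength) per column."""
    scores = [0] * len(techs)
    for requirement, weight in weights.items():
        for j, strength in enumerate(rel[requirement]):
            scores[j] += weight * strength
    return dict(zip(techs, scores))


priorities = technical_priorities(
    customer_requirements, technical_requirements, relationship
)
for tech, score in sorted(priorities.items(), key=lambda kv: -kv[1]):
    print(f"{tech}: {score}")
```

With these made-up numbers, validation rules score highest, which suggests spending improvement effort there first.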
8. How to improve Data Quality?
The four categories of Data Quality Improvement
• Process
• System
• Policy & Procedure
• Data Design
10. Misconceptions on Data Quality
You Can Fix Data
• The problem is NOT in the data, but in how it was used.
• It is NOT a one-time process.
• Buying a cleansing tool is NOT the solution.
• Some live with the problem because they can't afford the tool.
Data Quality is an IT Problem
• It is a company-wide problem.
• Define the metrics of quality.
• Business has to strike a balance between quality and ROI.
• It requires a joint business and IT effort.
11. Misconceptions on Data Quality
(All) Problem is in the Data Sources or Data Entry
• These are NOT the only problem.
• Systems could be responsible, but often the actual problem is the metrics.
• Two divisions may use different codes for the same entity.
• Need to track, trace, and check data from creation to usage.
The Data Warehouse will provide a single source of truth
• In an ideal world this is indeed true.
• In the real world there may be multiple data warehouses, data marts, and external sources, i.e. silos of data, resulting in multiple sources of “truth”.
• Even with a single source of truth, differing transformations and interpretations are still an issue.
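The case of two divisions using different codes for the same entity can be made concrete with a small sketch. The division names (`sales`, `hr`), their code sets and the canonical values below are hypothetical; the point is only that one shared mapping, applied identically wherever the data is transformed, keeps interpretations consistent.

```python
# Hypothetical mapping from (division, local code) to one canonical value.
CANONICAL = {
    ("sales", "M"): "MALE",
    ("sales", "F"): "FEMALE",
    ("hr", "1"): "MALE",
    ("hr", "0"): "FEMALE",
}


def standardize(division, code):
    """Translate a division-specific code; flag anything unmapped."""
    key = (division, str(code).strip().upper())
    return CANONICAL.get(key, "UNKNOWN")


# Made-up source rows: (division, code as it was entered).
rows = [("sales", "m"), ("hr", 1), ("hr", "0"), ("sales", "?")]
standardized = [standardize(division, code) for division, code in rows]
print(standardized)
```

Returning `UNKNOWN` instead of guessing keeps unmapped codes visible, so they can be traced back to where the data was created.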