Copyright Global Data Strategy, Ltd. 2020
Data Quality Best Practices
Donna Burbank and Nigel Turner
Global Data Strategy, Ltd.
August 27th, 2020
Follow on Twitter @donnaburbank, @nigelturner8
@GlobalDataStrat
Twitter Event hashtag: #DAStrategies
Donna Burbank
Donna is a recognised industry expert in
information management with over 20 years
of experience in data strategy, information
management, data modeling, metadata
management, and enterprise architecture.
Her background is multi-faceted across
consulting, product development, product
management, brand strategy, marketing,
and business leadership.
She is currently the Managing Director at
Global Data Strategy, Ltd., an international
information management consulting
company that specializes in the alignment of
business drivers with data-centric
technology. In past roles, she has served in
key brand strategy and product
management roles at CA Technologies and
Embarcadero Technologies for several of the
leading data management products in the
market.
As an active contributor to the data
management community, she is a long time
DAMA International member, Past President
and Advisor to the DAMA Rocky Mountain
chapter, and was awarded the Excellence in
Data Management Award from DAMA
International.
Donna is also an analyst at the Boulder BI
Brain Trust (BBBT) where she provides advice
and gains insight on the latest BI and
Analytics software in the market. She was on
several review committees for the Object
Management Group for key information
management and process modeling
notations.
She has worked with dozens of Fortune 500
companies worldwide in the Americas,
Europe, Asia, and Africa and speaks regularly
at industry conferences. She has co-
authored two books: Data Modeling for the
Business and Data Modeling Made Simple
with ERwin Data Modeler and is a regular
contributor to industry publications. She can
be reached at
donna.burbank@globaldatastrategy.com
Donna is based in Boulder, Colorado, USA.
Nigel Turner
Nigel Turner has worked in Information
Management (IM) and related areas for
over 25 years. This experience has
embraced Data Governance,
Information Strategy, Data Quality,
Master Data Management,
& Business Intelligence.
He spent much of his career in British
Telecommunications Group (BT) where
he led a series of enterprise wide IM &
data governance initiatives.
After leaving BT in 2010 Nigel became
VP of Information Management Strategy
at Harte Hanks Trillium Software, a
leading global provider of Data Quality
& Data Governance tools and
consultancy. Here he engaged with over
150 customer organizations from all
parts of the globe.
Currently Principal Consultant for EMEA
at Global Data Strategy, Ltd, he was previously
a principal consultant at firms such as
FromHereOn and IPL, where he led
Data Governance engagements with
customers such as First Great Western.
Nigel is a well known thought leader in
Information Management and has
presented at many international
conferences. Until recently he also
worked part time at Cardiff University,
where he set up a Student Software
Enterprise company. In addition he has
also been a part time Associate Lecturer
at the UK Open University where he
taught Systems & Management.
Nigel is very active in professional Data
Management organizations and is an
elected Data Management Association
(DAMA) UK Committee member. He
was the joint winner of DAMA
International’s 2015 Community Award
for the work he initiated and led in
setting up a mentoring scheme in the
UK where experienced DAMA
professionals coach and support newer
data management professionals.
Nigel is based in Cardiff, Wales, UK.
DATAVERSITY Data Architecture Strategies
• January 23 Emerging Trends in Data Architecture – What’s the Next Big Thing?
• February 27 Building a Data Strategy - Practical Steps for Aligning with Business Goals
• March 26 Cloud-Based Data Warehousing – What's New and What Stays the Same
• April 23 Master Data Management – Aligning Data, Process, and Governance
• May 28 Data Governance and Data Architecture – Alignment and Synergies
• June 25 Enterprise Architecture vs. Data Architecture
• July 22 Best Practices in Metadata Management
• August 27 Data Quality Best Practices – with Nigel Turner
• September 24 Data Virtualization – Separating Myth from Reality
• October 22 Data Architect vs. Data Engineer vs. Data Modeler
• December 1 Graph Databases: Practical Use Cases
This Year’s Lineup
What We’ll Cover Today
• Tackling data quality problems requires more than a series of tactical, one-off improvement
projects.
• By their nature, many data quality problems extend across and often beyond an
organization.
• Addressing these issues requires a holistic architectural approach combining people,
process and technology.
• This webinar provides practical ways to control data quality issues in your organization.
A Successful Data Strategy links Business Goals with Technology Solutions
“Top-Down” alignment with
business priorities
“Bottom-Up” management &
inventory of data sources
Managing the people, process,
policies & culture around data
Coordinating & integrating
disparate data sources
Leveraging & managing data for
strategic advantage
Data Quality is Part of a Wider Data Strategy
www.globaldatastrategy.com
Data Quality: Some Common Misconceptions
Data Quality is a standalone discipline
NOT TRUE – Data Quality is closely interdependent
with other disciplines, e.g. Data Governance, MDM,
Data Architecture, BI, Analytics, etc.
Data Quality is an IT problem & so IT tools can fix it
NOT TRUE – Data Quality is multi-faceted, caused by
process, people and IT issues, so solutions must be holistic
and business-driven
Data Quality improvement is a choice
NOT TRUE – all organizations continually do data
quality improvement; it’s not about IF you do it
but HOW you do it
Data Quality improvement is a project
NOT TRUE – it may start with a project, but it has
no end; it must evolve into a Business As Usual
(BaU) continuous process of improvement
Data Quality – A Simple Definition
Data that is demonstrably fit
for purpose.
Demonstrably: Implies that improvement
can be measured and business impact
demonstrated
Fit for Purpose: Data quality must meet
the needs of the business
Recent Data Quality Horror Stories
January 2020:
UK insurance company
sent a marketing email to
all its contact base.
Every email started ‘Dear
Michael’…
N Turner
111 Happy Close
Cardiff, UK
Since May 2012:
UK pharmacy convinced Nigel
is female (despite frequent
feedback to the contrary).
He still gets many cosmetics
offers…
November 2019:
UK Retail bank undertook
disastrous customer data
migration in 2018 and did no DQ
analysis before doing so. Total
cost to fix problems and
compensate customers: $480
million
April 2020:
UK governments sent
shielding letters to
vulnerable people. 975k
sent; 600k people missed
and / or 17% of letters
sent to wrong addresses
Poor Data Quality: Overall Impact
On Companies & Organizations
ECONOMIC IMPACT:
Hits Revenues, Costs, Profits
REPUTATION:
Impacts Brand & Customer Loyalty
LAW & REGULATION:
Increases risk & exposure
On Individuals
ANNOYANCE:
Creates anger & frustration
PERSONAL HARM:
Physical, mental or emotional
DESIRE FOR RETRIBUTION:
Social media gives individuals voice
and influence
Data Quality – a Holistic Approach
Improving Data Quality requires a combination of People, Process, and Technology.
People
Process
Technology
• Data Governance & Stewardship
• Business Rules
• Business Process Alignment
• Data Management Best Practices
• Data Management Tools
• Data Architecture Best Practices
Tackling Data Quality: the A2E approach
CYCLE OF CONTINUOUS DATA QUALITY IMPROVEMENT: Assess, Baseline, Converge, Develop, Evaluate
Step: Purpose
• Assess Business Usage: Understand what data exists and how it is used within the organization
• Baseline Data Sources: Baseline the current quality of the data and assess how well it is meeting business needs
• Converge on Business Critical Areas: Focus priorities to optimise early business benefits and set ‘fit for purpose’ quality targets to guide improvement activities
• Develop Improvements: Design & deploy improvement initiatives (encompassing people, process, and technology) and measure the impact against targets
• Evaluate Benefits & ROI: Regularly measure the data and continue to improve it so that it continues to meet current and future business needs
A2E Step 1: Assess
ASSESS THE BUSINESS LANDSCAPE
• Understand the business and its primary goals & objectives
• Analyze what data the business:
  • Relies on today
  • Will need to support its future aspirations
• Identify the primary data stakeholders:
  • Business
  • IT
  • External parties (e.g. customers, suppliers, partners)
• Work with them to evaluate current data ‘fitness for purpose’ and establish:
  • Where / how it is captured, stored and processed
  • What’s working well
  • What needs to be improved
  • The potential benefits of better data quality
• Create a Data Quality Issues (& Opportunities) Log
• Highlight:
  • Most important business critical data domains
  • Business impact
  • Main data creators and consumers
  • Accountability for the data
  • Current problems and issues with the data
  • Opportunities & potential benefits
POTENTIAL OUTPUTS & TOOLS
• Outputs may include:
  • RACI Stakeholder Matrix
  • Rich Picture highlighting real-world issues
  • Data Quality Issues Log
  • Business Data Model
  • Business Process Model
  • ROI analysis
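A Data Quality Issues (& Opportunities) Log can start life as a simple structured list. The sketch below is one possible shape, not a prescribed template: the fields mirror the "Highlight" items above, while the class name `IssueLogEntry`, the helpers `issues_for_domain` and `unowned_issues`, and all sample entries are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IssueLogEntry:
    """One row in a Data Quality Issues (& Opportunities) Log (hypothetical layout)."""
    data_domain: str        # business critical data domain, e.g. "Customer"
    description: str        # current problem or issue with the data
    business_impact: str    # how the issue affects the business
    creators: List[str] = field(default_factory=list)    # main data creators
    consumers: List[str] = field(default_factory=list)   # main data consumers
    accountable: str = "Unassigned"                       # who is accountable for the data
    opportunity: str = ""   # potential benefit of fixing the issue


def issues_for_domain(log: List[IssueLogEntry], domain: str) -> List[IssueLogEntry]:
    """Return all entries recorded against one data domain."""
    return [entry for entry in log if entry.data_domain == domain]


def unowned_issues(log: List[IssueLogEntry]) -> List[IssueLogEntry]:
    """Return entries where nobody is accountable for the data yet."""
    return [entry for entry in log if entry.accountable == "Unassigned"]


# Sample entries invented for illustration only.
log = [
    IssueLogEntry(
        data_domain="Customer",
        description="Duplicate customer records in CRM",
        business_impact="Repeated mailings and no single customer view",
        creators=["Front desk"],
        consumers=["Marketing"],
        opportunity="Lower mailing costs",
    ),
    IssueLogEntry(
        data_domain="Customer",
        description="Incorrect salutations",
        business_impact="Brand reputation harmed",
        accountable="Head of Marketing",
    ),
    IssueLogEntry(
        data_domain="Finance",
        description="Invoice data needs rework before reporting",
        business_impact="Delayed and untrusted financial results",
    ),
]

for entry in unowned_issues(log):
    print(f"{entry.data_domain}: {entry.description} (no accountable owner)")
```

Keeping accountability as an explicit field makes gaps in ownership easy to list, which feeds directly into the RACI Stakeholder Matrix.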
Data Quality Complexity & Value of Rich Pictures
• Data Quality is a ‘messy’ and complex issue:
• Problems often poorly understood (e.g. data flows and lineage)
• Lack of information & hard facts (e.g. measures)
• Large numbers of people involved with differing perspectives (e.g. data producers, data
consumers, senior executives, customers, suppliers)
• Problem ownership unclear (e.g. problem origins and impacts)
• Rich Pictures have great value:
• Ideal starting point for complex (messy) organizational problems like data quality
• Holistic, embracing people, process & technology
• Highlight interconnectedness of problems
• Best initially created in a workshop (whiteboard and coloured pens ideal!) -
encourage participants to contribute
• Primary use is to derive ‘problem themes’ to enable focus on key issues
RICH PICTURE OF DATA QUALITY PROBLEMS AT ACME HOTEL & CASINO GROUP
[Figure: rich picture, shown first on its own and then again with PROBLEM THEMES overlaid. Recoverable labels are listed below.]
• Organization: 60 HOTELS; 6 CASINOS; 10 NIGHTCLUBS; ACME TRAVEL MAGAZINE; LOYALTY SCHEME; 1 HOTEL = 1 DATABASE; OUTMODED IT
• People and groups: CEO (NEW); CFO; CMO; CIO; COO; BUSINESS; SHAREHOLDERS; BUSINESS & IT MEETINGS
• Pressures: GROWTH; STOCK PRICE; COST REDUCTION
• Speech bubbles: “Our details again?!” (three times); “Data? It’s not my problem”; “Help! I can’t improve data on my own”; “Poor data? Blame the CIO”; “We know our data stinks”; “Stubs? Who cares? We don’t.”
• Issues shown: POOR DATA QUALITY; DUPLICATE CUSTOMERS; Untrusted financial results; Finance data rework & delay
• Quantified issues:
  • Valet Parking: stubs not submitted, loss $2.5M pa
  • 41% of all supplies are Emergency Supplies; cost $21.7m
  • 406 email addresses for Mickey Mouse in CRM
  • $315k lost on returned magazines
• PROBLEM THEMES:
  • Supply management problems
  • Lack of business accountability for data
  • Cultural issues about data capture
  • Uncontrolled customer data duplication
  • No single customer view
  • Financial data trust and rework
  • Potential need for IT investment
  • Poor marketing data quality
A2E Step 2: Baseline
BASELINE CRITICAL DATA SOURCES
• Gives a quantitative view of key data quality problems
• Measure the baseline quality of key data sources to quantify the issues
• To do this:
  • Select the key data sources and data domains identified in the Step 1 Assessment
  • Profile the data (ideally use a data profiling tool) and focus on key objects and attributes
  • Assess the data according to the 7 Dimensions of Data Quality – see next slide
  • Present the results to relevant stakeholders - gain consensus on the business impact of the problems found
  • Expand and refine the Data Quality issues log
POTENTIAL OUTPUTS & TOOLS
• Data Quality Report(s)
• Data Profiling outputs – derived metadata
• Updated Issues Log, with quantification of financial costs and other business impacts
[Figure: Example partial Data Profiling report]
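To make "derived metadata" concrete, here is a minimal sketch of the kind of per-column statistics a data profiling step produces. It is not a substitute for a data profiling tool: the function names `profile_column` and `to_pattern`, the choice of statistics, and the sample date-of-birth values are all assumptions for illustration.

```python
import re
from collections import Counter


def to_pattern(value):
    """Map digits to 9 and ASCII letters to A, so '27/08/2020' becomes '99/99/9999'."""
    pattern = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", pattern)


def profile_column(values):
    """Derive simple metadata for one column of string values (None or '' = missing)."""
    non_null = [v for v in values if v not in (None, "")]
    patterns = Counter(to_pattern(v) for v in non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "min_length": min((len(v) for v in non_null), default=0),
        "max_length": max((len(v) for v in non_null), default=0),
        "top_patterns": patterns.most_common(3),
    }


# Hypothetical date-of-birth column, invented for illustration.
dob_values = ["27/08/2020", "1985-02-14", "", None, "03/11/1990", "27/08/2020"]
profile = profile_column(dob_values)
for name, value in profile.items():
    print(f"{name}: {value}")
```

The pattern counts are often the most useful output when baselining: a minority pattern such as `9999-99-99` in a column that is mostly `99/99/9999` points to a format issue worth adding to the Issues Log.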
Baselining & Setting KPIs: the 7 Dimensions of Data Quality
THE SEVEN DIMENSIONS OF DATA QUALITY
• Completeness: Is all the required data present? (e.g. date of birth in a DoB field)
• Accuracy: Does the data reflect the real world? (e.g. current customer address)
• Uniqueness: In a data source, is the entry unique or are there unintended duplicate records? (e.g. same client organization spelled several different ways in multiple CRM records)
• Validity: Does the data conform to a specified or expected format and / or business rule? (e.g. date of birth as DD/MM/YYYY; age between 18 and 120 years)
• Consistency: Where data is held in different sources, are the sources consistent? (e.g. current customer address)
• Accessibility: Do the users who need to use the data have access to it? (e.g. Finance team and invoice data held in data warehouse)
• Timeliness: Is the data available to users when they need it and is it sufficiently timely to meet their needs? (e.g. invoices sent in last 24 hours available on the data warehouse by 9am the next day)
Key: the slide marks each dimension as either a CONTENT DIMENSION or a CONTEXT DIMENSION.
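Three of the dimensions (completeness, validity, uniqueness) lend themselves to simple rule-based scoring. The sketch below uses the slide's own examples (DoB as DD/MM/YYYY, age between 18 and 120, one client organization spelled several ways). The function names, the crude name-matching key, and the sample records are assumptions for illustration; real matching rules would normally be agreed with the business.

```python
import re
from datetime import date, datetime


def completeness(records, field_name):
    """Share of records where the field is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field_name) not in (None, ""))
    return filled / len(records)


def is_valid_dob(text, today):
    """Validity rule: strict DD/MM/YYYY format and an implied age of 18 to 120."""
    if not isinstance(text, str) or not re.fullmatch(r"\d{2}/\d{2}/\d{4}", text):
        return False
    try:
        dob = datetime.strptime(text, "%d/%m/%Y").date()
    except ValueError:  # e.g. 31/02/1990 is well-formed but not a real date
        return False
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    return 18 <= age <= 120


def validity(records, field_name, today):
    """Share of populated values that pass the date-of-birth rule."""
    values = [r[field_name] for r in records if r.get(field_name) not in (None, "")]
    if not values:
        return 0.0
    return sum(is_valid_dob(v, today) for v in values) / len(values)


def normalise_name(name):
    """Crude matching key (an assumption): lower-case, letters and digits only."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def uniqueness(records, field_name):
    """Share of records left after collapsing records with the same matching key."""
    if not records:
        return 0.0
    keys = {normalise_name(r[field_name]) for r in records}
    return len(keys) / len(records)


# Hypothetical CRM extract, invented for illustration.
records = [
    {"client": "Acme Ltd", "dob": "14/02/1985"},
    {"client": "ACME Ltd.", "dob": "1985-02-14"},
    {"client": "Acme, Ltd", "dob": ""},
    {"client": "Globex", "dob": "01/01/2015"},
]
today = date(2020, 8, 27)

print(f"Completeness (dob): {completeness(records, 'dob'):.0%}")
print(f"Validity (dob):     {validity(records, 'dob', today):.0%}")
print(f"Uniqueness (client): {uniqueness(records, 'client'):.0%}")
```

Accuracy, consistency, accessibility and timeliness usually need an external reference (the real world, a second source, or usage data), so they are deliberately left out of this sketch.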
Measuring Data Improvements
Align Data Quality Metrics to Business Improvements
• KPIs & Measures aligned with concrete business drivers
• Helps prioritize efforts
• Assists with the “Why do I Care?” issue
• Basis for showing benefits and results
Business Driver: Improving Customer Data for Marketing Launch Campaign
Table columns: KPI | Current | Target | Status | Business Benefits | Type
KPI: Number of duplicate customer records
  Current: 2,000,000 | Target: 1,000
  Business Benefits: Correct # of customers for sales estimations; Better single view of customer for integrated social media campaign; Reduce cost of physical mailing by $20K
  Type: Cost Savings; Brand Reputation; Marketing Innovation
KPI: Incorrect Salutation (Mr, Ms, etc.)
  Current: 5,000 | Target: 1,000
  Business Benefits: Customer satisfaction & Brand reputation harmed by incorrect salutation; Targeted marketing campaigns by gender
  Type: Brand Reputation; Campaign Effectiveness
KPI: Incorrect address/location
  Current: 10,000 | Target: 500
  Business Benefits: Lower return rate on physical mailings; Better targeted marketing by region
  Type: Cost Savings; Campaign Effectiveness
KPI: Missing Sales Rep Assigned
  Current: 500 | Target: 100
  Business Benefits: Ability for Sales to execute on customer leads
  Type: Revenue growth; Sales Effectiveness
Etc.
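The KPI table above can be tracked with very little code. This sketch assumes every KPI is a defect count where lower is better, as in the table. The baseline and target figures come from the slide; the `latest` measurements, the class name `DataQualityKpi`, and the three status labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataQualityKpi:
    name: str
    baseline: int  # "Current" column at the time the KPI was set
    target: int    # "Target" column
    latest: int    # most recent measurement (hypothetical values below)

    def progress(self) -> float:
        """Fraction of the baseline-to-target gap closed so far, clamped to 0..1."""
        gap = self.baseline - self.target
        if gap <= 0:
            return 1.0
        closed = self.baseline - self.latest
        return max(0.0, min(1.0, closed / gap))

    def status(self) -> str:
        """Simple three-way status label (an assumption, not from the slide)."""
        if self.latest <= self.target:
            return "on target"
        return "in progress" if self.latest < self.baseline else "not started"


kpis = [
    DataQualityKpi("Number of duplicate customer records", 2_000_000, 1_000, 1_000_500),
    DataQualityKpi("Incorrect Salutation", 5_000, 1_000, 1_000),
    DataQualityKpi("Missing Sales Rep Assigned", 500, 100, 500),
]

for kpi in kpis:
    print(f"{kpi.name}: {kpi.progress():.0%} of gap closed, {kpi.status()}")
```

Measuring progress as a share of the gap, rather than as a raw count, lets KPIs of very different scale sit side by side on one dashboard.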
The Importance of KPIs
• Most businesses set strategic goals they desire to achieve, and measure these goals against Key
Performance Indicators (KPIs).
• These KPIs provide a concrete, objective way to measure progress towards these goals
• To again use Finance as a comparison, they have a number of KPIs they use
to manage financial assets.
• Revenue Projections
• Budget Goals & Limits
• Expense Ratio, etc.
• We need to do the same with data assets.
• % complete
• % accuracy
• Timeliness
• Etc.
“You Can’t Manage What You Can’t Measure”
A2E Step 3: Converge
PRIORITIZE & FOCUS ON SPECIFIC ISSUES & OPPORTUNITIES
• Determine initial data quality improvement projects; focus in on two things:
  • Potential pilot / proof of concept data quality improvement project(s)
  • Data quality improvement projects with the largest net benefits
  • Note: these are often NOT the same thing; in the early stages of a DQ initiative it’s important to establish credibility and prove the potential benefits of wider adoption via a PoC
• Work with stakeholders to identify priorities from the Data Quality Issues log
• Prioritize projects (e.g. Priority Grid)
• Run pilots / proofs of concept
• Identify and run initial DQ improvement projects
POTENTIAL OUTPUTS & TOOLS
• Prioritised Data Quality Issues Log
• Priority Grid
• Agreed pilot project(s)
• Agreed potential DQ projects
• Business cases
KEY MESSAGE:
Focus & Purpose: the Pareto Principle
80% of business benefit can often be delivered through improving the quality of 20% of the data – concentrate on the 20% that really matters (good candidates are often shared master data, reference data etc.)
Setting Priorities: Priority Grid
• Priorities based on Benefits vs. Level of Difficulty can often be easily determined via a workshop activity using a Priority Grid.
Grid axes: BENEFITS vs. LEVEL OF DIFFICULTY
• PRIORITY 1: High Benefits – Low Difficulty
• PRIORITY 2: High Benefits – High Difficulty
• PRIORITY 3: Low Benefits – Low Difficulty
• PRIORITY 4: Low Benefits – High Difficulty
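The Priority Grid translates directly into a small lookup. In this sketch the quadrant-to-priority mapping follows the grid; the 1 to 5 workshop scores, the threshold of 3 that separates "low" from "high", the function names, and the candidate projects are assumptions for illustration.

```python
def grid_priority(high_benefit: bool, high_difficulty: bool) -> int:
    """Map a Priority Grid quadrant to its priority number (1 = do first)."""
    if high_benefit:
        return 2 if high_difficulty else 1
    return 4 if high_difficulty else 3


def prioritise(candidates, benefit_threshold=3, difficulty_threshold=3):
    """Rank (name, benefit, difficulty) tuples scored 1-5 in a workshop."""
    ranked = []
    for name, benefit, difficulty in candidates:
        priority = grid_priority(
            benefit >= benefit_threshold,
            difficulty >= difficulty_threshold,
        )
        ranked.append((priority, name))
    return sorted(ranked)


# Hypothetical candidate projects and scores, invented for illustration.
candidates = [
    ("Deduplicate customer records", 5, 4),
    ("Fix salutations", 4, 1),
    ("Archive legacy codes", 1, 2),
    ("Re-key historic invoices", 2, 5),
]

for priority, name in prioritise(candidates):
    print(f"Priority {priority}: {name}")
```

A Priority 1 item (high benefit, low difficulty) is a natural pilot candidate, while the largest-benefit project may well sit in Priority 2, which is consistent with the point that the pilot and the biggest win are often not the same project.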
A2E Step 4: Develop
DESIGN & IMPLEMENT IMPROVEMENTS
• Create data quality improvement team to include:
  • Business stakeholders (Data producers, consumers and others, e.g. process owners)
  • IT stakeholders – SMEs, DBAs etc.
  • Other specialists as required (e.g. Data Protection Officer if Personal Data involved)
  • Note: It is important to align with Data Governance Initiatives & Roles (e.g. Data Owners, Data Stewards)
• Re-analyze current problems
• Perform root cause analysis
• Design and implement improvements
• Design and implement changes
• Set data quality KPIs
• Measure improvements against KPIs
• Revisit the business case to log benefits
• Identify future improvements
• Produce case study
POTENTIAL OUTPUTS & TOOLS
• Root Cause Analysis diagrams
• Updated business cases & case study
• Data Quality KPIs and thresholds based on the 7 Data Quality Dimensions
• Data Improvement Plans
Overall Problem Themes, Impact & Interconnections: Root Cause Analysis
[Figure: cause-and-effect diagram, shown twice; the second version labels ROOT CAUSE and END RESULT. Key: an arrow from CAUSE to EFFECT means “causes or contributes to”.]
Problem themes shown in the diagram:
• Poor Data Quality
• Data Resource / Skill Shortages
• Process Inefficiencies
• High Rework & Failure Costs
• Multiple Versions of Truth
• Regulatory Risks
• Ineffective Data Integration
• No Formal Accountability for Data
• Siloed Data Problem Fixes
• No Data Strategy or Architecture
• Bad Customer / Member Experience
• Poorly Integrated IT Platforms & Tools
• Lack of prioritisation of data improvement efforts
• Poor Customer Segmentation
• Ineffective Marketing Campaigns
• Lack of Investment in Data Skills
• Revenue Loss
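A cause-and-effect diagram of this kind can be treated as a directed graph in which each arrow means "causes or contributes to". The sketch below shows one way to find root causes (nothing feeds into them) and end results (they feed into nothing). The node names are taken from the diagram, but the arrows did not survive in this text, so every edge in `EDGES` is a hypothetical example and not the presenters' analysis.

```python
def find_roots_and_results(edges):
    """Split a list of (cause, effect) pairs into root causes and end results."""
    causes = {cause for cause, _ in edges}
    effects = {effect for _, effect in edges}
    roots = sorted(causes - effects)        # never appear as an effect
    end_results = sorted(effects - causes)  # never appear as a cause
    return roots, end_results


def upstream_causes(edges, node):
    """All nodes that directly or indirectly contribute to `node`."""
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)
    seen, stack = set(), [node]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical edges for illustration only; the real arrows are not in the text.
EDGES = [
    ("No Data Strategy or Architecture", "No Formal Accountability for Data"),
    ("No Formal Accountability for Data", "Poor Data Quality"),
    ("Lack of Investment in Data Skills", "Data Resource / Skill Shortages"),
    ("Data Resource / Skill Shortages", "Poor Data Quality"),
    ("Poor Data Quality", "Poor Customer Segmentation"),
    ("Poor Customer Segmentation", "Ineffective Marketing Campaigns"),
    ("Ineffective Marketing Campaigns", "Revenue Loss"),
]

roots, end_results = find_roots_and_results(EDGES)
print("Root causes:", roots)
print("End results:", end_results)
print("Upstream of Poor Data Quality:", sorted(upstream_causes(EDGES, "Poor Data Quality")))
```

Improvement effort aimed at the root causes tends to pay off further down the chain, whereas fixing an end result alone leaves its causes in place.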
Data Improvement Plan
A Data Improvement Plan (DIP) is a formal plan to
specify and manage improvements to a
specified data domain and / or data problem
area
The benefits of a Data Improvement Plan are
that it:
• Sets out goals and expectations for data
improvement
• Acts as a focal point for all data improvement
activities
• Prioritizes improvement activities
• Can be used to track improvements and
communicate successes
• Can evolve to align with the changing needs of
the business
Data domain DIPs can be rolled up to form the core
of a company-wide Data Quality Improvement
Program
A2E Step 5: Evaluate
EVALUATE & SUSTAIN GAINS
• Embed Data Quality improvement as a business as usual activity
• Evolve Data Quality improvement teams into wider Data Governance structure:
  • Track Data Quality improvements via Data Quality Dashboards
  • Monitor financial and business benefits over time
• Evangelising benefits – part of your job is marketing!
POTENTIAL OUTPUTS & TOOLS
• Evolving & incremental Data Improvement Plans
• Regular Data Quality Dashboard updates and analysis
• Business Process Change
• Continued ROI and financial benefits
• Communication Plan and Organizational Change Efforts
Summary
• Data quality is complex because businesses and organizations are complex
• Addressing data quality issues requires a holistic approach combining people, process, and
technology change
• Data governance is needed to sustain data quality improvement – it orchestrates the people,
processes and organizational structures required to improve data quality
• Build quantifiable Data Improvement Plans to show demonstrable ROI and implement a culture
of continuous data quality improvement
• It’s vital to deliver frequent incremental improvements to maintain business interest and backing
• Data quality is a multi-dimensional issue for organizations so tackle it by using multi-dimensional
approaches
About Global Data Strategy™, Ltd
• Global Data Strategy™ is an international information management consulting company that
specializes in the alignment of business drivers with data-centric technology.
• Our passion is data, and helping organizations enrich their business opportunities through data and
information.
• Our core values center around providing solutions that are:
• Business-Driven: We put the needs of your business first, before we look at any technology solution.
• Clear & Relevant: We provide clear explanations using real-world examples.
• Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s
size, corporate culture, and geography.
• High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of
technical expertise in the industry.
Data-Driven Business Transformation
Business Strategy
Aligned With
Data Strategy
Visit www.globaldatastrategy.com for more information
Check Out Nigel’s Latest Blog
To read more on the topic, check out Nigel’s latest blog at:
https://globaldatastrategy.com/global-data-strategy-blogs/data-quality-multidimensional/
Join us next month
• September 24 Data Virtualization – Separating Myth from Reality
Questions?
• Thoughts? Ideas?
www.globaldatastrategy.com

DAS Slides: Data Quality Best Practices

  • 1.
    Copyright Global DataStrategy, Ltd. 2020 Data Quality Best Practices Donna Burbank and Nigel Turner Global Data Strategy, Ltd. August 27th, 2020 Follow on Twitter @donnaburbank, @nigelturner8 @GlobalDataStrat Twitter Event hashtag: #DAStrategies
  • 2.
    Global Data Strategy,Ltd. 2020 Donna Burbank 2 Donna is a recognised industry expert in information management with over 20 years of experience in data strategy, information management, data modeling, metadata management, and enterprise architecture. Her background is multi-faceted across consulting, product development, product management, brand strategy, marketing, and business leadership. She is currently the Managing Director at Global Data Strategy, Ltd., an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. In past roles, she has served in key brand strategy and product management roles at CA Technologies and Embarcadero Technologies for several of the leading data management products in the market. As an active contributor to the data management community, she is a long time DAMA International member, Past President and Advisor to the DAMA Rocky Mountain chapter, and was awarded the Excellence in Data Management Award from DAMA International. Donna is also an analyst at the Boulder BI Train Trust (BBBT) where she provides advice and gains insight on the latest BI and Analytics software in the market. She was on several review committees for the Object Management Group’s for key information management and process modeling notations. She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa and speaks regularly at industry conferences. She has co- authored two books: Data Modeling for the Business and Data Modeling Made Simple with ERwin Data Modeler and is a regular contributor to industry publications. She can be reached at donna.burbank@globaldatastrategy.com Donna is based in Boulder, Colorado, USA. Follow on Twitter @donnaburbank @GlobalDataStrat Twitter Event hashtag: #DAStrategies
  • 3.
    Global Data Strategy,Ltd. 2020 Nigel Turner Nigel Turner has worked in Information Management (IM) and related areas for over 25 years. This experience has embraced Data Governance, Information Strategy, Data Quality, Data Governance, Master Data Management, & Business Intelligence. He spent much of his career in British Telecommunications Group (BT) where he led a series of enterprise wide IM & data governance initiatives. After leaving BT in 2010 Nigel became VP of Information Management Strategy at Harte Hanks Trillium Software, a leading global provider of Data Quality & Data Governance tools and consultancy. Here he engaged with over 150 customer organizations from all parts of the globe. Currently Principal Consultant for EMEA at Global Data Strategy, Ltd, he has been a principal consultant at such firms as FromHereOn and IPL, where he has led Data Governance engagement with customers such as First Great Western. Nigel is a well known thought leader in Information Management and has presented at many international conferences. Until recently he also worked part time at Cardiff University, where he set up a Student Software Enterprise company. In addition he has also been a part time Associate Lecturer at the UK Open University where he taught Systems & Management. Nigel is very active in professional Data Management organizations and is an elected Data Management Association (DAMA) UK Committee member. He was the joint winner of DAMA International’s 2015 Community Award for the work he initiated and led in setting up a mentoring scheme in the UK where experienced DAMA professionals coach and support newer data management professionals. Nigel is based in Cardiff, Wales, UK. Follow on Twitter @NigelTurner8 Today’s hashtag: # DAStrategies
  • 4.
    Global Data Strategy,Ltd. 2020 DATAVERSITY Data Architecture Strategies • January 23 Emerging Trends in Data Architecture – What’s the Next Big Thing? • February 27 Building a Data Strategy - Practical Steps for Aligning with Business Goals • March 26 Cloud-Based Data Warehousing – What's New and What Stays the Same • April 23 Master Data Management – Aligning Data, Process, and Governance • May 28 Data Governance and Data Architecture – Alignment and Synergies • June 25 Enterprise Architecture vs. Data Architecture • July 22 Best Practices in Metadata Management • August 27 Data Quality Best Practices – with Nigel Turner • September 24 Data Virtualization – Separating Myth from Reality • October 22 Data Architect vs. Data Engineer vs. Data Modeler • December 1 Graph Databases: Practical Use Cases 4 This Year’s Lineup
  • 5.
    Global Data Strategy,Ltd. 2020 What We’ll Cover Today 5 • Tackling data quality problems requires more than a series of tactical, one off improvement projects. • By their nature, many data quality problems extend across and often beyond an organization. • Addressing these issues requires a holistic architectural approach combining people, process and technology. • This webinar provides practical ways to control data quality issues in your organization.
  • 6.
    Global Data Strategy,Ltd. 2020 6 A Successful Data Strategy links Business Goals with Technology Solutions “Top-Down” alignment with business priorities “Bottom-Up” management & inventory of data sources Managing the people, process, policies & culture around data Coordinating & integrating disparate data sources Leveraging & managing data for strategic advantage Data Quality is Part of a Wider Data Strategy www.globaldatastrategy.com
  • 7.
    Global Data Strategy,Ltd. 2020 Data Quality: Some Common Misconceptions 7 Data Quality is a stand alone discipline NOT TRUE – Data Quality is closely interdependent with other disciplines, e.g. Data Governance, MDM, Data Architecture, BI, Analytics, etc. Data Quality is an IT problem & so IT tools can fix it NOT TRUE – Data Quality is multi-faceted, caused by process, people and IT issues, so solutions must be holistic and business-driven Data Quality improvement is a choice NOT TRUE – all organizations continually do data quality improvement; it’s not about IF you do it but HOW you do it Data Quality improvement is a project NOT TRUE – it may start with a project, but it has no end; it must evolve into a Business As Usual (BaU) continuous process of improvement
  • 8.
    Global Data Strategy,Ltd. 2020 Data Quality – A Simple Definition 8 Data that is demonstrably fit for purpose. Demonstrably: Implies that improvement can be measured and business impact demonstrated Fit for Purpose: Data quality must meet the needs of the business
  • 9.
    Global Data Strategy,Ltd. 2020 Recent Data Quality Horror Stories 9 January 2020: UK insurance company sent a marketing email to all its contact base. Every email started ‘Dear Michael’… N Turner 111 Happy Close Cardiff,UK Since May 2012: UK pharmacy convinced Nigel is female (despite frequent feedback to the contrary). He still gets many cosmetics offers… November 2019: UK Retail bank undertook disastrous customer data migration in 2018 and did no DQ analysis before doing so. Total cost to fix problems and compensate customers: $480 million April 2020: UK governments sent shielding letters to vulnerable people. 975k sent; 600k people missed and / or 17% of letters sent to wrong addresses
  • 10.
    Global Data Strategy,Ltd. 2020 10 ANNOYANCE: Creates anger & frustration On Companies & Organizations Poor Data Quality: Overall Impact On Individuals ECONOMIC IMPACT: Hits Revenues, Costs, Profits REPUTATION: Impacts Brand & Customer Loyalty LAW & REGULATION: Increases risk & exposure PERSONAL HARM: Physical, mental or emotional DESIRE FOR RETRIBUTION: Social media gives individuals voice and influence
  • 11.
    Data Quality – a Holistic Approach
    Improving Data Quality requires a combination of People, Process, and Technology.
    • Data Governance & Stewardship
    • Business Rules
    • Business Process Alignment
    • Data Management Best Practices
    • Data Management Tools
    • Data Architecture Best Practices
  • 12.
    Tackling Data Quality: the A2E approach
    CYCLE OF CONTINUOUS DATA QUALITY IMPROVEMENT: Assess, Baseline, Converge, Develop, Evaluate
    Step: Purpose
    • Assess Business Usage: Understand what data exists and how it is used within the organization
    • Baseline Data Sources: Baseline the current quality of the data and assess how well it is meeting business needs
    • Converge on Business Critical Areas: Focus priorities to optimise early business benefits and set ‘fit for purpose’ quality targets to guide improvement activities
    • Develop Improvements: Design & deploy improvement initiatives (encompassing people, process, and technology) and measure the impact against targets
    • Evaluate Benefits & ROI: Regularly measure the data and continue to improve it so that it continues to meet current and future business needs
  • 13.
    A2E Step 1: Assess
    ASSESS THE BUSINESS LANDSCAPE
    • Understand the business and its primary goals & objectives
    • Analyze what data the business:
        • Relies on today
        • Will need to support its future aspirations
    • Identify the primary data stakeholders:
        • Business
        • IT
        • External parties (e.g. customers, suppliers, partners)
    • Work with them to evaluate current data ‘fitness for purpose’ and establish:
        • Where / how it is captured, stored and processed
        • What’s working well
        • What needs to be improved
        • The potential benefits of better data quality
    • Create a Data Quality Issues (& Opportunities) Log
    • Highlight:
        • Most important business critical data domains
        • Business impact
        • Main data creators and consumers
        • Accountability for the data
        • Current problems and issues with the data
        • Opportunities & potential benefits
    POTENTIAL OUTPUTS & TOOLS
    • Outputs may include:
        • RACI Stakeholder Matrix
        • Rich Picture highlighting real-world issues
        • Data Quality Issues Log
        • Business Data Model
        • Business Process Model
        • ROI analysis
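One way to keep a Data Quality Issues (& Opportunities) Log consistent is to give every entry the same fields as the "Highlight" list in Step 1. The sketch below is illustrative only: the class name `IssueLogEntry`, its field names, the helper `unowned`, and the creator and consumer values are assumptions made for this example, not part of the A2E approach. The sample issue and its $2.5M figure come from the valet parking annotation in the ACME rich picture.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IssueLogEntry:
    """One row of a Data Quality Issues (& Opportunities) Log (hypothetical layout)."""
    data_domain: str                  # business critical data domain
    problem: str                      # current problem or issue with the data
    business_impact: str              # impact described in business terms
    creators: List[str] = field(default_factory=list)   # main data creators
    consumers: List[str] = field(default_factory=list)  # main data consumers
    accountable: str = "unassigned"   # who is accountable for the data
    opportunity: str = ""             # potential benefit of fixing the issue
    annual_cost: Optional[float] = None  # quantified later, during Baseline


def unowned(log: List[IssueLogEntry]) -> List[IssueLogEntry]:
    """Return the entries that still have nobody accountable for the data."""
    return [entry for entry in log if entry.accountable == "unassigned"]


log = [
    IssueLogEntry(
        data_domain="Valet parking",
        problem="Parking stubs not submitted",
        business_impact="Lost parking revenue",
        creators=["Valet staff"],   # assumed for illustration
        consumers=["Finance"],      # assumed for illustration
        annual_cost=2_500_000,      # figure from the rich picture annotation
    ),
]

for entry in unowned(log):
    print(f"No owner yet: {entry.data_domain} ({entry.problem})")
```

Defaulting `accountable` to "unassigned" makes missing ownership visible in the log itself, which fits the "lack of business accountability for data" theme that workshops often surface.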
  • 14.
    Data Quality Complexity & Value of Rich Pictures
    • Data Quality is a ‘messy’ and complex issue:
        • Problems often poorly understood (e.g. data flows and lineage)
        • Lack of information & hard facts (e.g. measures)
        • Large numbers of people involved with differing perspectives (e.g. data producers, data consumers, senior executives, customers, suppliers)
        • Problem ownership unclear (e.g. problem origins and impacts)
    • Rich Pictures have great value:
        • Ideal starting point for complex (messy) organizational problems like data quality
        • Holistic, embracing people, process & technology
        • Highlight interconnectedness of problems
        • Best initially created in a workshop (whiteboard and coloured pens ideal!); encourage participants to contribute
        • Primary use is to derive ‘problem themes’ to enable focus on key issues
  • 15.
    RICH PICTURE OF DATA QUALITY PROBLEMS AT ACME HOTEL & CASINO GROUP
    [Figure: rich picture. Recoverable labels and annotations:]
    • Labels: CEO (NEW), CFO, CMO, CIO, COO, SHAREHOLDERS, BUSINESS, BUSINESS & IT MEETINGS, ACME TRAVEL MAGAZINE, LOYALTY SCHEME, GROWTH, STOCK PRICE, COST REDUCTION, OUTMODED IT, POOR DATA QUALITY, DUPLICATE CUSTOMERS
    • Estate: 60 HOTELS, 6 CASINOS, 10 NIGHTCLUBS; 1 HOTEL = 1 DATABASE
    • Speech bubbles: “Our details again?!” (three times); “Data? It’s not my problem”; “Help! I can’t improve data on my own”; “Poor data? Blame the CIO”; “We know our data stinks”; “Stubs? Who cares? We don’t.”
    • Finance: untrusted financial results; finance data rework & delay
    • Valet Parking: stubs not submitted, loss $2.5M pa
    • 41% of all supplies are Emergency Supplies; cost $21.7m
    • 406 email addresses for Mickey Mouse in CRM
    • $315k lost on returned magazines
  • 16.
    RICH PICTURE OF DATA QUALITY PROBLEMS AT ACME HOTEL & CASINO GROUP
    [Figure: the rich picture from the previous slide, overlaid with the problem themes derived from it]
    PROBLEM THEMES
    • Supply management problems
    • Lack of business accountability for data
    • Cultural issues about data capture
    • Uncontrolled customer data duplication
    • No single customer view
    • Financial data trust and rework
    • Potential need for IT investment
    • Poor marketing data quality
  • 17.
    A2E Step 2: Baseline
    BASELINE CRITICAL DATA SOURCES
    • Gives a quantitative view of key data quality problems
    • Measure the baseline quality of key data sources to quantify the issues
    • To do this:
        • Select the key data sources and data domains identified in the Step 1 Assessment
        • Profile the data (ideally use a data profiling tool) and focus on key objects and attributes
        • Assess the data according to the 7 Dimensions of Data Quality (see next slide)
        • Present the results to relevant stakeholders and gain consensus on the business impact of the problems found
        • Expand and refine the Data Quality Issues Log
    POTENTIAL OUTPUTS & TOOLS
    • Data Quality Report(s)
    • Data Profiling outputs: derived metadata
    • Updated Issues Log, with quantification of financial costs and other business impacts
    [Figure: example partial Data Profiling report]
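To make "profile the data" concrete, here is a minimal sketch of the kind of derived metadata a profiling pass produces for a single attribute. It is not a substitute for a data profiling tool; the function name `profile_column` and the sample salutation values are invented for illustration.

```python
from collections import Counter


def profile_column(values):
    """Return basic profiling metadata for one column of raw string values."""
    # Treat None and blank strings as missing.
    present = [v for v in values if v is not None and v.strip() != ""]
    lengths = [len(v) for v in present]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
        "most_common": Counter(present).most_common(3),
    }


# Invented sample data, for illustration only.
salutations = ["Mr", "Ms", "Mr", "", None, "mr", "Dr", "Mr"]
report = profile_column(salutations)
for name, value in report.items():
    print(f"{name}: {value}")
```

Even this small profile surfaces talking points for stakeholders: two of eight values are missing, and "Mr" and "mr" count as different values, which hints at inconsistent data capture.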
  • 18.
    Baselining & Setting KPIs: the 7 Dimensions of Data Quality
    THE SEVEN DIMENSIONS OF DATA QUALITY
    Key: CONTENT DIMENSIONS / CONTEXT DIMENSIONS
    • Completeness: Is all the required data present? (e.g. date of birth in a DoB field)
    • Accuracy: Does the data reflect the real world? (e.g. current customer address)
    • Uniqueness: In a data source, is the entry unique or are there unintended duplicate records? (e.g. same client organization spelled several different ways in multiple CRM records)
    • Validity: Does the data conform to a specified or expected format and / or business rule? (e.g. date of birth as DD/MM/YYYY; age between 18 and 120 years)
    • Consistency: Where data is held in different sources, are the sources consistent? (e.g. current customer address)
    • Accessibility: Do the users who need to use the data have access to it? (e.g. Finance team and invoice data held in data warehouse)
    • Timeliness: Is the data available to users when they need it and is it sufficiently timely to meet their needs? (e.g. invoices sent in last 24 hours available on the data warehouse by 9am the next day)
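Some of these dimensions can be turned directly into measurable checks. The sketch below scores completeness, uniqueness, and validity, using the slide's own validity example (date of birth as DD/MM/YYYY, age between 18 and 120). The function names, the normalisation used for uniqueness, and the sample records are assumptions for illustration. Accuracy, consistency, accessibility, and timeliness usually need a reference source or usage context and are not covered here.

```python
from datetime import date, datetime


def completeness(values):
    """Share of values that are present (not None or blank)."""
    present = [v for v in values if v not in (None, "")]
    return len(present) / len(values)


def uniqueness(values):
    """Share of present values that are distinct after trimming and lowercasing."""
    present = [v.strip().lower() for v in values if v not in (None, "")]
    return len(set(present)) / len(present)


def is_valid_dob(text, today):
    """Validity rule: parses as day/month/year and implies an age of 18 to 120."""
    try:
        dob = datetime.strptime(text, "%d/%m/%Y").date()
    except (TypeError, ValueError):
        return False
    # Subtract one year if this year's birthday has not happened yet.
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    return 18 <= age <= 120


# Invented sample records, for illustration only.
today = date(2020, 8, 27)
dobs = ["27/08/1980", "1980-08-27", "", "31/02/1990", "01/01/2015"]
names = ["Acme Ltd", "ACME LTD", "Acme Limited", "Bolt plc"]

valid_share = sum(is_valid_dob(d, today) for d in dobs) / len(dobs)
print(f"DoB completeness: {completeness(dobs):.0%}")
print(f"DoB validity:     {valid_share:.0%}")
print(f"Name uniqueness:  {uniqueness(names):.0%}")
```

Note that the simple normalisation treats "Acme Ltd" and "ACME LTD" as duplicates but not "Acme Limited"; real matching rules need agreement with the business on what counts as the same entity.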
  • 19.
    Measuring Data Improvements
    Align Data Quality Metrics to Business Improvements
    • KPIs & Measures aligned with concrete business drivers
    • Helps prioritize efforts
    • Assists with the “Why do I Care?” issue
    • Basis for showing benefits and results
    Business Driver: Improving Customer Data for Marketing Launch Campaign
    • KPI: Number of duplicate customer records
        Current: 2,000,000; Target: 1,000
        Business Benefits: Correct # of customers for sales estimations; better single view of customer for integrated social media campaign; reduce cost of physical mailing by $20K
        Type: Cost Savings; Brand Reputation; Marketing Innovation
    • KPI: Incorrect Salutation (Mr, Ms, etc.)
        Current: 5,000; Target: 1,000
        Business Benefits: Customer satisfaction & brand reputation harmed by incorrect salutation; targeted marketing campaigns by gender
        Type: Brand Reputation; Campaign Effectiveness
    • KPI: Incorrect address/location
        Current: 10,000; Target: 500
        Business Benefits: Lower return rate on physical mailings; better targeted marketing by region
        Type: Cost Savings; Campaign Effectiveness
    • KPI: Missing Sales Rep Assigned
        Current: 500; Target: 100
        Business Benefits: Ability for Sales to execute on customer leads; revenue growth
        Type: Sales Effectiveness
    • Etc.
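The Current and Target columns imply how much each defect count has to fall. The sketch below works that arithmetic through for the four KPIs in the table. The `Kpi` class and its method names are hypothetical, and the `status` helper assumes a later measurement that the slide does not provide.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    name: str
    current: int  # baseline count of defective records
    target: int   # agreed 'fit for purpose' threshold

    def reduction_needed(self) -> float:
        """Percentage of current defects that must be removed to hit the target."""
        return 100 * (self.current - self.target) / self.current

    def status(self, measured: int) -> str:
        """Classify a later measurement against the target (assumed rule)."""
        return "on target" if measured <= self.target else "above target"


# Current and Target values as given in the KPI table.
kpis = [
    Kpi("Duplicate customer records", 2_000_000, 1_000),
    Kpi("Incorrect salutation", 5_000, 1_000),
    Kpi("Incorrect address/location", 10_000, 500),
    Kpi("Missing sales rep assigned", 500, 100),
]

for kpi in kpis:
    print(f"{kpi.name}: reduce by {kpi.reduction_needed():.2f}%")
```

The required reductions come out at 99.95%, 80%, 95%, and 80% respectively, which shows how much more demanding the duplicate-record target is than the others.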
  • 20.
    The Importance of KPIs
    “You Can’t Manage What You Can’t Measure”
    • Most businesses set strategic goals they want to achieve, and measure progress towards these goals using Key Performance Indicators (KPIs).
    • These KPIs provide a concrete, objective way to measure progress towards these goals.
    • To again use Finance as a comparison, Finance teams have a number of KPIs they use to manage financial assets:
        • Revenue Projections
        • Budget Goals & Limits
        • Expense Ratio, etc.
    • We need to do the same with data assets:
        • % complete
        • % accuracy
        • Timeliness
        • Etc.
  • 21.
    A2E Step 3: Converge
    PRIORITIZE & FOCUS ON SPECIFIC ISSUES & OPPORTUNITIES
    • Determine initial data quality improvement projects; focus in on two things:
        • Potential pilot / proof of concept data quality improvement project(s)
        • Data quality improvement projects with the largest net benefits
        • Note: these are often NOT the same thing; in the early stages of a DQ initiative it’s important to establish credibility and prove the potential benefits of wider adoption via a PoC
    • Work with stakeholders to identify priorities from the Data Quality Issues Log
    • Prioritize projects (e.g. Priority Grid)
    • Run pilots / proofs of concept
    • Identify and run initial DQ improvement projects
    POTENTIAL OUTPUTS & TOOLS
    • Prioritised Data Quality Issues Log
    • Priority Grid
    • Agreed pilot project(s)
    • Agreed potential DQ projects
    • Business cases
    KEY MESSAGE: Focus & Purpose: the Pareto Principle
    80% of business benefit can often be delivered through improving the quality of 20% of the data. Concentrate on the 20% that really matters (good candidates are often shared master data, reference data etc.)
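The Pareto idea can be applied to a quantified issues log by ranking issues by cost and keeping the smallest set that covers a chosen share of the total. The sketch below uses the three dollar figures annotated on the ACME rich picture. Treating each figure as a comparable annual cost is an assumption made for illustration, as are the function name `pareto_cut` and the 80% default.

```python
def pareto_cut(costs, share=0.8):
    """Return the costliest issues that together reach `share` of total cost."""
    total = sum(costs.values())
    selected, running = [], 0
    # Walk the issues from most to least costly, accumulating as we go.
    for name, cost in sorted(costs.items(), key=lambda item: item[1], reverse=True):
        selected.append(name)
        running += cost
        if running >= share * total:
            break
    return selected


# Figures from the rich picture; comparability is assumed.
costs = {
    "Valet parking stubs not submitted": 2_500_000,
    "Emergency supplies": 21_700_000,
    "Returned magazines": 315_000,
}

print(pareto_cut(costs))
```

With these figures a single issue, emergency supplies, accounts for more than 80% of the quantified cost, so it would be a natural candidate for the "largest net benefits" list, though not necessarily for the first pilot.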
  • 22.
    Setting Priorities: Priority Grid
    • Priorities based on Benefits vs. Level of Difficulty can often be easily determined via a workshop activity using a Priority Grid.
    Grid axes: BENEFITS vs. LEVEL OF DIFFICULTY
    • PRIORITY 1: High Benefits, Low Difficulty
    • PRIORITY 2: High Benefits, High Difficulty
    • PRIORITY 3: Low Benefits, Low Difficulty
    • PRIORITY 4: Low Benefits, High Difficulty
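The grid is simple enough to express as a lookup, which can be handy for sorting an issues log after the workshop. In the sketch below the four quadrant-to-priority mappings come from the slide; the function name `priority`, the "high"/"low" rating labels, and the candidate projects with their ratings are invented for illustration.

```python
def priority(benefit: str, difficulty: str) -> int:
    """Map a (benefit, difficulty) rating to the Priority Grid number."""
    grid = {
        ("high", "low"): 1,
        ("high", "high"): 2,
        ("low", "low"): 3,
        ("low", "high"): 4,
    }
    return grid[(benefit.lower(), difficulty.lower())]


# Hypothetical candidates with workshop ratings (benefit, difficulty).
candidates = {
    "Customer deduplication": ("high", "high"),
    "Salutation clean-up": ("high", "low"),
    "Archive old parking stubs": ("low", "low"),
}

ranked = sorted(candidates, key=lambda name: priority(*candidates[name]))
for name in ranked:
    print(priority(*candidates[name]), name)
```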
  • 23.
    A2E Step 4: Develop
    DESIGN & IMPLEMENT IMPROVEMENTS
    • Create data quality improvement team to include:
        • Business stakeholders (data producers, consumers and others, e.g. process owners)
        • IT stakeholders: SMEs, DBAs etc.
        • Other specialists as required (e.g. Data Protection Officer if Personal Data involved)
        • Note: It is important to align with Data Governance Initiatives & Roles (e.g. Data Owners, Data Stewards)
    • Re-analyze current problems
    • Perform root cause analysis
    • Design and implement improvements
    • Design and implement changes
    • Set data quality KPIs
    • Measure improvements against KPIs
    • Revisit the business case to log benefits
    • Identify future improvements
    • Produce case study
    POTENTIAL OUTPUTS & TOOLS
    • Root Cause Analysis diagrams
    • Updated business cases & case study
    • Data Quality KPIs and thresholds based on the 7 Data Quality Dimensions
    • Data Improvement Plans
  • 24.
    Overall Problem Themes, Impact & Interconnections: Root Cause Analysis
    [Figure: cause / effect diagram. Key: an arrow means “causes or contributes to”. Nodes:]
    • Poor Data Quality
    • Data Resource / Skill Shortages
    • Process Inefficiencies
    • High Rework & Failure Costs
    • Multiple Versions of Truth
    • Regulatory Risks
    • Ineffective Data Integration
    • No Formal Accountability for Data
    • Siloed Data Problem Fixes
    • No Data Strategy or Architecture
    • Bad Customer / Member Experience
    • Poorly Integrated IT Platforms & Tools
    • Lack of prioritisation of data improvement efforts
    • Poor Customer Segmentation
    • Ineffective Marketing Campaigns
    • Lack of Investment in Data Skills
    • Revenue Loss
  • 25.
    Overall Problem Themes, Impact & Interconnections: Root Cause Analysis
    [Figure: the cause / effect diagram from the previous slide, with nodes additionally marked as ROOT CAUSE or END RESULT. Key: an arrow means “causes or contributes to”.]
  • 26.
    Data Improvement Plan
    A Data Improvement Plan (DIP) is a formal plan to specify and manage improvements to a specified data domain and / or data problem area.
    The benefits of a Data Improvement Plan are that it:
    • Sets out goals and expectations for data improvement
    • Acts as a focal point for all data improvement activities
    • Prioritizes improvement activities
    • Can be used to track improvements and communicate successes
    • Can evolve to align with the changing needs of the business
    Data domain DIPs can be rolled up to form the core of a company-wide Data Quality Improvement Program.
  • 27.
    A2E Step 5: Evaluate
    EVALUATE & SUSTAIN GAINS
    • Embed Data Quality improvement as a business as usual activity
    • Evolve Data Quality improvement teams into wider Data Governance structure:
        • Track Data Quality improvements via Data Quality Dashboards
        • Monitor financial and business benefits over time
        • Evangelise benefits: part of your job is marketing!
    POTENTIAL OUTPUTS & TOOLS
    • Evolving & incremental Data Improvement Plans
    • Regular Data Quality Dashboard updates and analysis
    • Business Process Change
    • Continued ROI and financial benefits
    • Communication Plan and Organizational Change Efforts
  • 28.
    Summary
    • Data quality is complex because businesses and organizations are complex
    • Addressing data quality issues requires a holistic approach combining people, process, and technology change
    • Data governance is needed to sustain data quality improvement: it orchestrates the people, processes and organizational structures required to improve data quality
    • Build quantifiable Data Improvement Plans to show demonstrable ROI and implement a culture of continuous data quality improvement
    • It’s vital to deliver frequent incremental improvements to maintain business interest and backing
    • Data quality is a multi-dimensional issue for organizations, so tackle it by using multi-dimensional approaches
  • 29.
    About Global Data Strategy™, Ltd
    Data-Driven Business Transformation: Business Strategy Aligned With Data Strategy
    • Global Data Strategy™ is an international information management consulting company that specializes in the alignment of business drivers with data-centric technology.
    • Our passion is data, and helping organizations enrich their business opportunities through data and information.
    • Our core values center around providing solutions that are:
        • Business-Driven: We put the needs of your business first, before we look at any technology solution.
        • Clear & Relevant: We provide clear explanations using real-world examples.
        • Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s size, corporate culture, and geography.
        • High Quality & Technically Precise: We pride ourselves on excellence of execution, with years of technical expertise in the industry.
    Visit www.globaldatastrategy.com for more information
  • 30.
    Check Out Nigel’s Latest Blog
    To read more on the topic, check out Nigel’s latest blog at: https://globaldatastrategy.com/global-data-strategy-blogs/data-quality-multidimensional/
  • 31.
    DATAVERSITY Data Architecture Strategies
    • January 23: Emerging Trends in Data Architecture – What’s the Next Big Thing?
    • February 27: Building a Data Strategy - Practical Steps for Aligning with Business Goals
    • March 26: Cloud-Based Data Warehousing – What's New and What Stays the Same
    • April 23: Master Data Management – Aligning Data, Process, and Governance
    • May 28: Data Governance and Data Architecture – Alignment and Synergies
    • June 25: Enterprise Architecture vs. Data Architecture
    • July 22: Best Practices in Metadata Management
    • August 27: Data Quality Best Practices – with Nigel Turner
    • September 24: Data Virtualization – Separating Myth from Reality
    • October 22: Data Architect vs. Data Engineer vs. Data Modeler
    • December 1: Graph Databases: Practical Use Cases
    Join us next month
  • 32.
    Questions?
    • Thoughts? Ideas?
    www.globaldatastrategy.com