1. A HITCHHIKER'S GUIDE TO
DATA QUALITY
Tatiana Stebakova
The Data & Information Assembly Australia April 2015
2. Evolution of DQ Governance approach over the past 10
years
How to make a quantum leap from DQ theory to
execution, personal view
You’ve done it all by the book, but there is little traction
in Data quality
DQ and systems thinking
Don’t panic!
Content
3. Evolution of DQ Governance approach
over the past 10 years
Data Duplicates – still magic words
Data Quality Frameworks - from emergence to maturity
Senior Management Support - a breakthrough
Senior Architects Support – little change
Data Quality Governance - from novelty to mainstream
Data Quality Tools and Technology – from luxury to BAU
Metadata - from “what is it?” to “new black”
4. How to make a quantum leap from DQ theory
to execution, personal view
5. Step 1. Data Quality Justification
DQ Horror stories
About 6.5 million Americans are 112 or
older. The US Social Security office has 6.5
million people on record as having reached
the age of 112, even though only 42 people
are known to be that old globally
"Studies in cost analysis show that
between 15% to > 20% of a company’s operating
revenue is spent doing things to get around or fix
data quality issues"
Larry English
Option 1 – What can we gain?
Option 2 – Scare technique
6. Option 3 (my favourite) – Risks
"Poor data is like a dirty windscreen. You can continue driving as your
vision degrades, but at some point you must stop and clear the
windscreen or risk everything"
Ken Orr
7. Step 2. Build DQ requirements into solution
architecture and system’s development contract
Example of DQ requirements
ETL solution SHALL have capability to perform Column integrity screening/profiling
ETL solution SHALL have capability to perform Data Structure screening/profiling
ETL solution SHALL have capability to perform Compliance to Business rule screening/profiling
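As a rough illustration of what the three screening capabilities above might involve, here is a minimal Python sketch. The record layout, the field names (`customer_id`, `date_of_birth`, `status`), the expected schema and the two business rules are all hypothetical and are not taken from any particular ETL product.

```python
from datetime import date

# Hypothetical expected structure: field name -> required Python type.
EXPECTED_SCHEMA = {"customer_id": str, "date_of_birth": date, "status": str}

# Hypothetical business rule: status must come from an agreed code list.
ALLOWED_STATUS = {"ACTIVE", "CLOSED"}


def screen_structure(record):
    """Data structure screening: report missing or unexpected fields."""
    issues = []
    for field in EXPECTED_SCHEMA:
        if field not in record:
            issues.append(f"missing field: {field}")
    for field in record:
        if field not in EXPECTED_SCHEMA:
            issues.append(f"unexpected field: {field}")
    return issues


def screen_columns(record):
    """Column integrity screening: report nulls and wrong types."""
    issues = []
    for field, expected_type in EXPECTED_SCHEMA.items():
        if field not in record:
            continue  # already reported by the structure screen
        value = record[field]
        if value is None:
            issues.append(f"null value: {field}")
        elif not isinstance(value, expected_type):
            issues.append(f"wrong type: {field}")
    return issues


def screen_business_rules(record, today):
    """Business rule screening: apply the two assumed rules."""
    issues = []
    if record.get("status") not in ALLOWED_STATUS:
        issues.append("status not in allowed code list")
    dob = record.get("date_of_birth")
    if isinstance(dob, date) and dob > today:
        issues.append("date of birth is in the future")
    return issues


def profile(records, today):
    """Run all three screens and count issues by description."""
    counts = {}
    for record in records:
        issues = (
            screen_structure(record)
            + screen_columns(record)
            + screen_business_rules(record, today)
        )
        for issue in issues:
            counts[issue] = counts.get(issue, 0) + 1
    return counts


# Made-up sample data: one clean record and two with defects.
sample = [
    {"customer_id": "C1", "date_of_birth": date(1980, 5, 1), "status": "ACTIVE"},
    {"customer_id": None, "date_of_birth": date(2090, 1, 1), "status": "active"},
    {"customer_id": "C3", "status": "CLOSED", "fax": "n/a"},
]
report = profile(sample, today=date(2015, 4, 1))
```

The point of the sketch is that each SHALL statement maps to a separate, testable screen, so a vendor's claim of "profiling capability" can be checked screen by screen.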
Quality should be built into the product, and testing alone
cannot be relied on to ensure product quality (FDA, Current
Good Manufacturing Practice)
The … ETL controls solution SHALL perform a periodic full snapshot
of the same data for reconciliation purposes, if Delta files are used.
The … ETL solution SHALL have capability to perform Data
Structure screening/profiling
The … data extract process SHALL support logical data
consistency (temporal relationship of data).
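Two of the requirements above lend themselves to a small sketch: reconciling a delta-built target against a periodic full snapshot, and checking one simple temporal relationship (an end date must not precede its start date). The delta format, the row layout and the `start`/`end` field names are assumptions made for illustration only.

```python
from datetime import date


def apply_deltas(target, deltas):
    """Apply a list of (operation, key, row) deltas to a copy of target."""
    result = dict(target)
    for operation, key, row in deltas:
        if operation == "delete":
            result.pop(key, None)
        else:  # "insert" and "update" both overwrite the row for the key
            result[key] = row
    return result


def reconcile(target, snapshot):
    """Compare the delta-built target with a full snapshot of the source."""
    return {
        "missing_in_target": sorted(snapshot.keys() - target.keys()),
        "extra_in_target": sorted(target.keys() - snapshot.keys()),
        "mismatched": sorted(
            key for key in snapshot.keys() & target.keys()
            if snapshot[key] != target[key]
        ),
    }


def temporal_violations(rows):
    """Return keys of rows whose end date precedes their start date."""
    return sorted(
        key for key, row in rows.items()
        if row["end"] is not None and row["end"] < row["start"]
    )


def row(start, end):
    """Build a hypothetical row with a validity period."""
    return {"start": start, "end": end}


# Made-up scenario: the delete of B and the insert of D never arrived as deltas.
initial = {
    "A": row(date(2014, 1, 1), None),
    "B": row(date(2014, 2, 1), None),
}
deltas = [
    ("update", "A", row(date(2014, 1, 1), date(2014, 12, 31))),
    ("insert", "C", row(date(2015, 3, 1), date(2015, 1, 15))),
]
target = apply_deltas(initial, deltas)

snapshot = {
    "A": row(date(2014, 1, 1), date(2014, 12, 31)),
    "C": row(date(2015, 3, 1), date(2015, 1, 15)),
    "D": row(date(2015, 2, 1), None),
}

result = reconcile(target, snapshot)
violations = temporal_violations(target)
```

In this scenario the reconciliation reports D as missing and B as extra in the target, which is exactly the kind of drift that delta-only loading cannot detect by itself. The temporal check flags C, whose end date is earlier than its start date.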
8. Step 3. Build data quality requirements into
system’s operation contract + DQ KPIs
“I’ve never been a good
spectator.
Either I’m playing the
game or I’m not
interested.”
Christiaan Barnard, the first surgeon
to perform a heart transplant
… solution SHALL have a capability to measure and report on the data quality Key Performance Indicators
(KPIs) as defined by the Governance authority.
KPI Examples:
• customer record uniqueness
• directory currency and accessibility
• information provenance
• uptake rate - coverage
• quality of records per DQ dimensions and characteristics
• response time for typical transactions
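To make two of these KPIs concrete, here is a minimal Python sketch of how uniqueness and currency might be measured. The formulas, the matching key (`name` plus `email`), the `last_verified` field and the 365-day threshold are all assumptions; in practice the Governance authority would define them.

```python
from datetime import date


def uniqueness(records, key_fields):
    """Distinct normalised keys divided by total records (1.0 = no duplicates)."""
    if not records:
        return 1.0
    keys = {
        tuple(str(record[field]).strip().lower() for field in key_fields)
        for record in records
    }
    return len(keys) / len(records)


def currency(records, as_of, max_age_days):
    """Share of records verified within max_age_days of the as_of date."""
    if not records:
        return 1.0
    fresh = sum(
        1 for record in records
        if (as_of - record["last_verified"]).days <= max_age_days
    )
    return fresh / len(records)


# Made-up customer records; the first two differ only in case and spacing.
customers = [
    {"name": "Ann Lee", "email": "ann@example.com", "last_verified": date(2015, 3, 1)},
    {"name": " ann lee", "email": "ANN@example.com", "last_verified": date(2014, 1, 1)},
    {"name": "Bob Ray", "email": "bob@example.com", "last_verified": date(2015, 1, 15)},
    {"name": "Cy Moe", "email": "cy@example.com", "last_verified": date(2013, 6, 30)},
]

kpis = {
    "customer_record_uniqueness": uniqueness(customers, ["name", "email"]),
    "directory_currency": currency(customers, date(2015, 4, 1), max_age_days=365),
}
```

With this sample data, uniqueness comes out at 0.75 (three distinct customers in four records) and currency at 0.5 (two of four records verified within the last year). Writing the KPI as a formula in the operation contract leaves less room for argument later about how it is measured.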
9. You’ve done it all by the book, but there
is little traction in Data quality.
Don’t be afraid
From Hitchhiker to Hijacker
Become a driver. Apply for the architect’s, project lead or data
management jobs
Drop your “data quality bugs/requirements” anywhere you can
Look for opportunities. Change your strategy all the time
Disguise your requirements; do not call them DQ requirements
Lean on standards
Do not reference DQ gurus. Reference Technology gurus instead
Befriend architects
Be patient, keep cool
“Success is not final,
failure is not fatal: it is
the courage to continue
that counts.”
Winston Churchill
10. Complex adaptive systems (CAS) are dynamic systems able to
adapt to a changing environment, where all participants are closely
linked with each other, making up an “IT ecosystem” (MIT)
Within such an ecosystem, change becomes not so much adaptation
as co-evolution with all other related systems
Rules of flocking:
Follow the leader
Align with neighbours
Avoid overcrowding
Data Quality and systems thinking
11. Systems thinking – delayed response
Launch date - 2 March 2004
Mission duration 10 years, 11
months and 23 days
6.5 billion kilometres
“After 10 years, and a journey of more than six
billion kilometres, the Rosetta spacecraft sent
its fridge-sized Philae lander down to Comet
67P/Churyumov-Gerasimenko”.