Do you spend hours struggling to manually produce the reports management demands? Are you working with disparate islands of outdated data? And, after all that hard work, are the reports produced inaccurate and untrustworthy?
One of the easiest ways to improve the quality of the information you provide is simply to source good data. This presentation covers best practices for sourcing data so that it is trusted, credible and reliable.
2. Welcome
Agenda:
Why is data quality important?
Our 10 best practices
3. Data Quality Story
Overbooked 10,000 tickets for event
Manual spreadsheet error
- telegraph.co.uk
4. Your data has reach…
Where data from a report is used:
Inter-departmental: 69%
Within department: 31%
% of data in spreadsheets that influences CEO: 42%
* Panko and Port, 2012
5. Just how much of an issue is data quality?
1 in 10 organisations rate their data quality as “excellent”
Poor data quality accounts for 20% of business process costs
$611bn: the cost of poor data quality to US companies each year
* Gartner, TDWI
6. And we want more…
2009 – enough data to fill a stack of DVDs to the moon and back
2020 – Grow by 44x
Less than 1% of available data is analysed
93% of execs believe they are losing revenue as a result of not fully leveraging the information they collect
* IDC, Oracle and EMC
7. What is data quality?
HOW RELIABLE IS YOUR DATA?
TRUSTED AND CREDIBLE
Complete
Accurate
Available
Consistent
8. Why is data quality important?
“It supports accountability”
“It gives us accurate and timely information to manage our business”
“It ensures the best use of our resources”
“It increases our efficiency”
“It reduces the cost of rework”
“It can increase customer satisfaction”
“It ensures we have the best possible understanding of our customers and employees”
“It improves the success rate of enterprise initiatives like Business Intelligence…”
9. Building high quality “supply chains” of data
MEASURE FOR QUALITY
GET THE RIGHT DATA
BE AGILE
10. 1 Focus on the outcome
Analysis Paralysis
Letting data dictate what is “important”
Limited time and energy to focus
ISSUES
11. 1 Focus on the outcome
Start with the outcome…
…then the data.
Focus on what matters
RECOMMENDATIONS
12. 2 Profile your data
Data supplier doesn’t know your data needs
The data you source is as good as the information you provide to the supplier…
ISSUES
13. 2 Profile your data
Write your data profile
Structure, Format, Frequency, Age, Delivery Method
Communicate it to data providers
Opportunity to identify issues and gaps
RECOMMENDATIONS
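A data profile can be written down in a form that both you and the supplier can check against. The sketch below is one possible shape, not a standard: the `DataProfile` class, the `find_gaps` helper and all field values are hypothetical, chosen only to mirror the five items on the slide (Structure, Format, Frequency, Age, Delivery Method).

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DataProfile:
    """What the consumer needs from a data feed (all values are illustrative)."""
    structure: dict[str, str]   # column name -> expected type, e.g. {"order_id": "int"}
    file_format: str            # e.g. "csv"
    frequency: timedelta        # how often a new delivery is expected
    max_age: timedelta          # oldest acceptable data at delivery time
    delivery_method: str        # e.g. "sftp"


def find_gaps(profile: DataProfile, delivered_columns: dict[str, str]) -> list[str]:
    """Compare a delivery's columns against the profile and list the mismatches."""
    gaps = []
    for name, expected_type in profile.structure.items():
        if name not in delivered_columns:
            gaps.append(f"missing column: {name}")
        elif delivered_columns[name] != expected_type:
            gaps.append(
                f"type mismatch for {name}: expected {expected_type}, "
                f"got {delivered_columns[name]}"
            )
    return gaps


# Hypothetical profile for a daily sales extract.
sales_profile = DataProfile(
    structure={"order_id": "int", "order_date": "date", "amount": "float"},
    file_format="csv",
    frequency=timedelta(days=1),
    max_age=timedelta(hours=24),
    delivery_method="sftp",
)

# A delivery that is missing one column and has the wrong type for another.
gaps = find_gaps(sales_profile, {"order_id": "int", "amount": "str"})
```

Running the comparison on a sample delivery before go-live is one way to use the profile as an "opportunity to identify issues and gaps".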
14. 3 Get as close to the source as possible
When your source data is somebody else’s spreadsheet…
Human Error Risk
Availability of data
Unexpected Changes
Additional effort and complexity
ISSUES
15. 3 Get as close to the source as possible
CAUTION: Be cautious of manual spreadsheets
Skip the spreadsheet as a source
PLAN: Communicate and measure for quality
RECOMMENDATIONS
16. 4 Streamline data sources
Using multiple sources
Redundant data
Increased complexity and quality risk
ISSUES
17. 4 Streamline data sources
Identify redundant data
Focus on the essentials
Cut out the stuff you don’t need
RECOMMENDATIONS
18. 5 Set data quality expectations
Perfectionism Burnout
You can’t expect to focus on everything
ISSUES
19. 5 Set data quality expectations
Focus on high impact data
Employ tolerances and ranges for quality and accuracy
RECOMMENDATIONS
RELAX (a little)
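Tolerances and ranges can be expressed as simple checks. The following is a minimal sketch, assuming a strict range for a high-impact field and a percentage tolerance for a lower-impact total; the `Tolerance` class, the function name and every bound are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerance:
    """Acceptable range for one field; the bounds used below are examples only."""
    minimum: float
    maximum: float

    def accepts(self, value: float) -> bool:
        return self.minimum <= value <= self.maximum


def within_relative_tolerance(actual: float, expected: float, rel_tol: float) -> bool:
    """True when actual is within rel_tol (e.g. 0.02 = 2%) of expected."""
    if expected == 0:
        return actual == 0
    return abs(actual - expected) / abs(expected) <= rel_tol


# High-impact field: strict range. Lower-impact total: 2% tolerance.
unit_price = Tolerance(minimum=0.01, maximum=10_000.0)
price_ok = unit_price.accepts(249.99)
total_ok = within_relative_tolerance(actual=1_015.0, expected=1_000.0, rel_tol=0.02)
total_bad = within_relative_tolerance(actual=1_050.0, expected=1_000.0, rel_tol=0.02)
```

Here a total that is 1.5% off passes and one that is 5% off fails, which is the "relax (a little)" idea: decide up front how much deviation is acceptable instead of chasing every difference.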
20. 6 Catch data quality issues early
1-10-100 Rule:
Early: $1 if found at the start of the journey
$10 if found in the middle of the journey
Late: $100 if found at the end of the journey
* Total Quality Management
ISSUES
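The arithmetic behind the 1-10-100 rule is easy to work through. The sketch below treats the slide's $1, $10 and $100 as relative costs per issue; the issue counts are a hypothetical scenario, not measured data.

```python
# Relative cost per issue at each stage, taken from the 1-10-100 rule.
COST_PER_ISSUE = {"start": 1, "middle": 10, "end": 100}


def remediation_cost(issues_found: dict[str, int]) -> int:
    """Total relative cost given how many issues were caught at each stage."""
    return sum(COST_PER_ISSUE[stage] * count for stage, count in issues_found.items())


# Hypothetical scenario: the same 50 issues, caught mostly early versus mostly late.
caught_early = remediation_cost({"start": 45, "middle": 5, "end": 0})  # 45 + 50 + 0
caught_late = remediation_cost({"start": 5, "middle": 5, "end": 40})   # 5 + 50 + 4000
```

Under these assumptions the early scenario costs 95 units and the late one 4,055, roughly a 40-fold difference for the same number of issues.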
21. 6 Catch data quality issues early
Implement quality measures near the start of the data supply chain
Use the “start” as a reference point when checking data further down the journey
RECOMMENDATIONS
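One way to use the "start" as a reference point is to capture control totals when data enters the supply chain and compare them at later stages. This is a sketch under stated assumptions: the `Fingerprint` class, the helper names and the `amount` field are hypothetical, and real feeds would likely need more checks than a row count and one sum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    """Simple control totals captured for a batch of records."""
    row_count: int
    amount_total: float


def fingerprint(rows: list[dict], amount_field: str = "amount") -> Fingerprint:
    return Fingerprint(
        row_count=len(rows),
        amount_total=round(sum(row[amount_field] for row in rows), 2),
    )


def reconcile(reference: Fingerprint, current: Fingerprint) -> list[str]:
    """List the differences between the start-of-chain reference and a later stage."""
    problems = []
    if current.row_count != reference.row_count:
        problems.append(f"row count {current.row_count} != {reference.row_count}")
    if current.amount_total != reference.amount_total:
        problems.append(f"amount total {current.amount_total} != {reference.amount_total}")
    return problems


source_rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 5.5}]
report_rows = [{"id": 1, "amount": 10.0}]  # one row was lost along the way

reference = fingerprint(source_rows)
problems = reconcile(reference, fingerprint(report_rows))
```

An empty `problems` list means the later stage still agrees with what was measured at the start.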
22. 7 Actively measure quality
ISSUES
Invalid Assumption: If the data meets our expectations today, it will continue to do so going forward
No simple way to identify if data is correct
What happens when we do find an issue?
23. 7 Actively measure quality
Quality status indicator: GOOD / OK / NOT GOOD
Define metrics for your data quality
Measure for quality on a consistent basis
Address recurring issues with strategic solutions (e.g. data cleansing)
RECOMMENDATIONS
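A metric only helps if it is defined precisely and measured the same way each time. As an illustration, the sketch below defines one metric (completeness) and maps it to the three ratings on the slide. The function names and the 98% and 90% thresholds are assumptions; each organisation would set its own.

```python
def completeness(rows: list[dict], required_fields: list[str]) -> float:
    """Share of rows in which every required field is present and non-empty."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(field) not in (None, "") for field in required_fields)
        for row in rows
    )
    return complete / len(rows)


def rate(score: float, good_at: float = 0.98, ok_at: float = 0.90) -> str:
    """Map a 0-1 metric to the three ratings; the thresholds are assumptions."""
    if score >= good_at:
        return "GOOD"
    if score >= ok_at:
        return "OK"
    return "NOT GOOD"


# Hypothetical customer records: two of the four have no usable email address.
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "c@example.com"},
    {"id": 4, "email": None},
]
score = completeness(customers, ["id", "email"])
status = rate(score)
```

Running the same measurement on every delivery turns "it was fine last month" into a trend you can actually see.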
24. 8 Expect Change. Embrace It.
We all know change is coming
Business activity, changes in strategies and systems
So rigid that you need to “reset”
ISSUES
25. 8 Expect Change. Embrace It.
Likelihood vs. Impact matrix (each axis L to H)
Score and rank potential changes
Focus on high likelihood/impact changes
Have a plan in place for high risk items
RECOMMENDATIONS
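Scoring and ranking can be as simple as multiplying likelihood by impact. The sketch below assumes a 1 to 5 scale on each axis and a threshold of 15 for "high risk"; the scale, the threshold, the class and the example changes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PotentialChange:
    name: str
    likelihood: int  # 1 (low) to 5 (high), an assumed scale
    impact: int      # 1 (low) to 5 (high), an assumed scale

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def rank(changes: list[PotentialChange]) -> list[PotentialChange]:
    """Highest score first; ties keep their original order."""
    return sorted(changes, key=lambda change: change.score, reverse=True)


def needs_plan(changes: list[PotentialChange], threshold: int = 15) -> list[str]:
    """Names of changes whose score reaches the (assumed) high-risk threshold."""
    return [change.name for change in rank(changes) if change.score >= threshold]


# Hypothetical examples of changes that could affect a data feed.
changes = [
    PotentialChange("Supplier renames a column", likelihood=4, impact=4),
    PotentialChange("Source system is replaced", likelihood=2, impact=5),
    PotentialChange("New product line added", likelihood=5, impact=2),
]
ranked_names = [change.name for change in rank(changes)]
high_risk = needs_plan(changes)
```

Only the first change crosses the threshold here, so it is the one that would need a plan in place.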
26. 9 Plan for change
A change occurs, then what?
Lack of clear policies and rules on who needs to do what…
Knowledge resting in the minds of key individuals
ISSUES
27. 9 Plan for change
CAUTION: In the event of a change the following people will…
Policies and rules
Documentation
Tracking Changes
RECOMMENDATIONS
28. 10 Controlled human interaction
Value of human interaction with data…
… at the cost of data quality
Uncontrolled manipulation of data
ISSUES
29. 10 Controlled human interaction
Avoid uncontrolled manipulation
Facilitate controlled and discrete changes
Make sure it is traceable
RECOMMENDATIONS
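Controlled, discrete and traceable changes can be sketched as a single function that every manual correction must go through. This is one possible approach, not a prescribed design: `ChangeEntry`, `apply_correction` and the sample record are hypothetical, and a real system would persist the log rather than keep it in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeEntry:
    record_id: int
    field: str
    old_value: object
    new_value: object
    changed_by: str
    reason: str
    changed_at: datetime


def apply_correction(record: dict, field: str, new_value, changed_by: str,
                     reason: str, audit_log: list[ChangeEntry]) -> dict:
    """Return a corrected copy of the record and append one entry to the audit log."""
    if field not in record:
        raise KeyError(f"unknown field: {field}")
    audit_log.append(ChangeEntry(
        record_id=record["id"],
        field=field,
        old_value=record[field],
        new_value=new_value,
        changed_by=changed_by,
        reason=reason,
        changed_at=datetime.now(timezone.utc),
    ))
    return {**record, field: new_value}  # the original record is left untouched


audit_log: list[ChangeEntry] = []
original = {"id": 42, "region": "Nroth"}
corrected = apply_correction(original, "region", "North",
                             changed_by="j.smith", reason="typo in source",
                             audit_log=audit_log)
```

Each correction changes one field on one record and leaves behind who changed it, why, and what the value was before, so the change can be reviewed or reversed later.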
30. Recap
1 Focus on the outcome
2 Profile your data
3 Get close to the source
4 Streamline data sources
5 Set data quality expectations
31. Recap
6 Catch data quality issues early
7 Measure quality
8 Expect and embrace change
9 Plan for change
10 Controlled human interaction