This is the presentation given by Business Development Manager Helen Woodcock from KETL at the second London Jaspersoft Community User Group event. It explores the importance of creating the right data quality KPIs in your data warehouse environment before focusing on BI reporting.
London Jaspersoft Community User Group Event 2 KETL presentation
1. Welcome to London Jaspersoft Community User
Group Thursday 16th June 2016
Introductions and update themes for next event
KETL: Why DQ is important for BI
Implementation case study: Andy Fenn and Alexander
McGuire from Workplace Systems
Break
Ernesto: Complex Report Designs with Jaspersoft Studio
http://www.jiem.org/index.php/jiem/article/view/232/130
2. By 2017, 33% of Fortune 100
organisations will experience an
information crisis, due to their
inability to effectively value,
govern and trust their enterprise
information.
Gartner
3. www.ketl.co.uk
Impact of poor DQ
Estimates vary on the impact of bad
data on revenue (10 to 30%!). Audit
your own revenue losses from poor
data. Factor in opportunity costs
too.
4. Measuring the cost of poor DQ
http://www.jiem.org/index.php/jiem/article/view/232/130
5. Impact of poor DQ in a BI environment
Make DQ part of your BI PoC. It is much harder to go in after
the event to address data quality issues.
DQ and the resulting ETL issues will likely slow down your BI
reporting and put extra strain on your data stores.
Who owns data quality for your BI source systems? This
needs to be established and ideally it should be the BI project
team that takes responsibility for ensuring the data that they
are providing in their reports is accurate and consistent.
Get involved in data governance and implement DQ as a KPI
for the BI team.
http://www.jiem.org/index.php/jiem/article/view/232/130
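One way to treat DQ as a KPI is to score a dataset against a target and report whether the target was met. The sketch below is illustrative only: the `KpiResult` and `completeness_kpi` names, the sample customer records and the 95% target are all assumptions, not part of the presentation.

```python
from dataclasses import dataclass


@dataclass
class KpiResult:
    name: str
    score: float   # percentage, 0 to 100
    target: float  # hypothetical target agreed by the BI team

    @property
    def met(self) -> bool:
        return self.score >= self.target


def is_populated(value) -> bool:
    """Treat None and blank or whitespace-only strings as missing."""
    return value is not None and str(value).strip() != ""


def completeness_kpi(records, required_fields, target=95.0) -> KpiResult:
    """Percentage of records in which every required field is populated."""
    if not records:
        return KpiResult("completeness", 0.0, target)
    complete = sum(
        1 for record in records
        if all(is_populated(record.get(field)) for field in required_fields)
    )
    return KpiResult("completeness", 100.0 * complete / len(records), target)


# Made-up sample data for illustration.
customers = [
    {"name": "A. Smith", "email": "a.smith@example.com"},
    {"name": "B. Jones", "email": ""},
    {"name": "", "email": "c@example.com"},
    {"name": "D. Patel", "email": "d.patel@example.com"},
]
kpi = completeness_kpi(customers, ["name", "email"])
print(f"{kpi.name}: {kpi.score:.1f}% (target {kpi.target:.1f}%, met: {kpi.met})")
```

Tracking a score like this per reporting period gives the BI team a trend to own, rather than a one-off clean-up.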
6. How is ‘bad’ data
entering our systems?
People. Poorly designed data entry
fields. Duplicate entries. Multiple
data sources. Self-service user
entry.
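Duplicate entries from multiple sources often differ only in case or stray whitespace. As a minimal sketch, assuming records are dictionaries with an `email` field (the field name, the `find_duplicates` helper and the sample entries are hypothetical), they can be grouped on a normalised key:

```python
from collections import defaultdict


def normalise(record) -> str:
    """Build a comparison key: trimmed, lower-case email address."""
    return (record.get("email") or "").strip().lower()


def find_duplicates(records):
    """Map each normalised key to the positions of the records sharing it."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = normalise(record)
        if key:  # records without an email cannot be matched this way
            groups[key].append(index)
    return {key: idxs for key, idxs in groups.items() if len(idxs) > 1}


# Made-up sample data for illustration.
entries = [
    {"name": "Jane Doe", "email": "jane.doe@example.com"},
    {"name": "J. Doe", "email": " Jane.Doe@Example.com "},
    {"name": "John Roe", "email": "john.roe@example.com"},
]
duplicates = find_duplicates(entries)
print(duplicates)
```

Exact matching on a normalised key is the simplest option; fuzzy matching on names or addresses would catch more duplicates but needs more care to avoid false merges.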
9. Getting better data.
Don’t try a ‘big bang’ approach – too
daunting. Profile your data. Use
familiar datasets that you know you
can improve easily. Quick gains.
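Profiling can start very small. The sketch below counts missing and distinct values per column for a handful of rows; the `profile` function, the column names and the sample rows are assumptions made for illustration, not a description of any particular profiling tool.

```python
from collections import Counter


def profile(rows):
    """Return per-column counts of missing and distinct values."""
    columns = sorted({col for row in rows for col in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in (None, "")]
        counts = Counter(present)
        report[col] = {
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "most_common": counts.most_common(1)[0][0] if counts else None,
        }
    return report


# Made-up sample data for illustration.
sample = [
    {"title": "Mr", "gender": "M", "postcode": "AB1 2CD"},
    {"title": "Mr", "gender": "", "postcode": "AB1 2CD"},
    {"title": "Ms", "gender": "F", "postcode": None},
]
report = profile(sample)
for column, stats in report.items():
    print(column, stats)
```

Columns with many missing values or an unexpectedly small number of distinct values are natural candidates for the first quick gains.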
10. You have to start with a very
basic idea: data is super
messy, and data cleanup will
always be literally 80 percent of
the work. In other words, data
is the problem.
DJ Patil, Chief Data Scientist of the White House
11. www.ketl.co.uk
13-14 Orchard Street, Bristol BS1 5EH
+44 (0)117 905 5323
info@ketl.co.uk @KETL_BI
Get in touch
For further information or help with
your data project speak to Helen to
see how we can help.
Helen Woodcock
LinkedIn: /in/helenwoodcock
email: helen@ketl.co.uk
12. References and Further Reading
Data disasters
http://blogs.mazars.com/the-model-auditor/files/2014/01/12-Modelling-Horror-Stories-and-Spreadsheet-Disasters-Mazars-UK.pdf
https://www.sas.com/content/dam/SAS/en_us/doc/whitepaper1/bad-data-good-companies-106465.pdf
Research on corporate data quality
https://www.edq.com/globalassets/uk/papers/global-research-2015_20pp-ext-apr15.pdf
https://www.gartner.com/doc/2636315/state-data-quality-current-practices
https://www.edq.com/uk/resources/infographics/data-machine/
Cost of data quality
http://betanews.com/2015/02/17/why-data-quality-is-essential-to-your-analytics-strategy/
http://www.itbusinessedge.com/interviews/how-to-measure-the-cost-of-data-quality-problems.html
http://www.itbusinessedge.com/blogs/integration/what-does-bad-data-cost.html
http://techcrunch.com/2015/07/01/enterprises-dont-have-big-data-they-just-have-bad-data/
https://www.experian.com/assets/decision-analytics/white-papers/the%20state%20of%20data%20quality.pdf
Data quality in the BI environment
http://searchdatamanagement.techtarget.com/tip/Data-quality-management-for-business-intelligence-projects
http://www.quistor.com/en/blog/entry/why-has-my-bi-become-slow
Editor's Notes
Customers’ perception of you as a brand is key, and it’s easy for people to go elsewhere, so DQ is paramount. Each company has to decide just how important DQ is for its brand, which means measuring the impact.
Impact: don’t forget to consider the opportunity costs. There is also the ‘weariness’ factor in staff: why bother to craft yet another campaign that will reach fewer than half of the recipients due to incorrect or outdated email addresses? Add to that the reputational costs of getting things badly wrong, customer service issues, and being unable to segment properly, i.e. not knowing which customers are high cost, low value and which are low cost, high value.
Garbage in, garbage out still holds true. Especially significant for marketers. Company reputation. Often the first contact point that customers have with a business.
Some areas of DQ in your BI reporting are going to be more important than others. Financial forecasts for example – you want to know how far from your target projections you are each week. Strategic decisions may be influenced by even small margins of error.
Use some examples here. No gender assigned. Mr Charge Dodger. Need to incentivise good data handling/entry. Improve data entry field design. Automate data cleansing routines. Establish KPIs against data quality.
These are the six main tenets of DQ.
What is easily achievable in DQ, and how and why using KPIs to measure DQ will improve customer insight and add value. Technology has improved a great deal in the last few years, and marketers need to know what they can do within their own team and what they will need to get IT to help with. We will use some demonstrations of quick data verification checks to explore what is possible, either as batch reporting or as near real-time, web-integrated data verification look-ups. Depending on the scale and resources of your company, you can decide what is achievable within your own team and/or within your company.
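As a rough illustration of a quick batch verification check (not the demonstration given at the event), the sketch below flags records with a badly formed email address or a suspicious name. The regular expression is a deliberately simplified format check rather than full email validation, and the `SUSPECT_NAME_TERMS` list, the `verify` function and the sample records are hypothetical.

```python
import re

# Simplified pattern: something@something.something, no spaces or extra "@".
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical list of words that suggest a junk or abusive name entry.
SUSPECT_NAME_TERMS = {"test", "dodger", "unknown"}


def verify(record):
    """Return a list of issue labels for one record (empty if none found)."""
    issues = []
    email = (record.get("email") or "").strip()
    if not EMAIL_PATTERN.match(email):
        issues.append("email format")
    name_words = set((record.get("name") or "").lower().split())
    if name_words & SUSPECT_NAME_TERMS:
        issues.append("suspect name")
    return issues


# Made-up sample data for illustration.
batch = [
    {"name": "Mr Charge Dodger", "email": "c.dodger@example.com"},
    {"name": "Jane Doe", "email": "jane.doe(at)example.com"},
    {"name": "John Roe", "email": "john.roe@example.com"},
]
results = [verify(record) for record in batch]
for record, issues in zip(batch, results):
    print(record["name"], "->", issues or "ok")
```

The same `verify` function could be run over a nightly batch or called from a web form at the point of entry; which is practical depends on the team's resources.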
Any campaign, any software upgrade project, any new product launch: all will be impacted if you have poor data quality. There is no point investing in data analytics if you can’t be sure of sending out an email campaign that addresses your customer by the right name (Mr Charge Dodger). Reputation: Age UK – tidal wave of abuse and drop in income with data protection issues – lack of data cleansing.