Welcome to London Jaspersoft Community User
Group Thursday 16th June 2016
Introductions and update themes for next event
KETL: Why DQ is important for BI
Implementation case study: Andy Fenn and Alexander
McGuire from Workplace Systems
Break
Ernesto: Complex Report Designs with Jaspersoft Studio
http://www.jiem.org/index.php/jiem/article/view/232/130
By 2017, 33% of Fortune 100
organisations will experience an
information crisis, due to their
inability to effectively value,
govern and trust their enterprise
information.
Gartner
www.ketl.co.uk
Impact of poor DQ
Estimates vary on the impact of bad
data on revenue (10 to 30%!). Audit
your own revenue losses from poor
data. Factor in opportunity costs
too.
Measuring the cost of poor DQ
http://www.jiem.org/index.php/jiem/article/view/232/130
Impact of poor DQ in a BI environment
Make DQ part of your BI PoC. It is much harder to go in after
the event to address data quality issues.
DQ and the resulting ETL issues will likely slow down your BI
reporting and put extra strain on your data stores.
Who owns data quality for your BI source systems? This
needs to be established and ideally it should be the BI project
team that takes responsibility for ensuring the data that they
are providing in their reports is accurate and consistent.
Get involved in data governance and implement DQ as a KPI
for the BI team.
http://www.jiem.org/index.php/jiem/article/view/232/130
How is ‘bad’ data
entering our systems?
People. Poorly designed data entry
fields. Duplicate entries. Multiple
data sources. Self-service user
entry.
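As one illustration of how duplicate entries from multiple sources can be caught, here is a minimal Python sketch. It assumes that two records are the same customer when name and email match after trimming whitespace and ignoring case; that matching rule, the field names and the sample rows are all hypothetical, and real matching usually needs fuzzier rules.

```python
from collections import defaultdict


def normalise(name, email):
    """Build a comparison key: lower-case, trimmed, single-spaced.

    This matching rule is an assumption made for illustration only.
    """
    clean_name = " ".join(name.lower().split())
    clean_email = email.strip().lower()
    return (clean_name, clean_email)


def find_duplicates(rows):
    """Return lists of record ids that share the same normalised key."""
    groups = defaultdict(list)
    for row in rows:
        groups[normalise(row["name"], row["email"])].append(row["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical rows, as if merged from two data sources.
ROWS = [
    {"id": 1, "name": "Pat  Smith", "email": "P.Smith@example.com"},
    {"id": 2, "name": "pat smith", "email": "p.smith@example.com "},
    {"id": 3, "name": "John Jones", "email": "jj@example.com"},
]

duplicates = find_duplicates(ROWS)
print(duplicates)
```

Records 1 and 2 differ only in spacing and capitalisation, so they end up in the same group; record 3 stands alone.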
Data profiling measures
1. Accuracy
2. Completeness
3. Timeliness
4. Validity
5. Consistency
6. Uniqueness
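Four of these measures can be scored directly from the data itself. The sketch below is a rough Python illustration, not a definitive method: the records, the field names (`email`, `updated`), the simple email pattern and the 365-day freshness threshold are all assumptions. Accuracy and consistency are left out because they need a trusted reference source or a second system to compare against.

```python
import re
from datetime import date

# Hypothetical sample records; field names are illustrative only.
RECORDS = [
    {"id": 1, "email": "ann@example.com", "updated": date(2016, 6, 1)},
    {"id": 2, "email": "", "updated": date(2014, 1, 15)},
    {"id": 3, "email": "bob@example", "updated": date(2016, 5, 20)},
    {"id": 4, "email": "ann@example.com", "updated": date(2016, 6, 10)},
]

# Deliberately simple format check (an assumption, not a full email validator).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile(records, today, max_age_days=365):
    """Return each measure as a fraction between 0 and 1."""
    total = len(records)
    filled = [r["email"] for r in records if r["email"]]
    valid = [e for e in filled if EMAIL_PATTERN.match(e)]
    fresh = [r for r in records if (today - r["updated"]).days <= max_age_days]
    return {
        # Share of records with an email present.
        "completeness": len(filled) / total,
        # Share of all records whose email passes the format check.
        "validity": len(valid) / total,
        # Share of the present emails that are distinct.
        "uniqueness": len(set(filled)) / len(filled) if filled else 1.0,
        # Share of records updated within the freshness threshold.
        "timeliness": len(fresh) / total,
    }


scores = profile(RECORDS, today=date(2016, 6, 16))
for measure, value in scores.items():
    print(f"{measure}: {value:.0%}")
```

Scores like these can be recorded on each data load, which gives the BI team a simple figure to track as a KPI.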
Experian survey on data accuracy
Getting better data.
Don’t try a ‘big bang’ approach – too
daunting. Profile your data. Use
familiar datasets that you know you
can improve easily. Quick gains.
You have to start with a very
basic idea: data is super
messy, and data cleanup will
always be literally 80 percent of
the work. In other words, data
is the problem.
DJ Patil, Chief Data Scientist of the White House
www.ketl.co.uk
13-14 Orchard Street, Bristol BS1 5EH
+44 (0)117 905 5323
info@ketl.co.uk @KETL_BI
Get in touch
For further information or help with
your data project speak to Helen to
see how we can help >
Helen Woodcock
LinkedIn: /in/helenwoodcock
email: helen@ketl.co.uk
References and Further Reading
Data disasters
http://blogs.mazars.com/the-model-auditor/files/2014/01/12-Modelling-Horror-Stories-and-Spreadsheet-Disasters-Mazars-UK.pdf
https://www.sas.com/content/dam/SAS/en_us/doc/whitepaper1/bad-data-good-companies-106465.pdf
Research on corporate data quality
https://www.edq.com/globalassets/uk/papers/global-research-2015_20pp-ext-apr15.pdf
https://www.gartner.com/doc/2636315/state-data-quality-current-practices
https://www.edq.com/uk/resources/infographics/data-machine/
Cost of data quality
http://betanews.com/2015/02/17/why-data-quality-is-essential-to-your-analytics-strategy/
http://www.itbusinessedge.com/interviews/how-to-measure-the-cost-of-data-quality-problems.html
http://www.itbusinessedge.com/blogs/integration/what-does-bad-data-cost.html
http://techcrunch.com/2015/07/01/enterprises-dont-have-big-data-they-just-have-bad-data/
https://www.experian.com/assets/decision-analytics/white-papers/the%20state%20of%20data%20quality.pdf
Data quality in the BI environment
http://searchdatamanagement.techtarget.com/tip/Data-quality-management-for-business-intelligence-projects
http://www.quistor.com/en/blog/entry/why-has-my-bi-become-slow

London Jaspersoft Community User Group Event 2 KETL presentation


Editor's Notes

  • #3 Customers’ perception of you as a brand is key and it’s easy for people to go elsewhere – DQ is paramount – it is for each company to decide just how important it is for their brand – measuring impact
  • #4 Impact: don’t forget to consider the opportunity costs. There is also the ‘weariness’ factor in staff. Why bother to craft yet another campaign that will reach fewer than half of the recipients due to incorrect or outdated email addresses? The reputational costs of getting things badly wrong. Customer service issues. Unable to segment properly – not knowing high cost low value and low cost high value customers.
  • #5 Garbage in garbage out still holds true. Especially significant for marketers. Company reputation. Often the first contact point that customers have with a business.
  • #6 Some areas of DQ in your BI reporting are going to be more important than others. Financial forecasts for example – you want to know how far from your target projections you are each week. Strategic decisions may be influenced by even small margins of error.
  • #8 Use some examples here. No gender assigned. Mr Charge Dodger. Need to incentivise good data handling/entry. Improve data entry field design. Automate data cleansing routines. Establish KPIs against data quality.
  • #9 These are the 6 main tenets of DQ.
  • #11 What is easily achievable in DQ, and how and why using KPIs to measure DQ will improve customer insight and add value. Technology has improved a great deal in the last few years and marketers need to know what they can do within their own team and what they will need to get IT to help with. We will use some demonstrations of quick data verification checks to explore what is possible either as batch reporting or in near real-time web-integrated data verification look-ups. Depending on the scale and resources of your company you can make a decision about what is achievable within your own team and/or within your company.
  • #12 Any campaign, any software upgrade project, any new product launch – all will be impacted if you have poor data quality. There is no point investing in data analytics if you can’t be sure of sending out an email campaign that addresses your customer by the right name (Mr Charge Dodger). Reputation: Age UK – tidal wave of abuse and drop in income with data protection issues – lack of data cleansing.