Do you know what's in the data you're consuming

Presented at the Data Management & Information Quality Conference Europe 2012 in London - November 2012..

  1. Copyright Kenoconnordata.com 2012
  2. Do You Know What’s in the Data You’re Consuming?
     Ken O’Connor, Kenoconnordata.com
     6th Nov 2012
  3. As food consumers, we are provided with facts about the food we’re consuming. It’s the law.
     • Ingredients: the basic facts
     • Allergy information: can mean life or death to some
     • Nutrition information
     These facts enable us to make “informed choices” about the food we buy.
     We don’t all use the food facts given to us, but those who choose or need to control their diet are in a position to do so.
  4. We know where food such as beef comes from…
     Traceability: hugely important to restore confidence in beef following the Mad Cow disease (BSE) crisis.
  5. We know that our food has not been tampered with since it left its “trusted source”.
     • Tamperproof lids and seals: introduced after the Tylenol poisonings that killed 7 people in Chicago in 1982
     • Best Before / Use by date
  6. What do you know about the data you depend on?
     • Data consumers are seldom provided with facts about the data feeding their critical business processes.
     • Most data consumers assume the data input to their business processes is “right”, or “OK”.
     • They often assume it is the job of the IT function to ensure the data is “right”.
     • Almost all data consumers are also data producers, unaware of their role in the data supply chain.
  7. In order to trust data, and to confidently base business decisions on data, I believe…
     As data consumers, you and I have the right to expect facts about the data provided to us. We should:
     • Know what’s in the data we’re consuming
     • Know where it comes from
     • Know the quality controls applied to it
  8. What basic facts do you need to know about the data you consume?
     Let’s look at a profile of “Customer Date of birth” as an example…
     Do you spot anything unusual about these dates of birth?

     Data Content Facts
     Total number of customer records: 2,500,000
     Data field name: Date of Birth
     Age ranges, based on date of birth:

     Age Range              Count   Percentage
     0-19                 200,000        8.00%
     20-59              1,800,000       72.00%
     60-99                310,000       12.40%
     100-119               44,000        1.76%
     120-169               20,500        0.82%
     170+                     500        0.02%
     No Date of birth     125,000        5.00%
     Total              2,500,000      100.00%

     • Could Marketing use this data to target 20 to 59 year olds?
     • Could this data be used to calculate pension annuities?

     Data that may be fit for one purpose may not be fit for a different purpose.
     Armed with basic facts, the data consumer can make an informed choice.
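An age-range profile like the one on this slide could be produced along the following lines. This is a minimal sketch, not the method used for the slide: the function names (`age_on`, `band_label`, `profile_dates_of_birth`), the representation of a missing date as `None`, and the "Future date of birth" label are all assumptions made for illustration. Only the band boundaries come from the slide.

```python
from collections import Counter
from datetime import date

# Age bands copied from the slide; both bounds are inclusive.
AGE_BANDS = [(0, 19), (20, 59), (60, 99), (100, 119), (120, 169)]


def age_on(as_of, dob):
    """Return the age in whole years on the date as_of."""
    years = as_of.year - dob.year
    # Subtract one year if the birthday has not yet occurred in as_of's year.
    if (as_of.month, as_of.day) < (dob.month, dob.day):
        years -= 1
    return years


def band_label(age):
    """Map an age in years to the label of its band."""
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    # Assumption: negative ages (dates after as_of) get their own label.
    return "170+" if age >= 170 else "Future date of birth"


def profile_dates_of_birth(dobs, as_of):
    """Return {label: (count, percentage)} for a list of dates of birth.

    A missing date of birth is assumed to be represented as None.
    """
    counts = Counter()
    for dob in dobs:
        if dob is None:
            counts["No Date of birth"] += 1
        else:
            counts[band_label(age_on(as_of, dob))] += 1
    total = sum(counts.values())
    return {label: (n, round(100 * n / total, 2)) for label, n in counts.items()}


# Hypothetical sample records, profiled as of the presentation date.
sample = [date(1980, 5, 17), date(1901, 1, 1), None, date(1800, 1, 1)]
profile = profile_dates_of_birth(sample, as_of=date(2012, 11, 6))
```

Passing the profiling date in explicitly, rather than reading today's date inside the function, keeps the profile reproducible and makes the currency of the figures visible to the data consumer.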
  9. Data profiling helps, but does it provide the facts we need?

     Data Content Facts
     Total number of customer records: 2,500,000
     Data field name: Date of Birth
     Age ranges, based on date of birth:

     Age Range              Count   Percentage
     0-19                 200,000        8.00%
     20-59              1,800,000       72.00%
     60-99                310,000       12.40%
     100-119               44,000        1.76%
     120-169               20,500        0.82%
     170+                     500        0.02%
     No Date of birth     125,000        5.00%
     Total              2,500,000      100.00%

     1. Accuracy? No. We cannot tell if the dates of birth are accurate.
     2. Completeness? Yes. 95% complete.
     3. Validity? Perhaps valid dates, but could a customer be 170+?
     4. Timeliness? No. No indication of the currency of the data.
     5. Consistency? No. No indication.
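The completeness figure on this slide can be checked directly from the profile counts. The sketch below does that, and also totals the records in the age bands that raise the validity question. The function names and the choice of the 120-169 and 170+ bands as "implausible" are assumptions made for illustration; the counts are the ones shown on the slide.

```python
# Counts copied from the slide's "Data Content Facts" table.
PROFILE = {
    "0-19": 200_000,
    "20-59": 1_800_000,
    "60-99": 310_000,
    "100-119": 44_000,
    "120-169": 20_500,
    "170+": 500,
    "No Date of birth": 125_000,
}


def completeness_pct(profile, missing_label="No Date of birth"):
    """Percentage of records that have any date of birth at all."""
    total = sum(profile.values())
    return 100 * (total - profile[missing_label]) / total


def implausible_pct(profile, implausible_labels=("120-169", "170+")):
    """Percentage of records in bands assumed to be implausible ages."""
    total = sum(profile.values())
    return 100 * sum(profile[label] for label in implausible_labels) / total


print(f"Completeness: {completeness_pct(PROFILE):.2f}%")
print(f"Implausible ages: {implausible_pct(PROFILE):.2f}%")
```

Completeness here is (2,500,000 - 125,000) / 2,500,000, which is the 95% quoted on the slide. Note that neither figure says anything about accuracy: a populated, plausible date of birth can still be wrong.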
  10. Data content facts add a “smell” to data defects
     One thing worse than a square peg not fitting in a round hole… a square peg that does fit in a round hole…
     It’s not “fit for purpose”: a data defect.
     Data defects are not like software bugs. They seldom cause a system to fail.
     Data defects are more like natural gas:
     • Colourless
     • Odourless
     • Potentially deadly
  11. Where does your data come from?
     Nicola Askham wrote an excellent blog post recently about “The data faeries”. Does this sound familiar to you?
     • Team A: Our data is loaded up by IT
     • IT: No, we don’t touch that data, it’s a manual data load by Team B
     • Team B: We just send the spreadsheet to Team A. We’re sure that they load the data
     • Team A: No, we really don’t load up that data…
     What this shows:
     • Most people don’t know where their data comes from
     • They assume it is always there, and is “OK”
     • Too few are aware of their role in the data supply chain
  12. The FSA expects you to know where your data comes from… “data provenance”
     http://www.dmsg.bcs.org/web/images/stories/2012-03-29-dean-buckner.pdf
  13. The FSA expects you to understand how your data is transformed…
     But don’t sweat the small stuff. The FSA advice is to focus on your most critical data.
     http://www.dmsg.bcs.org/web/images/stories/2012-03-29-dean-buckner.pdf
  14. Where does your data come from?
     Data Provenance / Traceability / Lineage: the “bucket brigade” model
     Imagine if someone in the “bucket brigade” chain:
     • Thought the water was for him and drank it
     • Used the water on his garden
     • Turned off the tap
     • Started the fire deliberately…
     The chain is useless if the bucket is empty when it reaches the fire.
  15. Turn your data supply chain into a “bucket brigade”
     Everyone must understand:
     • Why the data is ultimately required
     • The importance of their role and their dependence on others
     • Where they get their data from and who they provide it to
     • What the data should contain and what it does contain
     • If the data is not right, they should raise a data defect!
  16. Where to start…
     Learn from the Chilean mine rescue…
     Trace a single critical data element end to end through your data supply chain. This will highlight challenges to overcome:
     • How do we assign data ownership?
     • How do we agree data definitions?
     • How do we specify business rules?
     • How do we measure data quality?
     • How do we govern the above?
  17. You know what’s in the data and where it comes from… now what do you do with it?
     Apply appropriate controls to your spreadsheets.
     http://www.clusterseven.com/external-research/2010/7/20/spreadsheets-and-solvency-ii-financial-services-authority-uk.html
  18. All industries have critical data…
     • Health
     • Pharmaceutical
     • Banking
     • Insurance
     • Aviation
     • …
     Data consumers in all industries need to know:
     • What’s in the data they’re consuming
     • Where it comes from
     • What quality controls have been applied to it
  19. The food industry reacted to crises…
     • Tylenol poisonings
     • Mad Cow disease (BSE) crisis
     Regulators are reacting to the 2008 financial crisis…
     They increasingly expect evidence that:
     • You can trust your data
     • They can trust your data
     Solvency II, Basel III, Dodd-Frank, UCITS, MiFID II, CRD IV...
     A perfect storm, a Frankenstorm of regulation…
     • All expecting evidence of data provenance
     • All expecting evidence of a DQ management process
  20. JFK quoted George Bernard Shaw…
     “Other people,” he said, “see things and . . . say: Why? . . . But I dream things that never were, and I say: Why not?”
     John F Kennedy, Address before the Irish Parliament, June 28th 1963
     http://www.jfklibrary.org/Research/Ready-Reference/JFK-Speeches/Address-Before-the-Irish-Parliament-June-28-1963.aspx
     I dream of a time…
     • When all critical data is accompanied by facts about that data (Data Quality Information / provenance).
     • When we will look back on the days when data consumers had few facts about the data they were consuming, and regulators tolerated it.
     And I say: Why not now?
     Visit www.clearinformation.org, a good place to start.
  21. Your new approach to data…
     When you return to your office, I would like you to start asserting your rights:
     • Ask for facts about the data provided to you
     • Provide facts about the data you provide to others
     The norm must become “Here is the data, and here are the facts (Data Quality Information / provenance) about the data”.
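One way to picture “here is the data, and here are the facts about the data” is a delivery that bundles the records with a small fact sheet. The sketch below is only an illustration under stated assumptions: the class names (`DataFacts`, `DataDelivery`), their fields, the `fit_for_purpose` check, and the sample values are all hypothetical, and none of them come from the presentation or from any standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataFacts:
    """Hypothetical fact sheet travelling with a data delivery."""
    source: str                    # where the data comes from (provenance)
    extracted_on: date             # currency of the data
    record_count: int              # what is in the data
    completeness_pct: dict         # field name -> percentage populated
    quality_controls: tuple = ()   # controls applied before delivery


@dataclass
class DataDelivery:
    """The data together with the facts about the data."""
    records: list
    facts: DataFacts


def fit_for_purpose(facts, field_name, min_completeness_pct):
    """Let the consumer decide whether a field meets their own threshold.

    A field with no stated completeness is treated as 0% populated.
    """
    return facts.completeness_pct.get(field_name, 0.0) >= min_completeness_pct


# Hypothetical delivery; the source name and controls are made up.
delivery = DataDelivery(
    records=[{"customer_id": 1, "date_of_birth": "1980-05-17"}],
    facts=DataFacts(
        source="customer master extract (hypothetical)",
        extracted_on=date(2012, 11, 6),
        record_count=1,
        completeness_pct={"date_of_birth": 95.0},
        quality_controls=("age range profile reviewed",),
    ),
)
```

The threshold is supplied by the consumer rather than the producer, which reflects the point made earlier in the deck: data that is fit for one purpose may not be fit for another.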
  22. Ken O’Connor
     Email: Ken@Kenoconnordata.com
     Twitter: Kenoconnordata
     LinkedIn: ie.linkedin.com/in/kenoconnor00
     Ken O’Connor is an independent data consultant with over 30 years of hands-on experience in the field. Ken specialises in helping organisations meet the data quality management challenges presented by data-intensive programmes such as data conversions, data migrations and data population, and by regulatory programmes such as Solvency II, Basel II / III, Single Customer View and Anti Money Laundering. Ken provides practical data quality and data governance advice at his popular blog: http://kenoconnordata.com
