Managing Blind Chapter 1



Managing Blind
A Data Quality and Data Governance Vade Mecum

By Peter R. Benson
Project Leader for ISO 8000, the International Standard for Data Quality

Edited by Melissa M. Hildebrand

rev 2012.06.28
Copyright 2012 by Peter R. Benson
ECCMA Edition

ECCMA Edition License Notes: This eBook is licensed for your personal enjoyment only. This eBook may not be re-sold or given away to other people. If you would like to share this eBook with another person, please purchase an additional copy for each recipient. If you’re reading this eBook and did not purchase it, or it was not purchased for your use only, then please purchase your own copy. Thank you for respecting the hard work of this author.

***~~~***
Table of Contents

Preface
Basic principles
Chapter 1: Show me the money
Chapter 2: The law of unintended consequences
Chapter 3: Defining data and information
Chapter 4: The characteristics of data and information
Chapter 5: A simplified taxonomy of data
Chapter 6: Defining data quality
Chapter 7: Stating requirements for data
Chapter 8: Building a corporate business language
Chapter 9: Classifications
Chapter 10: Master data record duplication
Chapter 11: Data governance
Chapter 12: Where do we go from here?
Appendix 1: Managing a data cleansing process for assets, materials or services
Further readings

***~~~***
Chapter 1: Show me the money

Business is about profit, and profit is generated in the short term by reducing cost and increasing revenue, but in the longer term by managing risk.

Risk management is fundamental to the finance and insurance industries, where the ability to “predict” is at the core of the business. The difference between an actuary and a gambler is data. The actuary promotes their ability to record and analyze data, while the gambler must hide any such ability or risk being asked to leave the casino.

It is not surprising that data plays a key role in risk management. Taking a “calculated risk” implies there is some data upon which you can actually perform the calculation. Other than in the finance and insurance industries, risk management is a hard sell to all but the most sophisticated managers. Cost reduction is a management favorite and an easier sell, but if you can associate data quality and governance with revenue growth, you’ve scored a home run.

Most recorded examples of failures due to missing or incorrect data fall into the catastrophic loss category. This is only because of the enormity of the loss compared to the ease with which the error was made, or the tiny amount of data involved. There are whole websites devoted to listing the financial consequences of data errors. Some of my favorites include Timo Elliott’s YouTube account of a simple error in the property
tax records that resulted in school budget cutbacks, as well as the Mars Climate Orbiter. The Mars Climate Orbiter was a $327 million project that came to an untimely end because of what has become known as the “metric mix-up.” The software on the Mars Climate Orbiter used the metric system, while the ground crew was entering data using the imperial system. There is also the story of Napoleon’s army, which was able to force the surrender of the Austrian army at Ulm when the Russians failed to turn up as scheduled, purportedly because they were using the Julian calendar and not the Gregorian calendar used by the Austrians; now that is what I call being stood up!

We all have personal stories of having to deal with the consequences of data errors, but my absolute personal favorite, at least in hindsight, involves the IRS. It all began one morning when I was handed a crisp envelope from the IRS. Inside the envelope was a letter explaining that I was going to be audited. This sort of letter sends chills up your spine. When I recovered and mustered the courage to call the number on the letter, I was surprised to be speaking to an eminently reasonable inspector. She asked me to confirm that I was claiming a deduction for alimony paid to my ex-wife. Not exactly the sort of thing you wanted to be reminded of, but I was happy to confirm that this was indeed the case. “According to our records you have been claiming this deduction for over ten years,” again not something I cared to be reminded of, but the answer was an easy “yes.” There was a worrying silence, followed by, “I am afraid this is not possible.” The chills quickly
rolled up my spine again. “The social security number you have entered on your tax return belongs to a fourteen-year-old female living in Utah.” To my utter surprise, and after a long exhale, I was glad to be able to correct the error, which turned out to be no more than a reversal of two digits in the social security number. You have to be impressed by the ability of the IRS to connect the dots. I know I was, and I should have quit while I was ahead. There had been recent news reports about child brides in Utah, so my reply was, “Well, at least she was from Utah.” It did not impress the IRS agent, who reminded me that the IRS office I was speaking to was in Utah; apparently humor is not a requirement for an IRS agent.

What jumps out from these examples is the multiplier effect. A simple data error can easily, and all too often does, mushroom into larger, far-reaching and lasting economic fallout. Data errors are rarely benign; more often than not they are catastrophic.

As a general rule, most managers are natural risk takers, and unless you are in the insurance industry, it is an uphill struggle to associate data quality and governance with meaningful value in the form of risk management or loss mitigation, with one notable exception. By focusing on resolving frequent small losses, rather than larger catastrophic losses, it is usually possible to correlate data quality and governance with reducing loss. Examples include reducing production downtime and delivery delays. These are most often considered to be revenue
generation and not cost reduction. The correlation between data quality and delivered production capacity or on-time delivery is generally accepted, and the calculation of the additional revenue generated is straightforward.

The role quality data plays in reducing cost is also generally accepted, although the specifics are poorly understood. There is clear evidence that simple vendor rationalization or group purchasing will drive down price. However, this can easily be overdone to the point of exchanging short-term price advantage for long-term reliance on larger suppliers able to reclaim the price advantage over the longer term. The ultimate goal is to commoditize goods and services to the point where there are many competing suppliers. This requires excellent vendor, material and service master data. The rewards can be huge, not only in highly competitive pricing but also in a flexible and resilient supply chain.

As a general rule, most companies can save 10% of their total expenditure on materials and services simply by good procurement practices, which include maintaining up-to-date material and service masters supported by negotiated contracts. The challenge is to maintain the discipline in the face of urgent and unpredictable requirements for goods or services. Most companies make it difficult and time consuming to add a new item to their material or service masters, and the result is “free text” or “maverick spend.” These are off-contract purchases where the item purchased is not in the material or
service master; instead a “free text” description is entered in the purchase order. Free text descriptions are rarely accompanied by rigorous classification, and as a result management reports start to lose accuracy as an ever-increasing percentage of spend appears under the “miscellaneous” or “unclassified” headings, hardly a management confidence builder. It is interesting that most ERP systems require absolute unambiguous identification of the party to be paid, on the pretext that it is required by law, which it is, but they do not require the unambiguous identification of the items purchased. As many have found out at their considerable expense, the law also requires the identification and unambiguous description of the goods or services purchased. As federal and state governments go on the hunt for more tax revenue, we can expect to see greater scrutiny of purchase order line item descriptions to determine what is and what is not accepted as an “ordinary and necessary” business expense.

The most common scenario is a big effort to rationalize procurement, which is then accompanied by a substantial drop in free text spend. A big part of this effort is the identification of duplicates. Vendor master duplicates are actually rare in terms of the identification of the legal entity that needs to be paid, but less rare is a lack of understanding of the relationship between suppliers and how this impacts pricing. Customer record duplication is actually surprisingly common, and worst of all is material master duplication. Material master record
duplication all by itself can easily be responsible for up to a 30% price differential. Chapter 10 deals specifically with the issue of the identification and resolution of duplicate records, but suffice it to say it is not as straightforward an issue as many believe. Duplication is a matter of perspective and timing.

Without good data governance that keeps the master data up to date, data quality degrades and free text purchasing rises again. Free text spend is actually a great indicator of the success of a data quality and data governance program; the lower the free text spend, the more successful the program. It is not hard to justify a data quality and data governance program based on the initial measurable savings, but it is harder to maintain a program as a cost avoidance initiative. The ultimate goal is to associate a data quality and governance program with revenue growth, preferably profitable revenue growth. This can appear challenging, but in reality it is not.

In 2010, The Economist Intelligence Unit’s editorial team conducted a survey of 602 senior executives. Of the executives surveyed, 96% considered data either “extremely” (69%) or “somewhat” (27%) valuable in creating and maintaining a competitive advantage.

Debra D’Agostino, Managing Editor of Business Research at the Economist Intelligence Unit and editor of the report, also states, “It’s not enough to merely collect the data; companies need to create strategies to ensure they can use information to get
ahead of their competitors.”

How do you use data, let alone data quality and governance, as a competitive advantage? The most common answer is to look inwards and consider data as a source of knowledge to be mined for business intelligence. This has been done with phenomenal success, from targeting customers with highly contextual and relevant offers, to cutting-edge logistics, to product customization and everything in between.

Wal-Mart can rightly be said to be an information company that uses retail to generate revenue, and not a retail outlet that uses information to maximize revenue. Data itself has value, and many companies have successfully turned their data into a revenue source.

Roger Ehrenberg states it well when he says, “In today’s world, every business generates potentially valuable data. The question is, are there ways of turning passive data into an active asset to increase the value of the business by making its products better, delivering a better customer experience, or creating a data stream that can be licensed to someone for whom it is most valuable?”

I have found that you can often convincingly calculate the value of data by identifying the data that is essential to a specific business process. Without the data, the process may not fail, but it would slow down, revenue would be lost and costs would increase. Data is rarely the only contributing factor to the efficiency of a specific process; however, by looking at
how data contributes to the efficiency of the process, you can measure the value of the data.

Of course, there is nothing like a crisis to focus attention and liberate financial resources quickly. In order to sell a data quality or data governance program, it helps if you can find a burning bridge, and if you cannot find one that is actually on fire, it is not unknown to find one you can set on fire, or at the very least to point to the enormous and imminent risk of fire. It really does work; ask any politician.

Any good data quality or data governance specialist will tell you, “Show me the data and I will show you the money.”

***~~~***