Managing Dirty Data in Organizations Using ERP



Managing dirty data in organizations using ERP: lessons from a case study

Jodi Vosburg
The University of Wisconsin-Whitewater, Wisconsin, USA
Anil Kumar
The University of Wisconsin-Whitewater, Wisconsin, USA

Industrial Management & Data Systems 101/1 [2001] 21–31
© MCB University Press [ISSN 0263-5577]
The current issue and full text archive of this journal is available at

Keywords
Data, Data integrity, Enterprise resource planning, Systems management

Abstract
The integrity of the data used to operate and make decisions about a business affects the relative efficiency of operations and quality of decisions made. Protecting that integrity can be difficult and becomes more difficult as the size and complexity of the business and its systems increase. Recovering data integrity may be impossible once it is compromised. Stewards of transactional and planning systems must therefore employ a combination of procedures including systematic safeguards and user-training programs to counteract and prevent dirty data in those systems. Users of transactional and planning systems must understand the origins and effects of dirty data and the importance of and means of guarding against it. This requires a shared understanding within the context of the business of the meaning, uses, and value of data across functional entities. In this paper, we discuss issues related to the origin of dirty data, associated problems and costs of using dirty data in an organization, the process of dealing with dirty data in a migration to a new system: enterprise resource planning (ERP), and the benefits of an ERP in managing dirty data. These issues are explored in the paper using a case study.

1.0 Introduction
Daily operations, planning, and decision-making functions in organizations are increasingly dependent on transaction data. This data is entered electronically and manually and then organized, managed and extracted for decision-making. The same data entered and used to facilitate building, shipping, and invoicing goods is also extracted and manipulated to evaluate factory and sales force performance in the short term. In the long term this data is used to chart the course of the business in terms of manufacturing facilities, products, and marketing.

The integrity of the data used to operate and make decisions about a business affects the relative efficiency of operations and quality of decisions made. Protecting data integrity is a challenging task. Redman (1995) comments that "many managers are unaware of the quality of data they use and perhaps assume that IT ensures that data are perfect. Although poor quality appears to be the norm, rather than the exception, they have largely ignored the issue of quality". Other scholars (Greengard, 1998; Kilbane, 1999; Tayi and Ballou, 1998; Wallace, 1999) also point out the importance of data quality for organizations.

Maintaining the quality of the data that is used in an organization is becoming an increasingly high priority for businesses. In a recent survey of 300 IT executives conducted by Information Week (Wallace, 1999), the majority of the respondents (81 per cent) said, "improving customer data quality was the most important post-year 2000 technology priority". The respondents further stated that there would be "significantly increased spending" on data quality in their organizations. Companies that manage their data effectively are able to achieve a competitive advantage in the marketplace (Sellar, 1999). On the other hand, "bad data can put a company at a competitive disadvantage" comments Greengard (1998). A recent study (Ferriss, 1998) found that "Canadian automotive insurers are taking a major hit from organized and computer-literate criminals who are staging crashes and taking advantage of dirty data in corporate databases". The study found that in one case several insurance firms lost $56 million to one fraud ring.

How does a company end up with dirty data and what can be done to prevent this? Disparate data stores (individual, departmental, and organizational) that have been developed and used by organizational users over the years lead to dirty data problems. For example, dissimilar data structures for the same customer data (spelling discrepancies, multiple account numbers, address variations), incomplete or missing data, lack of legacy data standards, actual data values being different from meta-labels, use of free-form fields, etc. (Kay, 1997; Knowles, 1997; Weston, 1998). These problems can be compounded by the volume of data that is stored and used in organizations. One way of overcoming this problem is to use technologies that integrate the disparate data stores for an organization and help the companies clean up their data. Enterprise resource planning (ERP) systems (SAP, Peoplesoft, Baan, J.D. Edwards, etc.) are examples of such systems. "A good ERP system offers an integrated option, implementing browser and client-server modes while maintaining consistent data and function within the enterprise and out to the supply chain" (Stankovic, 1998). In recent years, ERP vendors have gone beyond providing the traditional integrated applications, such as manufacturing, financials, and human resources. Newer applications that have emerged include supply chain management, customer-relationship management, data mining and
data warehousing (Caldwell and Stein, 1998; Stankovic, 1998) and browser modes that enable organizations to reach out to customers and the supply chain. Caldwell and Stein (1998) also point out that "most important, ERP forces discipline and organization around processes, making the alignment of IT and business goals more likely in the post-ERP era". Aligning IT and business goals has always been a top priority for senior management. Thus it might be helpful for a company to implement an ERP system.

In this paper, we discuss the experiences of a company, which implemented an ERP system in their organization. The discussion is focussed primarily on the data aspect of the implementation. The paper is organized as follows. In the next section we describe the case-study organization. Section 3 defines the concept of dirty data and its impact on the integrity of organizational data. In Section 4 we list the costs incurred by organizations as a result of using dirty data. Section 5 highlights several lessons learnt from the case-study organization and, finally, in Section 6 we summarize the guidelines for companies planning to implement ERP solutions to overcome dirty data problems.

2.0 The case study
The organization where this case study was conducted is a $650 million division of a Fortune 500 company located in the Midwest. This company is a manufacturer of electrical, lighting, and automotive equipment. The products of this company are marketed domestically and internationally. The company employs approximately 1,600 people in manufacturing and sales facilities located both domestically and internationally. There are 17 manufacturing facilities located in North America and Asia. The case study was used to understand the implications of dirty data at the company before and after the implementation of an ERP system. The ERP implementation in the company replaced a number of independent mainframe legacy systems used for order and quotation processing, manufacturing, transportation, billing, and finance applications. One of the co-authors of the study works at the company as the system/support supervisor for the Customer Support Center (CSC). In this role, the author was directly involved in identifying, trouble-shooting, and training for dirty data concerns in data entry and with specifying, testing, and distributing customer and sales-force reports. In addition, we interviewed several other employees in the organization who were involved with this project. These employees included the manager of the CSC and marketing services, an information analyst in the marketing services group, and a customer support representative (CSR). The manager of the CSC is responsible for managing domestic order processing and sales and marketing reporting for the division. The information analyst works with users and programmers to specify report requirements and does much of the testing and trouble-shooting for those reports. The CSR is the data entry point analyzing and translating customer purchase orders into ERP documents. This study will look primarily at issues relating to the CSC.

3.0 Dirty data defined
At first, the abbreviation for black was blk. Then it was changed to bck. We didn't discover this change until someone said the color mix didn't look right (Horwitz, 1998).

Dirty data exists when there are inaccuracies or inconsistencies within a collection of data or when data extraction is inconsistent with intent. Inclusion of dirty data in a data source may pollute the entire data source making it difficult or unwise to use the data for analysis. Dirty data in a transactional system can mean incorrect order taking, products not built to specification, or errors in packaging, documentation, or billing. The result is dissatisfied customers, loss of shareholder confidence, unnecessary material and labor costs, and the real and opportunity costs of time spent correcting errors resulting from dirty data. Those interviewed define dirty data as follows:

The GIGO (garbage in, garbage out) theory applies to dirty data. If you don't have checks in the system that prevents human error, you will have errors in your data. Data integrity refers to data that is systematically edited or edited by "experts" after data entry to remove errors (Manager, CSC).

Duplicate data or data that is incomplete or extraneous (Information Analyst, Marketing Services).

Anything that is entered incorrectly (CSR).

The definitions used reflect each one's experience with dirty data. Awareness of this problem is growing within the organization as users, systems people, and management uncover and deal with problems resulting from dirty data.

Data integrity requires awareness and control of dirty data. A collection of data has integrity if the data are logically consistent and accurate. Data integrity requires that data additions or changes be reflected in each
of the locations where that data is stored and that data is consistent across the storage medium(s) used. Data integrity also requires that the users of that data understand the meaning of the data within the context of the business. Maintaining data integrity requires a systematic approach to data processing, storage, sharing, manipulation, and reporting.

4.0 Cost of using dirty data
"Errors in data can cost a company millions of dollars, alienate customers, and make implementing new strategies difficult or impossible" (Redman, 1995). The manager of CSC commented that:

Any business that has to issue debits and credits or that throws out surplus, unusable inventory, understands the costs of dirty data. Each credit or debit is estimated to cost the company $75 for the clerical efforts of analyzing, generating and disseminating the document. Added to that are the following: production errors from erroneous bills of material or misinterpretation of a customer's specifications; freight costs for shipping and returning product; inventory scrapping charges where the product cannot be resold; financial penalties charged by the customer for our error; ordering of unneeded materials; scrapping of raw materials; wasted labor charges at the organization and its customer; warranty charges to fix the product, if it can be modified; and unknown cost of the customer not ordering additional product from you because of your data problems. The managers and people involved in warranty, credit and collection and finance understand the ramifications. The rest of the organization understands what their managers or supervisors have shared with them. Our quality program emphasizes feedback to the person involved with a quality problem. It is up to the management team to insure that all people understand the problems dirty data can cause as well as prevention.

The information analyst for marketing services was of the opinion that "most of the costs associated with dirty data cannot be measured in terms of dollars. If these costs could be quantified the management would be shocked". She stressed the cost of the endless number of consultants required to configure the system to prevent a particular data problem or to determine or correct the results of one.

The CSR focused on "costs associated with customer dissatisfaction – lost confidence and business are hard to measure and harder to win back". She pointed out the frustration and time lost at the factory, in the marketing departments, and at the CSC in correcting problems resulting from dirty data.

Each person's perspective is culled from that person's training and experience. The CSR indicated that she had little understanding of the way in which the data she enters is used in peripheral departments and how it becomes part of reporting. For that reason, it is important to examine the data and rationalize it. Data rationalization involves determining what data is important to which department and prioritizing the value of those data sets. Once this determination is made, plans to correct and prevent dirty data can be laid.

5.0 The ERP implementation: lessons learned
The start of data integrity problems is really a failure to treat data as a strategic business resource. Scholars (Redman, 1995; Tayi and Ballou, 1998) point out that data is a key organizational resource. However, as pointed out by Kilbane (1999), "Many companies who use data contained in legacy systems are not leveraging it as a strategic company asset." The primary challenge to maintaining data integrity is the lack of resources allocated to it. To maintain data integrity, people with an understanding of the origins and results of dirty data and the ways to prevent and correct it must be dedicated to the task. Redman (1995) says that: "Due largely to the organizational politics, conflicts, and passions that surround data, only a corporation's senior executives can address many data quality issues. Only senior management can recognize data (and the processes that produce data) as a basic corporate asset and implement strategies to proactively improve them." Where data integrity is one of many responsibilities of people with no understanding of the concepts surrounding data integrity, dirty data is the result. Integrity issues receive attention in times of crisis, but as soon as the crisis is over, those with responsibilities other than data integrity turn to the pressing deadlines or daily tasks that they are responsible for. In a complex ERP environment, this can result in perpetual crisis management. In the following paragraphs we discuss the lessons learned from this case study.

5.1 Understanding and communicating new demands of an ERP system
Before the move from legacy applications to an ERP system takes place, considerable thought should be given to how the system change will change the roles of the users. The conversion to an ERP system is not just a data extraction, cleansing, transformation, and populate process to effectively
implement an ERP system. An organization needs a strategy and a plan. Atre (1998) points out that "legacy data is invariably in worse condition than you realize". Caldwell and Stein (1998) comment that "ultimately, by feeling their way through the initial shock of an ERP implementation – new business processes, new job roles, new management structures, and new technologies – companies are transforming themselves".

In this company there are 48 CSRs in the CSC. These CSRs are responsible for entering orders taken from domestic customers. Now, with the ERP, the items on these orders not only initiate the manufacturing, shipping, and invoicing functions, but also are the raw data used to generate the sales and marketing reports. The sales and marketing reports feed the decision-making processes that steer the business. The correct and consistent entering of these orders is critical to preventing dirty data.

Most CSRs believe that the order entry process has increased in complexity with the implementation of the ERP. Some estimate that the time required to enter an order has increased two to four times. The reasons for this widely-held perception are threefold. First, the ERP is still quite new – system glitches can mean several unsuccessful attempts at entering a single order and the eventual involvement of system support personnel in processing. Second, there are more steps to the order entry processes, and greater variation across product lines. Legacy systems were used for narrowly-defined transaction sets. For example, each of the four product groups in the company had their own manufacturing system. The homogeneity of the transactions and of the users meant that the legacy systems could be customized to accommodate those tasks without affecting the ability of other users to perform other tasks. Now that all users share a single system, transactions must be generalized to fit all tasks. Where customization cannot be automated, it becomes a manual part of user work processes – the order entry process varies greatly from product line to product line. Greater expertise is required on the part of the user, not only in the performance of their assigned tasks but also in those of others that are affected by their system transactions. The learning curve has been steeper than anyone imagined. Third, data entry skills are no longer enough to successfully enter orders – the ERP requires system savvy and an analytical approach. It has become critical that CSRs understand the logic behind the processes and the ramifications of their actions on-line. Many are inexperienced in this way of working. The combination of these factors has increased the occurrence of inaccurate, inconsistent data being entered on the ERP via sales orders, as CSRs attempt to complete their complex and time-consuming data entry work in the same amount of time they did prior to the ERP implementation and without a clear understanding of how that data is to be used by other functional areas in the business and by upper management for analysis and business decisions.

Lesson: Organizational users need to be educated and prepared for the changes that will take place as a result of ERP implementation.

5.2 Developing shared understanding of data
The lack of a shared understanding of the uses and value of data among those performing the same tasks and among those performing different tasks can lead to creation of dirty data. Tayi and Ballou (1998) point out that "the data gatherer and initial user may be fully aware of the nuances regarding the meaning of the various data items, but that will not be true for all of the other users". Where those performing the same tasks have a different understanding of the data being processed, inconsistencies are inevitable. For example, if the marketing services department members differ on whether abbreviations are to be used in customer master data, inconsistent entry is the result. Locating this data becomes difficult for CSRs because they cannot be sure if they are using the wrong abbreviation, or if the data has not been entered. The result of this lack of shared understanding is duplicate records – when the CSR cannot find the record that they are looking for, a new record is requested. Even if marketing services is able to locate the record and corrects the abbreviation before creating a duplicate record, both the CSR and marketing services have spent unnecessary time.

A lack of a shared understanding is common among data generators and report writers. A CSR knows that the promised ship date on an order with a production block is not valid, but a consultant writing a backlog report probably does not. As a result, the invalid date is published on the report. Geographical distances and functional barriers exacerbate this complexity. The further an employee is from another employee, and the less that employee understands what is required in the other's position, the less likely they are to share a common understanding of the importance of
the data each deals with. According to the CSC manager:

In the business right now, those entering the data and those using the data are so confused that there is little understanding of the data in the system. We are working with users AND the IT departments to share the knowledge about the entered, calculated AND extracted data. Without this, we are, and have been, subject to interpretation of a field with a title meaning different things to those entering versus using the data. We are finding how difficult it is to deal with a program written in another language, as field translations have always assisted users and IT people in the past. In our ERP, there is no such extra help available for those looking for field definitions and understandings.

Lesson: Champions of the ERP implementation project should ensure that all users understand the organizational data in a manner that is consistent throughout the organization.

5.3 Ownership of data and responsibilities
Responsibility for ensuring data integrity belongs to all employees. Tayi and Ballou (1998) comment: "The capability of judging the reasonableness of the data is lost when users have no responsibility for the data's integrity and when they are removed from the gatherers." Atre (1998) points out: "IT staff need help and cooperation from business users to identify and cleanse operational data. Users should be primarily responsible for determining the business value of data. Don't rely on systems integrators – they don't understand the business value of the data." One has also to consider the "politics" which play an important role. Often managers may agree that they own the data, but may want everybody to be involved in cleaning it. The manager of the CSC commented:

I believe that data integrity is the responsibility of every company employee. All positions, all departments are responsible for insuring the data they are entering, reviewing or utilizing is error free. It is the responsibility of every manager to make sure the tools are in place to insure data integrity for the data they are responsible for. ... In the past, users relied on the IT departments to make sure the edits were in place to make the data correct. With ERP systems and more user controlled systems and input, it is a joint responsibility. Users must understand systems better and IT personnel must understand business problems better in order for them to work together to achieve the highest level of data integrity. Too many IT people are good programmers but not good business analysts.

The marketing services staff is responsible for maintaining customer and salesforce master data, for testing and maintaining the ERP data structures that define transactional data, and for authoring and generating sales and marketing reports. Marketing services has been successful in guarding against duplicated records, misspelled names, inverted text, missing fields, outdated area codes and ZIP codes, and other kinds of dirty data in customer and salesforce master data by employing a combination of user training, well-defined procedures, and tight control and auditing of additions, changes, and deletes. Every CSR received at least four hours of training on the use and import of this master data. During that training, CSRs were asked to review the master data for their assigned customers and to advise marketing services of any necessary changes. New master data requires the completion of a form to ensure all necessary information is provided. Only two people in the marketing services department do the actual addition of the new data to ERP. In addition, an audit report is run regularly to identify changes made to the data. This report helps to catch mistakes and identify where additional training is required. A data steward, who is responsible expressly for protecting data integrity, should support the efforts of the CSRs and the marketing services department. This data steward would be responsible for raising awareness about data issues and implementing systematic procedures for data auditing and user training.

Lesson: Ensuring that all stakeholders of an ERP system understand their responsibilities with respect to maintaining data integrity will lead to a better quality system. Data that is a part of an ERP system belongs to an organization and not to any individual department or user.

5.4 Migrating legacy data
Ruber (1999) comments, "Migrating information from departmental databases and transaction-processing systems ... is a daunting task." He goes on to say that the "hardest part is cleansing the data, yet people tend to underestimate that part of the process." Legacy systems in corporations, which were created in different generations, create major stumbling blocks for migrating data to integrated application systems. Quick fixes that become embedded in the case of legacy systems create complexities that are difficult to overcome. Most of these systems are usually lenient with the data that is maintained, resulting in lack of data standards or documentation in the form of metadata. Before this data is migrated there is a need to clean it. An effective strategy for companies planning to implement integrated applications, such as ERP, may be to use automated tools for cleaning legacy data
before integrating it. Tools provided by vendors such as id.Centric, Vality Technology, HarteHanks, etc. (Knowles, 1997) may benefit organizations significantly. Sales of such tools used for data extraction, refining and loading were expected to reach $210 million by the end of 1999 (Kay, 1997).

The initial ERP implementation involved a programmatic load of legacy sales order backlog onto the ERP. The order load program was developed and tested over a period of months by a programmer familiar with the organization's business practices and a team of users. The load was simplified by the fact that the legacy system was well supported. That support meant not only that data to be converted was relatively clean, but also that the data in the legacy system was well-defined and understood – a program could be written to capture only relevant data. Unfortunately, the idiosyncrasies of the order entry process for the various product lines and of the ways in which CSRs entered the orders meant that no program could convert the data without some errors. Because the data integrity of the orders was so important, each converted order was reviewed on the ERP by the responsible CSR. Many data errors were caught and corrected during this review, including item quantity, material number, and ship to errors. But some were missed. A tremendous amount of time has been spent and is still being spent to correct these errors. One of the most common errors involved contract release orders. The program designed to convert the data somehow selected and input the wrong material number into the converted order. The CSRs, possibly tired after consecutive 12-hour days of data verification, missed many material errors. These kinds of errors, though, are always found eventually – usually by the customer. The results of these errors were: shipment of the wrong materials; angry customers; time spent investigating the error; cost of processing credit orders and replacement orders; expedited production of the correct materials (resulting in late shipments of other orders); transportation costs for returning the wrong units; and/or cost of scrap or storage. No attempt has been made to assign a dollar value, though, to the results of this dirty data.

Overall, though, this data migration was a success – the inevitable data errors were identified, some sooner, some later, and corrected. The success was due in large part to the fact that the legacy system was well supported, the migration process was well tested and documented, and those closest to the data verified the data after the migration.

Customer master data was loaded programmatically initially. Customer master data includes addressing for billing and shipping, tax identification numbers and designations of customer type and pricing levels. Migration from legacy systems to the ERP has allowed marketing services an opportunity to scrutinize and clean data maintained about our customers and sales force. More stringent master data requirements in the ERP, in fact, made this a necessity. For example, the legacy system had used an address in Varnons, Georgia, for one customer for years. This address was kicked out in the programmatic load of customer master data. On investigation, it was discovered that Varnons was not a city, but a stop on a railway line. The ERP will determine the zip code and county given a city and state. This not only ensures that the city and state are entered accurately, but also that the customer has provided a valid city and state combination. Customer data moved from the legacy to the ERP system were relatively cleaner than they had been on the well-maintained legacy system.

Migrating poorly-maintained legacy data
Atre (1998) comments that "you are likely to run into problems such as incompatible data formats, codes that no one can decipher, data that's embedded in long text fields, overlapping customer records from multiple systems, some with redundant data and others with conflicting or outdated data and even chunks of mystery data of long-forgotten provenance and uncertain ownership". Weston (1998) suggests using flags for dirty data that is migrated. As a result, a decision-maker can decide if he/she wants to use the information or leave it out during data analysis activities. The customer and salesforce master load for the migration from a much less well-maintained system was a more tedious and difficult procedure. Keeping that data clean on this legacy system was never a priority. The data was entered by the order entry group, as there was no position assigned to the management and control of master data for this system. Misspellings, duplicate records, and inconsistencies were the result of a lack of control over who could add, change, or delete customer master data, of instructions for proper management of the data, and of auditing procedures. The problems were exacerbated by the fact that, when the company purchased this facility, a completely new group of users began to enter this data. A lack of shared definitions of the components of the master data and their uses increased the number of discrepancies and errors. Where the original group might
define a salesperson as a customer or a vendor or an agent, the company group defined the salesperson as an agent only. Subsequently, where there was no agent-type record for a particular salesperson, one was created, thereby creating the potential for inaccurate reporting of sales data. Thus, before any master data could be moved to the ERP, each record had to be manually reviewed. The marketing services group again handled this process. Spelling errors, duplicate records, and incomplete data were addressed before the data was loaded to the ERP.

The sales order and production data on this system had been subject to inexplicable changes. For example, in November of 1998, the order entry group started to notice that some items on orders were being closed by the system for no apparent reason. Thus, they would never be built or shipped. The in-house support could not identify the cause or propose a solution, nor could the manufacturer of the software. The in-house support group advised CSRs to address these system-generated cancellations as they happened - a virtually impossible task. After much discussion, the support team agreed to write a report to locate these items.

The data on this legacy system was not well supported or understood. The data was in such poor condition that sales and marketing reports generated from system data were virtually useless. For example, the Canadian order entry location might enter orders using the same customer master record for different customer locations by overwriting the sold-to-address text to reflect the different location addresses. The domestic location would add new customer master records for each customer location. Existing reports could not accurately reflect these contradictory approaches.

These factors combine to make a programmatic migration of the sales order data to the ERP infeasible. Instead, sales orders were manually loaded onto the ERP by CSRs using an expanded backlog report. The lack of understanding of the way the system stores data, coupled with inaccuracies and inconsistencies in order entry and processing, made the writing of this report very difficult. For example, the initial run of the report included thousands of freight items. Freight items are added to sales orders by the shipping department to indicate carrier and shipment date of materials on the order - they are not backlogged. These were difficult to suppress in the report because of the inconsistent ways in which they have been added to the sales orders - some were loaded as text items, some as freight items, and some in a closed status, some in an open status. Attempts to suppress this data on the conversion order might have, without extensive testing, resulted in inadvertent suppression of materials that should be converted - the Miami order entry location uses freight items not to communicate shipment information, but to charge the customer.

This project, though, was also a success. While the manual conversion presented an opportunity for entry error, the process was largely error free. This can be attributed to the extensive testing of the backlog report serving as the basis for the conversion, simple comprehensive check-list style instructions for the CSRs in the use of the backlog report, and, most importantly, a group of CSRs now more comfortable and experienced in the use of the current ERP. Again, migration to the new ERP was a boon, because it drove the process of examining and cleaning current data.

Lesson: Migrating dirty data is a challenging task. Use of automated tools is a good strategy for organizations planning to implement integrated application systems. The most important factor is that the data needs to be cleaned before it is migrated to an ERP system.

5.5 Recognizing the complexity of integrated data

The integration of several business functions on a single system holds tremendous potential for reporting. All transactional data is now available from one source. Reporting that was difficult or not feasible in the past is now possible. This consolidation of functions onto one system has forced the various units of the business to develop a greater understanding of the work done by other units of the business and their interpretation of the data. With this potential, though, comes increased complexity. Tayi and Ballou (1998) point out "personnel databases situated in different divisions of a company may be correct but unfit for use if the desire is to combine the two and they have incompatible formats". Kilbane (1999) says "the problem is that data is, too often, in different formats and companies don't know how to properly bring it together and turn it into actionable information".

Locating data tables within the ERP system appropriate for the intended reporting has turned out to be more tedious and difficult than anyone imagined. Reports used by the salesforce and in manufacturing to describe sales order backlog have been found to be so error-ridden that they have been totally scrapped and rebuilt. Reports meant to describe incoming businesses took months to
write. Several iterations of these reports were developed before the set currently in circulation was completed.

The information analyst describes an error that she stumbled across while researching another reporting data discrepancy. It seems that the same incoming business report was run for the month of March on April 1 and then again on April 3. She noticed that the totals were different. This should never occur - once the month is closed, no updating should occur. She indicates that locating the cause of a problem like this is difficult and time-consuming and sometimes proves to be impossible. In this case, though, they were able to locate the source of the problem - the reporting structure was identifying the wrong date field as the determinant for which month a particular type of order would be allocated to. The correction of this structure error is perhaps more tedious than finding the cause of it - the field reference must be changed in more than 100 places in each of the several data structures. These and other integrity problems detected in the early going have meant that several manual adjustment schedules must be published with each run of this report - the data cannot be cleaned. The information analyst attributes these errors to a lack of comprehensive testing of the updating that occurs when these orders are processed. She cites a lack of communication between those who understand the way the company accrues and processes data and those responsible for building the data definition structures. As a result, some basic assumptions were made in the definition of data that were incorrect.

The complexity entailed by system integration is compounded by the marketing services staff's inexperience with the selected reporting bolt-on, the ERP data structures, and the architecture of the data itself. Basic reporting requirements to operate the business, coupled with this inexperience, have resulted in an inordinate reliance on consultants for report writing. While these consultants are skilled in report writing and the integration of ERP, their lack of understanding of company business and the transactional data and processes, and subsequent ERP configuration changes, has impeded accurate reporting.

Lesson: It takes time for users to comprehend and use integrated data as a result of using ERP packages. Care should be taken to ensure that all users understand the concept of integrated corporate data and use it accordingly.

5.6 Testing the new system

The costs of insufficient testing prior to implementation can be very high. Months of suspicions were confirmed in March, when it was discovered why incoming business numbers seemed too high. CSRs had been entering the sales credit designation on orders more than once. Whenever a new item is added to an existing ERP sales order, the ERP returns an error indicating that sales credit is missing. The correct action is to activate the existing sales credit designation on the order for the new items. This problem was not anticipated or clearly understood. Thus the correct handling was never made part of the ERP training for CSRs. So, CSRs generally entered an additional sales credit designation with each addition to an order. Some orders showed a sales credit allocation of 400 per cent or more of the net value of the order. The sales credit numbers are also used to report incoming business. In total, this data entry error resulted in an eight million-dollar overstatement of incoming business. Because this affected incoming business and not shipments or production, the cost was minimal financially. However, sales managers were forced to adjust sales engineer bonuses downward as a result of the discovery.

This data cannot be corrected on the ERP system. All adjustments had to be handled manually. Some preventative measures were immediately put in place. In the short term, additional training was provided to the people who enter orders to make them aware of the impact of this error. A daily report is being run to identify these errors as they are made, allowing on-line corrections. In the long term, ERP configuration changes have been requested to eliminate the misleading error message and to add messages when more than 100 per cent of the value of the order is allocated as sales credit. According to the manager of the CSC: "The problem might have been prevented if we all knew how to test wrong. In all the massive testing done on order entry and reporting on it, not enough was done to try to enter bad data. Some of the edits seemed so self-evident, that their lack was almost impossible to comprehend. I think we are just now learning how important understanding and testing for dirty data is in a truly integrated system."

Lesson: Test, test and test again. Testing is a crucial aspect of implementing ERP solutions. There should be no short-cuts in testing. Different user groups should be involved in the testing process to ensure that all possible scenarios are used for testing the ERP system before the conversion to ERP is implemented.

5.7 Training

Lack of proper training can frustrate users when they begin using an ERP system in an
organization. Caldwell and Stein (1998) point out the example of Amoco, where "managers found SAP so unfriendly they refused to use it. Few [of our] people use SAP directly because you have to be an expert". The authors further comment that in the case of Owens Corning, the organization found out that "the cultural and organizational impact on IT organizations is a little short of revolutionary". The entry and extraction of dirty data can be prevented with greater dedication to initial and on-going training for those responsible for entering and extracting data. A lack of time is typically cited as the reason for inadequate training. The time required to investigate, understand, correct, and prevent problems due to dirty data is considerably more, though, than that required simply to understand and prevent those problems. The additional cost of this reactive approach is the loss of shareholder confidence in the system, employees, and data. A significant training effort was put into teaching those that would be using and entering data in the system. Each CSR received in excess of 50 hours of training in the meaning and population of the various fields comprising the order entry screens. Order entry procedures are documented in detail and available to all CSRs. The difficulty lies in knowing how much training is enough - a difficult question to answer at conversion time, given the consultants' lack of understanding of the particular business and the employees' lack of understanding of the new system and the potential problem areas. There is no question, though, that additional training will be required after implementation to address the numerous unanticipated problems that will arise.

Lesson: On-going training is a prerequisite for success in implementing ERP systems. Organizations should plan ahead of time to train all users before and after the implementation. Periodic exchange of ERP experiences by users in an organization from their work environment will go a long way.

5.8 Prioritizing data maintenance

According to the CSC manager: "Data integrity is assigned a high priority at the management and IT level. It is not as high a priority at the middle manager and lower levels, as worrying about data integrity can slow down production, order entry, shipping, etc." The information analyst and the CSR expressed similar opinions when asked about the prioritization of data integrity at CPS. The information analyst indicated that data integrity was critical in the marketing services department, but prioritized much lower in departments dealing with the day-to-day operations. Even marketing services, though, does not have a system in place to check data regularly for problems. The information analyst indicates that the department spends so much time "putting out fires" that there is little time left over for carrying out systematic data checks. The problem is exacerbated by a lack of tools for auditing. The information analyst attributes this to the newness of the implementation.

At present, data integrity is protected through a combination of system safeguards, user training, and data entry procedures. System safeguards are the result of building data integrity rules into ERP. For example, ERP will prevent a CSR from entering a ship-to address in a sold-to field. This is a hard error, preventing saving of the data. Soft error messages give the CSR the opportunity to review potentially erroneous data. Additionally, many fields are populated from drop-down boxes, eliminating the chance for misspelled entries or entries outside the acceptable domain for the field.

Lesson: Organizations should emphasize that maintaining data integrity is an on-going process and everybody needs to play an active role. Maintaining data integrity does not stop with the implementation of the ERP system.

5.9 Using consultants

Care must be taken to ensure that if consultants are hired for the transition project, the internal stewards of the system understand their work. For example, in this company, consultants were responsible in large part for developing data structures for the new system, which form the system metadata. These structures are used in conjunction with raw data to define the context of the data and to ensure that data reported is consistent with what is intended or required. For example, a structure may define incoming business as the value of the selling prices on sales orders not including items that have been cancelled. Thus reports providing incoming business data will not include cancelled items. Direct involvement by the manager of the CSC and the marketing services staff throughout the development ensured that data structures defined by the consultants matched data the way users of that data defined it. This guards against the possibility that, once the consultants leave the project, the users of the system do not understand the data that is being processed by the system.

Lesson: Hiring consultants to assist with the ERP implementation is an effective strategy if organizations ensure that all work done by the consultants is understood and documented. The ERP implementation
knowledge should not leave the organization after the consultants' work is completed.

5.10 Post-ERP implementation

Counteracting and preventing dirty data - current perceptions and practices

Data entry procedures have been created to control the potential damage accruing from dirty data. For example, CSRs are required to place a production block on orders for some material types. This gives the product group marketing departments an opportunity to review the order and correct any errors before production begins. Taken individually, this procedure seems like a reasonable safeguard. Taken together with all of the other exceptions and qualifiers to the basic order entry procedure based on material type, or product line, and order type, though, the procedures begin to seem like the source of the errors rather than the way to avoid them. As the procedures grow more complex, the likelihood of entering data accurately and consistently drops.

In almost every department within the business, the increased complexity of performing the job has meant more time required to do the same work. This means even less time and attention to data integrity issues and more dirty data - where someone may have taken the time to find out what an error message means and to address the data entry error prior to the implementation of ERP, now they may pass the error without addressing it because of the work backlog.

Out of necessity, though, where data integrity is compromised, user involvement in testing and reporting procedures is increasing. The correction will begin with an investigation by system/support and/or marketing services of the problem. Then, reporting tools will be generated to find all instances of the error. Finally, users will be enlisted to implement corrections. CSR involvement in the corrections is critical because of their intimacy with the data and as a training tool - those repeating the error most frequently will have the most corrections to make. The more corrections that the CSR is required to make, the greater the likelihood that they will be able to avoid the mistake in the future.

Counteracting and preventing dirty data - areas for improvement

A systematic approach should replace the more reactive crisis management approach to data integrity. Data audits should include daily integrity checks within the system and regular audits performed by user groups. Problems uncovered in those audits should be shared with all affected parties. The causes, effects, and resolutions of those problems should be systematically documented and stored so as to be easily accessible to interested users. If a similar problem occurs, documentation of other similar instances would be readily available. Where necessary, the communication should be followed up by training.

Regular training sessions should also be scheduled to ensure that users understand data integrity concepts and methods. These sessions would not only build a shared interpretation of data and preferred processing methods, but would also foster a more global perspective on the part of the users - instead of seeing only their own role, users would see their role in the context of the business. This perspective would assist in paring down some of the current procedural complexity. Simpler procedures would further increase data accuracy.

Equipped with an understanding of the impact of their work on other areas of the business, users can be analysts rather than data entry clerks. Analysts can make good decisions in a complex and dynamic work environment. Broadly-trained analysts would also be in a position to work effectively with consultants, thus reducing our reliance on them.

Performance measures should be taken regularly to gauge the effectiveness of and to improve on the system and training initiatives. All of these measures would directly improve data integrity and would serve to underline the importance of data integrity to all users. These measures would reduce errors in carrying out tasks throughout the business and all their associated costs and help to draw a sharper picture of the business to improve long- and short-term decision-making.

6.0 Conclusion

Implementing ERP systems requires reinventing the business. Several legacy systems are integrated in the process into a single system for managing operations across the organization. Data that resided in dozens of disparate sources is now available through one integrated system for all users in an organization. To achieve success in ERP systems implementation, project champions should make sure that they address the relevant issues. Some of the key lessons from this study include, among others, the following issues:

• The champion of the ERP implementation project should ensure that the transformation is not viewed as an IT initiative, but rather as a business necessity.
This requires educating the stakeholders about the transition to an ERP.
• The champion for the change to ERP should recognize the value of data as an organizational resource and educate users about it. The issue of sharing corporate data and assigning responsibilities for managing it should be done with a view to avoid any political issues arising from owners of disparate data sources.
• The ERP implementation should be planned to prepare users for the change. The expectations based on new responsibilities should be outlined upfront to avoid any conflicts.
• The user community should be given time to accept the changes in their work environment to minimize the impact on organizational culture, such as overcoming comments like "we've always done it this way". Users should be encouraged to use the new system by providing incentives.
• All data that is migrated to an ERP system should be cleaned before the migration. Automated tools for data migration can be very useful for companies.
• Training users on a continual basis is very important. It is important that users do not get bogged down by activities that take up too much of their time.
• Extensive testing is required for implementing ERP systems. A good strategy would be to phase-in the implementation rather than making a direct conversion.
• Consultants experienced with ERP implementation can be very helpful. Care must be taken to ensure that all the work done by consultants is documented for future use.

In this paper, we listed and discussed issues pertaining to ERP implementation. Though implementation in different organizations can vary based on the organizational culture and business needs we feel that the lessons from this case study would be valuable for organizations planning to implement ERP systems.

References

Atre, S. (1998), "Beware dirty data", Computerworld, Vol. 32 No. 38, pp. 67-9.
Caldwell, B. and Stein, T. (1998), "New IT agenda", Information Week, No. 711, November, p. 30.
Ferriss, P. (1998), "Insurers victims of DBMS fraud", Computing Canada, Vol. 24 No. 36, 28 September, pp. 13-15.
Greengard, S. (1998), "Don't let dirty data derail you", Workforce, Vol. 77 No. 11, November, pp. 107-8.
Horwitz, A.S. (1998), "Ensuring the integrity of your data", Beyond Computing, Vol. 7 No. 4, May.
Kay, E. (1997), "Dirty data challenges warehouses", Software Magazine, October, pp. S5-S8.
Kilbane, D. (1999), "Are we overstocked with data", Automatic I.D. News, Cleveland, OH, Vol. 15 No. 11, October, pp. 75-9.
Knowles, A. (1997), "Dump your dirty data for added profits", Datamation, Vol. 43 No. 9, September, pp. 80-2.
Redman, T.C. (1995), "Improve data quality for competitive advantage", Sloan Management Review, Cambridge, Vol. 36 No. 2, Winter.
Ruber, P. (1999), "Migrating data to a warehouse", Beyond Computing, November/December, pp. 16-20.
Sellar, S. (1999), "Dust off that data", Sales and Marketing Management, New York, NY, Vol. 151 No. 5, May, pp. 71-3.
Stankovic, N. (1998), "Dual access: lower costs, tighten integration", Computing Canada, Vol. 24 No. 27, July, p. 30.
Tayi, G.K. and Ballou, D.P. (1998), "Examining data quality", Communications of the ACM, New York, NY, Vol. 41 No. 2, February, pp. 54-7.
Wallace, B. (1999), "Data quality moves to the forefront", Information Week Online, 30 September.
Weston, R. (1998), "Using dirty data", Computerworld, Vol. 32 No. 22, 1 June, p. 54.
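
The daily report described in section 5.6, which flags orders where more than 100 per cent of the order value is allocated as sales credit, could be sketched roughly as follows. This is a minimal illustration in Python and not the report used at the company: the record layout (`SalesCreditLine`), the function name `over_allocated_orders`, and the sample order numbers are all hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class SalesCreditLine:
    """One sales credit designation entered on an order (hypothetical layout)."""
    order_id: str
    credit_percent: Decimal  # share of the order's net value credited by this entry


def over_allocated_orders(lines, limit=Decimal("100")):
    """Sum the credit percentages per order and return those above the limit.

    Returns a dict mapping order_id to the total percentage allocated.
    """
    totals = {}
    for line in lines:
        totals[line.order_id] = totals.get(line.order_id, Decimal("0")) + line.credit_percent
    return {order_id: total for order_id, total in totals.items() if total > limit}


if __name__ == "__main__":
    # Hypothetical data: order 4711 had three items added after creation, and a
    # new 100 per cent designation was entered each time instead of activating
    # the existing one.
    sample = [
        SalesCreditLine("4711", Decimal("100")),
        SalesCreditLine("4711", Decimal("100")),
        SalesCreditLine("4711", Decimal("100")),
        SalesCreditLine("4711", Decimal("100")),
        SalesCreditLine("4712", Decimal("100")),
    ]
    for order_id, total in sorted(over_allocated_orders(sample).items()):
        print(f"{order_id}: {total} per cent of order value allocated as sales credit")
```

Summing per order rather than inspecting single entries matters here, because each individual designation looked valid on its own; only the total per order reveals the duplicate entry.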