Bloor managing-spreadsheets-paper


Published on

Published in: Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Bloor managing-spreadsheets-paper

  1. 1. Managing Spreadsheets (Version II) Author: Philip Howard Published: May 2006 a White Paper by
  2. 2. Managing Spreadsheets © Bloor Research 2006 Page Fast facts Spreadsheets represent one of the most popular applications on the planet. This is because they are the reporting and analysis tool of choice for many professionals, and because they support collaboration and information sharing. Moreover, this is not going to change, not just in terms of existing business people but, as our chil- dren are being taught how to use spreadsheets in school, this popularity is likely to continue for many years to come: spreadsheets are ubiquitous and will remain so. However, no matter the popularity of spreadsheets, they also, used improperly or incorrectly, or without sufficient control, pose a greater threat to your business than almost anything you can imagine. They can give rise to compliance issues because changes to data are not audited. They can also be used to aid and abet fraud, because security is not applied to conventional spreadsheets and, again, because there is no control over the ability to change data values. Further, it is easy to make mistakes in spreadsheets (for example, by entering an incorrect formula) that can mislead decision makers, the results of which can be very expensive. HM Customs Excise in its “Methodology for the Audit of Spreadsheet Models” says that “the complexity and functionality of spreadsheets has reached levels of sophistica- tion that few could have imagined… the consequent threat posed to businesses by such powerful ‘end user’ applications, mainly in the hands of untrained users, is immense”. A major cause of these problems is that spreadsheets are not treated as an enter- prise resource. For example, although there are (limited) security and auditing facilities in Microsoft Excel, these are not usually enforced. Indeed, because many users are self-taught they will not be aware that such facilities even exist. In the main, this is because spreadsheets are not perceived to be an IT resource but are seen to lie within the business domain. As a result, corporate security standards are not implemented for spreadsheets. On the other hand, the business is not aware of the potential dangers that the uncontrolled use of spreadsheets can cause. A major focus of this paper is therefore to make business users aware of these dangers so that they can push the task of managing spreadsheets into the hands of the IT department. In particular, it discusses the need for spreadsheet management, precisely in order to prevent, or at least minimise, the issues just mentioned. Having established the need for spreadsheet management solutions, this paper goes on to discuss what such a solution might look like. In fact, there are different approaches that may be taken, ranging from complete control (that is, you abso- lutely prevent people from doing what you don’t want them to) to complete monitoring with no control (that is, you monitor all changes but do not actively prevent any of them—rather like closed circuit TV). We will discuss the relative merits of these positions and when each of these might be most suitable (which will depend upon how spreadsheets are used and for what purpose). Unfortu- nately, at present there are few vendors offering relevant solutions, though there are rather more that purport to do so—we will discuss the types of facilities that any potential solution should provide. When we first published this white paper in 2005, there were no complete solu- tions available on the market. However, vendors are getting much closer now.
  3. 3. Managing Spreadsheets © Bloor Research 2006Page Nevertheless, in the absence of such a complete remedy, we will also consider how organisations can optimally manage their spreadsheet environments, both in terms of the management procedures that need to be implemented and with respect to those tools that are available. One point that we must make is that there is an inevitable congruity between the concept of spreadsheets on the one hand and Microsoft Excel on the other. Excel is, after all, the epitome of a spreadsheet application, and it is by far the most widely used. In general, where Excel is referred to in this paper it can be taken as a synonym for spreadsheet. Finally, spreadsheet management can be extended into the realms of spreadsheet automation. While this is not the principle focus for discussion in this paper, it is worth bearing in mind. Spreadsheets take an inordinately long time to produce. This is typically caused by the varying skill sets of the spreadsheet users but even the expert spreadsheet craftsman tends to spend too much time in tedious and repetitive tasks such as formula copying, formatting, workbook assembly, and distribution. From a business perspective, therefore, it will be useful if spreadsheet management solutions were to be able to provide automated facilities that could alleviate this repetition: with all the productivity gains that that implies.
  4. 4. Managing Spreadsheets © Bloor Research 2006 Page Spreadsheet problems There are five major problems with spreadsheets: the potential for errors, lack of security, the absence of an audit trail, the misperception that spreadsheets are not an enterprise resource, and productivity issues. We will consider each of these in turn before considering some other management issues related to spreadsheets. Error potential The following paragraph is excerpted from a PriceWaterhouseCoopers (PwC) report published on the use of spreadsheets and the Sarbanes-Oxley Act, in July 2004: “An article in the May 24, 2004 issue of Computer World indicated that, ‘Anecdotal evidence suggests that 20% to 40% of spreadsheets have errors, but recent audits of 54 spreadsheets found that 49 (or 91%) had errors, according to research by Raymond R. Panko, a professor at the University of Hawaii.’ The Journal of Property Management on July 1, 2002 stated, “30 to 90 percent of all spreadsheets suffer from at least one major user error. The range in error rates depends on the complexity of the spreadsheet being tested. In addition, none of the tests included spreadsheets with more than 200 line items where the prob- ability of error approaches 100 percent.’ Perform an online search for spread- sheet errors or spreadsheet audit, and you will find a number of major failures attributed to spreadsheet inaccuracies that hit the press in the past year alone.” This is not the first time that PwC has reported the errors inherent in spreadsheets. In earlier work, the company reported that in a survey of large client spreadsheets, it found that 90 per cent contained significant errors. More recently, KPMG Consulting has reported that 95% of the financial models that it reviews contain material errors. Note in particular the statement that: “in spreadsheets with more than 200 line items the probability of error approaches 100 per cent”. Of course spreadsheet errors may be more or less important, depending on the spreadsheet in which they appear and the purpose for which the spreadsheet has been created. However, research has been carried out to establish the impact of errors in spreadsheets on decision making. According to a 1996 report, the cost of these mistakes is within the range of $10,000 to $100,000 per decision per month. The European Spreadsheet Risks Interest Group (EuSpRiG) runs a web site (www. on which it lists the results of various spreadsheet errors. The following are just three of these: A. January 29, 2005 Mistakes happen during budget planning – “Gov. John Lynch’s budget team… has to find another $70 million to make its budget balance. Figures that the Health and Human Services Department provided to budget writers in the fall contained an error that double-counted more than $17 million of Medicaid money in each of the next two years. A detailed spreadsheet that HHS gave Ways and Means Tuesday morning showed that $35 million in a specific category of hospital reimbursements would come in
  5. 5. Managing Spreadsheets © Bloor Research 2006Page over the next two years. A second sheet HHS produced Tuesday afternoon showed no money in the category, reflecting the fact the funds can only be used one way—at the state hospital.” B. June 17, 2005 Natural gas consumers sue Dominion Transmission over clerical error – A Federal Energy Regulatory Commission investigation found that the subsidiary of Richmond, Va.-based Dominion Resources Inc. submitted the wrong week’s gas storage figures in November, leading to an artificial inflation of natural gas prices. The lawsuit estimates that consumer prices were hiked by between $200 million and $1 billion. “The investiga- tion concluded that it was not deliberate, but when I hear the words clerical error, I think of negligence,” plaintiff’s attorney W. Coleman Allen Jr., of Richmond, Va., said Friday. “Consumers were harmed the same as if it was intentional.” One explanation for the error was that the company had used the same computer file name for each week’s storage balance spreadsheet report, making it easy for the wrong one to be sent. C. September 12, 2003 ORLANDO Sentinel – Assistant County Manager Cindy Hall, in a memo to commissioners on Thursday, wrote that the April 22 study by Henderson Young Co. duplicated the cost of building a new elementary school. The extra $12 million cost was corrected on a spreadsheet in the study, but it wasn’t later adjusted on the total cost of school projects for the next five years. Jim Drake, director of finance for Lake County Public Schools, said: “It was basically a simple spreadsheet error. But obviously it’s going to have an impact on building new facilities.” County Commissioner Jennifer Hill said the consultant was given the final numbers for its study from the School Board only a few months before the study’s completion, “It was rush, rush, rush,” she said. Hill called the mistake “a simple mathematical error”. There are four types of potential errors in spreadsheets: 1. Errors in the data—these can occur through: a. Incorrect data entry—keyboard entry of data is to be avoided if at all possible. There are well-established error rates for keying errors, which are inevitable if data is to be entered manually. b. Incorrect specification—for example, you want data in a particular cell to reflect a database field called “cust1” but have inadvertently entered “cust2”. Similarly, if working with a front-end environment that supports Excel, you might have selected the wrong option from a drop-down list of relevant data sources. Of course, this is similar to incorrect data entry but it cannot be entirely eliminated. c. Incorrect definition—similar to incorrect specification. This occurs when you specify the wrong format for a field. For example, you define it as text when it should be a currency field. d. Incorrect placement—this differs from incorrect specification in that the data you have defined is correct, but you have put it in the wrong place,
  6. 6. Managing Spreadsheets © Bloor Research 2006 Page as opposed to the wrong data in the right place. Note that incorrect place- ment is not limited to single instances. You may reuse a particular value in multiple places within a spreadsheet or across spreadsheets and the data may be in the right place in some instances and wrong in others. e. Incorrect access—in spreadsheet reporting applications (as opposed to things like budgeting applications) it is often the case that data is loaded into the spreadsheet in some sort of automated way, either via an import from a CSV (comma separated value) file, or more directly from a query environment. In these cases there is the potential to address the wrong data source, or to perform an invalid transformation as the data is loaded into the spreadsheet. This problem can be exacerbated if timeliness is a major consideration, with lack of real-time access to data in spreadsheets driving users to deploy ad hoc query tools that exist outside of the spread- sheet environment. 2. Formulaic errors—that is, where a formula is incorrectly expressed. For example, you might have “x” instead of “+”, with appropriately dispropor- tionate results, or you might have added (or multiplied) the wrong columns. In the case of formulae there are basically three types of error: formulae that have been incorrectly worked out in the first place, formulae that are inaccu- rate because of keying errors, and problems with cloned formulae. In the last case, for example, it is all too easy to clone a formula designed to sum 10 cells and put it at the bottom of a column with 15 cells. Some, but by no means all, errors may be detected by the spreadsheet software. Microsoft Excel, for example, will display (if appropriately set up) a small green triangle with a pop-up comment if it suspects a formulaic error. 3. Macro errors—many spreadsheet users do not use macros, but for those that do this represents another major potential source of errors. Macros are, in effect, mini-programs and we all know how bug-ridden and error prone programs can be. While there is less scope for errors in macros, it remains a possibility that must be catered for. 4. Template errors—a template error relates to the misapplication of a template. It is common to use a template for recurring uses of the same spreadsheet, for example: uses that occur on a regular monthly basis. In such instances it is common to reuse the same base template and simply make relevant amend- ments for the month in question. However, this reuse is purely manual and is therefore an inherent risk. This is exactly what happened to Dominion Resources (example B): when the company applied the wrong month’s template which resulted in it under-representing gas reserves, leading in turn to artificially inflated consumer prices. There are of course a variety of other mistakes that you can make when using a spreadsheet. You can position columns inappropriately, it is possible to use inap- propriate graphical methods for particular data sets, and so on, but these are mistakes rather than errors. What is important about errors is that they give you misleading information that can lead to poor decision making, which in turn costs money: often lots of it.
  7. 7. Managing Spreadsheets © Bloor Research 2006Page Security There is not a lot to discuss about spreadsheet security because there isn’t any. Actually, that isn’t quite true: Microsoft Excel does, in fact, have a password facility though it is honoured more in the breach than the observance. In prac- tice, anyone can open up a spreadsheet, change the data to their heart’s content and amend formulae. Anyone of malicious intent can deliberately induce errors in spreadsheets either because they have a grudge against the company or in order to support any fraudulent activity that they may be indulging in, or simply to gild the lily with respect to their own performance. To take a simple example, you cannot go into your company’s General Ledger and gaily change the figures therein: the software and its security will not let you do that. However, you can extract the data from the General Ledger into Excel and then you can change that data as much as you like. We cannot believe that such a laissez faire attitude to corporate data makes sense. The other big problem with spreadsheet security is that there is little or no user- level access control (there will be in Excel 2007). At present, even if you are one of the rare few that use passwords, once data is in a spreadsheet you can see all the data that is in it: you cannot then limit who sees what information within a given spreadsheet. This would not be acceptable almost anywhere else within your business. For example, you may let a manager see details of his department’s total salary expenditure but you wouldn’t let him or her see the salary information for each individual employee, but that is exactly what you can do using spreadsheets. You can, of course, hide data but then it is hidden from everybody (an invita- tion to fraud if ever there was one) but that doesn’t get you any further forward since what you need to be able to do is to allow visibility to what individuals are allowed to, or need to see, but not what they are not permitted to see. In other words, what is needed is full role-based security so that people can only see what they have the right to see. Auditing The third major issue with spreadsheets is with respect to auditing. However, like security, there is not much to discuss because again, while there is some capability, it is rarely used. In practice, you can log changes to a spreadsheet into a separate worksheet but this only applies on an individual spreadsheet basis rather than across the whole spreadsheet environment. In other words, in most instances, not only can you not prevent someone from changing the data in a spreadsheet, for whatever purpose, but you also have no way of knowing who changed the data, when he or she changed it, or what the change consisted of. Indeed, you can’t even tell that a change has been made at all! In a way, this is much more serious than a lack of security (though the two go hand-in-hand in encouraging fraud) because it undermines any compliance or governance regulations that may be in place. As a general statement it would be fair to say that: At the time of writing Microsoft Office 2007 is currently in beta testing and it will be released within a matter of months. The Office 2007 version of Excel will have significantly enhanced security and auditing features. How- ever, these are not complete—there is no cell level locking, for example. Further, since most users of Excel are self-taught, it is likely that they will remain unaware of these new fea- tures and continue not to use them. Thus, while Excel 2007 holds out the promise of improved spreadsheet management we doubt that its new features will be deployed widely with- out organisations taking some of the actions recommended in this paper.
  8. 8. Managing Spreadsheets © Bloor Research 2006 Page Any company that is subject to Sarbanes-Oxley, IAS/IFRS or similar regula- tion, which uses spreadsheets for any purpose beyond very limited reporting, and does not use spreadsheet management, will be unable to comply with the strictures of those laws. This is a pretty strong statement. But Sarbanes-Oxley requires companies to be able to justify what has happened to the data it presents in its corporate accounts and how it got there. If a spreadsheet is involved at any point in that process then, unless appropriate controls are in place (spreadsheet management), you will have a breakdown in the data chain where you cannot certify what has happened to the data. Note that it is by no means impossible to use spread- sheets within a compliant environment—but it requires management. Micro- soft, for example, makes extensive use of Excel spreadsheets in its own internal compliance procedures. However, the key point is that Microsoft does use the management facilities in Excel and they are surrounded by appropriate addi- tional procedures. In fact, the issue is even worse than this. Spreadsheets are often passed from one user to another and the latter may well use the information in the original spread- sheet as a data source for spreadsheets of his or her own. Again, there is no way to track that this has happened. You can, in fact, prevent it from happening (you can lock the spreadsheet so that it cannot be forwarded, edited or even printed) by using Information Rights Management software, but this doesn’t help you to monitor what happens subsequently if you actually want to enable this sort of functionality. Finally, Excel allows data to be consolidated from a number of sources into a worksheet. By default, it shows results but no formulae, with the consolida- tion taking place in memory. From an auditing and compliance perspective, this default setting should never be used as the data cannot be tracked and there is a high risk of error. Spreadsheets as an enterprise resource It should be clear that spreadsheets, or at least some of them, are vital to organ- isational well-being. In particular, those spreadsheets that are used to inform important decision-making processes, are used for financial and other corporate reporting, or are to be used in customer or other third-party presentations, need to be treated just like any other corporate asset. In particular, just as you would not implement a new application without testing that it did what it was supposed to do, all corporate (if not personal) spreadsheets should be tested prior to deploy- ment. That is, sets of figures should be run through the spreadsheet to ensure that there are no formulaic, macro, placement, access or specification errors anywhere within the spreadsheet. In other words, if it is accepted that spreadsheets represent a corporate resource, then all spreadsheets that do anything more than very simple reporting should be subject to a quality control process to ensure accuracy. If, on top of that quality control, you can implement spreadsheet management procedures (especially
  9. 9. Managing Spreadsheets © Bloor Research 2006Page simplified data access) then you will be going a long way towards eliminating costly mistakes from your spreadsheets. The main reasons why spreadsheets are not recognised as being a corporate resource are varied. In the first instance, they are often simply dismissed as not really being a critical asset or as not being suitable for investment (for user training or process assurance). In our view this is clearly a mistake. The second problem is that spreadsheets are used in different ways and by different people. For example, there are what we might term “data collection” activities such as budgeting, which is driven by the finance department in this particular case, or by other relevant departments for different applications of this sort. Secondly, we have “spreadsheet reporting”, which is owned by operational groups. In effect, spreadsheets are treated as siloed applications, each of which is owned by its own clique, none of which relate to the IT department. This leads in turn to a third problem, which we might characterise by saying that while there is a technology gap in the sense that there is an inadequate environment provided for managing spreadsheets, there is also a cultural gap which means that users do not even employ the security and auditing capabilities that are provided. The bottom line is that there is a lack of ownership of spreadsheets as a whole, with no-one in the organisation being seen to be responsible for their use within the corporate structure. This needs to change if spreadsheets are going to be prop- erly managed. Productivity issues Finally, another major issue is the time taken to manage existing spreadsheets. Even today, when these are not recognised as important enterprise resources, indi- vidual users have a considerable management effort involved in managing their own spreadsheets—they may need to discover the location of relevant source data, they may need to extract information from previous versions of a spreadsheet and perform reconciliation procedures, they may need to distribute their spreadsheets to colleagues (which raises the possibility of errors in distribution lists), and they will (we hope) be taking back-ups on a regular basis. We are also aware of environ- ments where people have simply stopped using spreadsheets and moved to statis- tical packages because of the difficulty of managing complex spreadsheets. Moreover, note that we are not talking cheap people here. Enterprise management issues While the issues discussed above represent the main reasons for implementing a spreadsheet management application, there are potentially a number of other management issues surrounding the use of spreadsheets, which it would be useful to be able to handle. For example, it is often the case that spreadsheets exist within hierarchies and it would be useful if it was possible to easily view and maintain those hierarchies. The people who use spreadsheets are typically line managers on the one hand and business analysts on the other. These are expensive personnel and it is wasteful for them to have to do these routine tasks when such processes could be automated.
  10. 10. Managing Spreadsheets © Bloor Research 2006 Page Another issue that commonly arises is that spreadsheets are used to extract data automatically from a single source database, but what you actually want is to combine data from various data sources. Management issues around this sort of scenario would be greatly simplified if you could access multiple data sources from a single spreadsheet design, and assemble individualised workbooks that tailor their content to user roles and permissions, as you can with some vendors. A further complication is that most large organisations have probably thousands, if not tens of thousands, of spreadsheets distributed across the enterprise. Not only are these uncontrolled, they are unknown and not automated. Moreover, it is inevitable that much of the functionality embedded in these spreadsheets is dupli- cated which, in itself, is wasteful. A tool that can discover existing spreadsheets and bring some or all (according to user preference) of these into a single manage- ment structure will be especially useful for ongoing administration. Indeed, it is arguable that a product that only supports the management of new spreadsheets and has no facilities for bringing old spreadsheets into the new environment is only doing half the job, if that. In particular, a managed spreadsheet environment should be able to assimilate existing contents, formulae and macros into a new design and automation paradigm in order to evolve the environment smoothly. Further, it would be useful to have comparison capabilities through which you could automatically compare different spreadsheets to ascertain where you have duplications or near duplications. In the latter case, a visual comparison or differ- ence facility would be useful (similar to those provided in application develop- ment and change management environments) along with a merge facility. Finally, you would like to be able to consolidate all of your different spreadsheets into a centralised, IT-run environment. As a part of this process it will make sense to rationalise the spreadsheet environment. This would typically be on hierar- chical lines, using the principle of inheritance (that is, lower level spreadsheets inherit the characteristics of higher level ones—ideally, based on versioning). However, as spreadsheets are dynamic objects it will not make sense to instantiate the spreadsheets except as and when requested to do so by an authorised user. Thus what is required is a system in which the relevant metadata such as spread- sheet format, formulae, macros, data locations (including cached data) and so forth are stored as a part of this process. By maintaining this master data on the server, IT could maintain an auditable master object and oversee distribution of spreadsheets generated from this object.
  11. 11. Managing Spreadsheets © Bloor Research 2006Page 10 Spreadsheet management options Third party vendors relate to Microsoft Excel in a variety of different ways. At the lowest level, suppliers simply provide the facility to export data into a spread- sheet or, a slightly more advanced offering, the ability to print a formatted, static spreadsheet, where data values and formats are saved as an Excel file. In either case, this is usually done for one of two reasons: either to rectify some defi- ciency in the vendor’s offering (such as a lack of graphical capability) or simply because customers like to be able to play around with the data in Excel. In either case, while there is some sort of guarantee that the data was accurate when it was initially loaded into the spreadsheet, all bets are off once the data has been exported (including, potentially, the introduction of new errors) and the users begin construction and design. A more sophisticated approach is adopted by some vendors which offer an Excel plug-in. The intention here is to provide direct access to the data sourced from the business intelligence environment and to (possibly) lock-down data values, such that these are dynamically related back to the source and cannot be changed. However, this does not prevent specification or placement errors, nor eliminate errors in either formulae or macros. Moreover, because it does not provide either security or auditing within the spreadsheet environment there is a higher likeli- hood of misplaced trust in the reuse of these files and therefore increased opportu- nity for template errors. Moreover, it is always possible to copy the data from the spreadsheet into another one (on a laptop, say), amend the data and then create a new spreadsheet on the main system. Similarly, you can also e-mail a spreadsheet to a colleague and, again, there is no control over what he or she can do with the data. In other words, this sort of solution has only limited value and does not prevent abuse. An alternative adopted by some suppliers is to encapsulate and embrace spread- sheet capabilities into their own environment. This may be based on the fact that they have duplicated the Excel environment within their own system, or they may have licensed Excel and embedded it. In either case, the effect is that the spreadsheet is plugged into the vendor’s application as opposed to plug- ging the application into Excel, as discussed above. The advantage is that the whole environment is as well controlled as any other facility provided by that supplier. In addition, these sorts of products often provide a facility to automati- cally schedule, manufacture and distribute pure Excel spreadsheets to consumers, which should improve productivity and reduce construction and distribution errors. It also introduces the opportunity to deliver personalised views of the data within the spreadsheet. Further, in rare cases, these spreadsheets can be generated from templates or master spreadsheet designs (sometimes referred to as a spread- sheet blueprint) that manage data queries, workbook layout and assembly, the abstraction of recurring formulae and the inclusion of business macros. The net effect of this sort of approach is that every spreadsheet generated from a blueprint contains only as many errors as the original design, which of course increases the burden for testing and the application of proper design techniques. As we know, however, errors eliminated during design are significantly less costly than those identified later.
  12. 12. Managing Spreadsheets © Bloor Research 2006 Page 11 However, Excel is likely to be much more widely used in any organisation than any business intelligence product. After all, 150 million licensed copies of Micro- soft Office exist in the world, and the estimation of unlicensed use could be two or three times that. In other words, the approaches just discussed only address that tiny corner of the spreadsheet problem that is included in the business intelligence provider’s solution and it does not cover anything else. What is needed is a solution that spans all corporate spreadsheet resources. In the first version of this paper we focused on two elements within such a solution: one with respect to the management of spreadsheets in general and the other being more focused on the detection and correction of errors. However, it is now clear that the first of these categories, spreadsheet management, at least as far as vendors are concerned, has two sub-classes of solution: one that is focused on establishing control over the existing spreadsheet environment and the other emphasising the provision of management for new spreadsheets with an implicit assumption that existing spreadsheet resources will be re-developed within the new framework. We discuss each of these approaches in turn. Management solutions The first possible approach to a full spreadsheet management application is to start at the beginning and introduce what we might call “the complete control solu- tion”. The idea behind this sort of approach is that you will fully control every- thing that is done within the spreadsheet environment. Using a role-based security system, you will apply security at all levels, from the queries against the original data sources, to the files and directories accessed by users all the way to locking down changes at the cell-level, so that only authorised personnel can change data, and that that is audited and logged when it occurs. Similar strictures can be applied to formulae and other facilities within the spreadsheet environment. This approach is what we might also describe as the “go forward from here” method in that it is focused on the design, development and control of new spreadsheets as opposed to those that already exist. The alternative approach is to contain and monitor current activities. This is the so-called “closed circuit TV” (CCTV) option as provided, for instance by Cluster­ Seven ( In this approach, everything that you do is logged, every change to a macro or a formula, every change to the data, who did it and when. However, no attempt is made to prevent anybody from making such a change or, as a result, introducing the variety of error types mentioned earlier. In other words, this is auditing without security. Products in this category cover both existing and new spreadsheets. There are products in both of these areas of spreadsheet management and in the original version of this paper we reported that the containment and monitoring approaches were more mature than the former, largely as a result of the fact that auditing is not as onerous as design, development, and controlled distribution. In particular, the CCTV approach was already able to discover and manage all existing spreadsheet resources, something that is fundamental if you are attempting to manage resources that are already in place.
  13. 13. Managing Spreadsheets © Bloor Research 2006Page 12 However, it is no longer the case that monitoring and auditing is more mature than complete control solutions. In particular, products in the latter category, such as Actuate e.Spreadsheet ( have now added functions such as the definition and generation of multi-worksheet workbooks, versioning of spread- sheets and their original designs, macro support capabilities and the easier construc- tion and management of design templates that generate pure spreadsheets. Which approach to take will depend on individual preference but it seems likely that in the longer term most companies will want to have both complete control and auditing, rather than just the latter. Indeed, the CCTV market is primarily intended for users that deploy very advanced spreadsheet applications for mission- critical purposes (for example, in financial trading: if this stock moves by this much, then buy/sell). In these sorts of environments, spreadsheets are already recognised as a vital corporate resource, are fully tested prior to deployment, and are maintained and developed by qualified personnel only. So, although they might achieve further benefits from certain automation techniques, it is arguable that “complete control” is less of an issue as it is already employed in these sorts of environments. For most companies, it is likely that a complete control solution will be most appropriate. Requirements for a solution The following table shows the major features that we would like to see vendors provide in spreadsheet management solutions. We have divided these into “must- have” and “advanced” facilities, where the former are essential and the latter would be nice to have. Must-have features Advanced features Role-based security from query to file to spreadsheet elements, all the way down to the cell level Locking: at the spreadsheet level, for data down to the cell level, and for objects including formulae and macros Full audit trail for all changes, including macros Auto-discovery capability for existing spreadsheets Management and control of distribution and scheduling Spreadsheet hierarchy management Support for IT-based testing of formulae and procedures Where-used capabilities so that you can track the use of data across spreadsheets Template management Federated availability for heterogeneous data source access Version control: comparison, difference and merge capabilities Smart, server-side spreadsheet objects that contain layout definitions and query results that are ready to be personally delivered at view time While most of these requirements should be self-explanatory, or have already been discussed in some detail, it is worth briefly discussing template management, as this is a new requirement that we have added in this version. What we mean here is the ability not merely to manage but also to automate the use of templates so that if, for example, you use templates for monthly spreadsheets then the details for each month will be automatically generated for you so that the potential for using the wrong template is eliminated. The underlying templates that are being reused in this way are sometimes referred to as blueprints.
  14. 14. Managing Spreadsheets © Bloor Research 2006 Page 13 Dealing with errors There are two approaches to errors: one is to prevent them occurring in production spreadsheets and the other is to detect and correct them when they do occur. In the latter case, there are a variety of applications available for detecting and correcting errors in single spreadsheets (for example, XLSpell from Sheetware, see www.sheet- Indeed, Microsoft also provides a (limited) number of facilities within Excel for helping to identify errors, such as the ability to calculate nested formulae one step at a time, to trace relationships between formulae and cells, and to watch a formula and its result in a cell. In other words, there are some features to support the testing of spreadsheets as they are being built, though one would not say that these were the equivalent to the sort of testing that would be standard for applica- tions designed and developed by the IT department, for example. However, as with all software applications it is much more efficient (and less expensive) to prevent errors rather than to attempt to detect them after the event. Indeed, HM Customs and Excise states in its “Methodology for the Audit of Spreadsheet Models” that “detailed testing can be extremely laborious” even when using the software that it supplies for this purpose (SpACE, see www.lexisnexis.—and remember that you pay for this auditor’s time. Best practices guide for building spreadsheets and preventing errors It is worth going into some detail with respect to this HM Customs and Excise report. It suggests that the auditor start by assessing the risk that is associated with each spreadsheet and to concentrate upon the spreadsheets that have the greatest implications for the business. This only makes sense. It then goes on to recom- mend that the auditor assess the degree of risk associated with each spreadsheet. It is worth reporting what the methodology has to say with respect to this: “If the developer does not fully understand the business, there is a high risk of errors in the logic and design of the spreadsheet.” “Are the areas for input of raw data segregated from the computational areas?” “Is there a separate sheet containing a table of contents and a description of the purpose of the model?” “What evidence of testing and other documentation exists?” “If testing was thorough, the risk of undetected error is lower. If testing of the initial model and/or subsequent amendments was sketchy or non-existent, the risk of error is much higher.” “You must consider the adequacy as well as the mere fact of testing as evidence that the model or application presents a low risk of error.” “Has the developer documented the spreadsheet, to make clear: what it’s for; what it does; how it does it; what assumptions were made in its design; what constants • • • • • • •
  15. 15. Managing Spreadsheets © Bloor Research 2006Page 14 are used and where they are held; who developed it; when; when and how it has been changed since being brought into use; the presence and purpose of any macros?” “The better the documentation, the less scope there is for error or misunder- standing between the developer and the user.” “A good practice in design is to include the documentation as part of the work- book on a separate sheet.” “Again, consider the quality as well as the existence of documentation.” We make no apologies for quoting from this at length as it effectively provides a best practice guide for building spreadsheets and for preventing errors. Moreover, in our view the sort of structured approach that is recommended for developers should be followed even if the developer and user is one and the same person. The guide goes on to suggest that if the spreadsheet passes these criteria then it should need no more than a routine audit rather than the detailed (and extremely labo- rious) testing mentioned above. In other words, this planned approach substan- tially reduces the likelihood of error. Consider, however, the impact of employing these recommendations within an environment in which spreadsheet design is abstracted, as discussed previously, and not simply held within each spreadsheet. In essence we view the creation of a spreadsheet model or blueprint as the ideal method of applying these principals due to the fact that every spreadsheet generated from a design blueprint is guaran- teed to be as error-free as the original design. In addition, design abstraction offers other unique capabilities that are not possible when working within Excel, such as multi-sheet workbook definition, multi-dimensional summary table aggregation, formula and macro reuse, and dynamic control over multi-source query results. Obviously, the notion of building and maintaining a single design that serves thousands of spreadsheet consumers is very appealing. What you should do We said at the beginning we would discuss hybrid solutions. You could argue that there is no such thing as a hybrid: you have full control or you don’t, you have auditing or you don’t, though it would be feasible to combine a CCTV (for existing spreadsheets) and a complete control approach (for new ones). In any case, companies need to do something about spreadsheet management, either in conjunction with the sort of tools that we have already discussed or as a stand- alone exercise. The steps that organisations need to follow include: Identify all the spreadsheets in your organisation: who owns them, what they are for, how widely they are distributed and so on. Prioritise these according to their importance to the enterprise both in terms of their impact on corporate strategy and their scope for aiding and abetting fraud. • • • • •
  16. 16. Managing Spreadsheets © Bloor Research 2006 Page 15 High priority spreadsheets should be individually audited for design correct- ness, tested, published and generally managed by the IT department. Version control will be an advantage and the automated generation from design templates will eliminate many classes of errors (data, formulaic, macro, and template) and provide a strong framework for addressing the other manage- ment issues highlighted herein. Medium (as well as higher) priority spreadsheets should at least be server- based so that there is some form of central control. Where feasible store Excel data in XML format, so that you can validate fields and enforce integrity through use of an appropriate XML schema. This may not be necessary if using a complete control approach or if the complete control solution not only stores designs but also intelligent spread- sheet master objects on the server. The password and auditing facilities supplied within Excel should be used for all sorts of spreadsheets. Control over the use of macros (digital signatures and the use of trusted publishers) should be encouraged. Default settings that make calculations invisible should be turned off. Hiding of data should be discouraged. You may wish to treat older versions of spreadsheets differently from current ones. Clearly, there are fewer user obstacles to be overcome when applying security to the former. Versioning of spreadsheets is also something that you may want to explore as well as methods of evolving older spreadsheets into newer, server-controlled ones. Publish best practise guides for users (based on the HM Customs Excise model above)—there are many features of spreadsheet applications that users are simply not aware of. No doubt there are many users that would imple- ment passwords if they knew about it. Publish spreadsheet designs including formula definitions, query definitions, worksheet formats and all assumptions that make up the design. Consider the implementation of information rights management software so that you can limit the use of published spreadsheets. This is by no means an exhaustive list and the implementation of these techniques will not take the place of either the management solutions we have discussed or the need for error detection and correction. However, implementing a policy for managing spreadsheet management is the first step that you need to take, and the points above represent some of the basic things that you need to consider. • • • • • • • •
  17. 17. Managing Spreadsheets © Bloor Research 2006Page 16 Conclusion In an ideal world one might design a new spreadsheet paradigm, based on appro- priate industry standards, where all facilities were encapsulated into other query, reporting or planning environments, which provided the necessary security and auditing capabilities. Unfortunately, we do not live in such an idyllic setting: spreadsheets as they currently exist will continue to be used in their millions and any attempt to move users to another environment is doomed to failure. The chal- lenge is therefore to provide as close to this type of functionality as possible, while at the same time offering comprehensive management capability yet without removing the obvious benefits to end users. If we accept the arguments outlined above for spreadsheet management (and we believe these to be overwhelming) then control, security and auditing must be imposed externally, without impacting on the user’s ability to use Excel (or what- ever) as he or she sees fit. As we have discussed, this can either be done through the provision of auditing on its own (where security is not considered an issue) or by a combination of complete control and auditing. There are few such solutions avail- able on the market at present, but we can expect to see more as time goes by. In particular, there has been a growth in plug-in approaches to spreadsheets. We regard these as inadequate: on the one hand they offer auditing and security but are not as rich in their capabilities as complete control solutions (they do not play in the CCTV market), while on the other hand they are limited to siloed environ- ments such as business intelligence or planning and budgeting but do not span the entire enterprise—which is precisely what you don’t want. Moreover, these are bound to installations within particular, vendor-supported versions of Office and therefore have limited deployment possibilities. The emphasis for any product selection policy should be to ensure cross-functional and cross-application capa- bility. While it is impossible to remove entirely the possibility of errors occurring in spreadsheets, it is possible to greatly reduce their likelihood. This can be accom- plished in two ways: first, by treating spreadsheets as enterprise resources that need to be properly tested and checked prior to deployment and, secondly, by using tools that simplify the spreadsheet environment both with respect to heteroge- neous data access and the use of automation (for example, in generating spread- sheets from design templates)—reduction in complexity paired with design-driven automation should lead directly to a reduction in error rates. To conclude: while evolving, this is still an emerging market and there are vendors that are taking different approaches. However, we would argue against plug-ins and clearly favour either or both of the CCTV and complete control methods outlined. Which will best suit your company’s requirements will depend on your circumstances but what is certain is that you should be considering spreadsheet management as a matter of urgency.
  18. 18. Philip Howard Research Director—Technology Philip started in the computer industry way back in 1973 and has variously worked as a systems analyst, programmer and salesperson, as well as in marketing and product management, for a variety of companies including GEC Marconi, GPT, Philips Data Systems, Raytheon and NCR. After a quarter of a century of not being his own boss Philip set up what is now P3ST (Wordsmiths) Ltd in 1992 and his first client was Bloor Research (then ButlerBloor), with Philip working for the company as an associate analyst. His relationship with Bloor Research has continued since that time and he is now Research Director – Technology. In addition to his analyst management respon- sibilities Philip continues to manage his own practice area for Bloor Research, which focuses on anything to do with data: databases, data management, data integration and data warehousing (including business intelligence and corporate performance management). He also has an historic interest in the areas of applica- tion development and enterprise portals, amongst others. In addition to the numerous reports Philip has written on behalf of Bloor Research, Philip also contributes regularly to and www.IT-Analysis. com and was previously the editor of both “Application Development News” and “Operating System News” on behalf of Cambridge Market Intelligence (CMI). He has also contributed to various magazines and published a number of reports published by companies such as CMI and The Financial Times. Away from work, Philip’s primary leisure activities are canal boats, skiing, playing Bridge (at which he is a Life Master) and walking the dog.
  19. 19. …optimise your IT investments Suite 4, Town Hall, 86 Watling Street East TOWCESTER, Northamptonshire, NN12 6BS, United Kingdom Tel: +44 (0)870 345 9911 – Fax: +44 (0)870 345 9922 Web: – email: Copyright Disclaimer This document is subject to copyright. No part of this publication may be repro- duced by any method whatsoever without the prior consent of Bloor Research. Due to the nature of this material, numerous hardware and software products have been mentioned by name. In the majority, if not all, of the cases, these product names are claimed as trademarks by the companies that manufacture the products. It is not Bloor Research’s intent to claim these names or trademarks as our own. Whilst every care has been taken in the preparation of this document to ensure that the information is correct, the publishers cannot accept responsibility for any errors or omissions.