How Auditable Is Your Disaster Recovery Program?



20 October 2010 | ID:G00208004
John P Morency

We discuss best practices for preparing for and managing an IT disaster recovery management audit.

Overview

For reasons often related to regulatory compliance as well as senior-management directives, audits of disaster recovery readiness are becoming increasingly common. Understanding what is and is not important while the audit is being conducted can help ensure a far more favorable outcome.

Key Findings

  • IT disaster recovery management (IT-DRM) audits are based on the same general control testing principles that are used in IT security and privacy audits.
  • While IT-DRM control definition is a necessary prerequisite, it is by no means sufficient. Tangible evidence of disaster recovery control execution and management will also be required by the auditors.
  • A documented recovery plan, a recovery exercising (i.e., testing) plan and clear evidence of plan exercising are the minimum set of controls that should be in place prior to an audit.
  • An SAS 70 Type 2 audit commissioned by external service providers may not necessarily be an acceptable substitute for having your own auditors perform provider control testing.

Recommendations

  • Ensure that both you and the audit team have a clear, agreed-on understanding of the IT-DRM testing scope prior to the beginning of the audit.
  • Never assume that having only a recovery plan, without any additional evidence of plan exercising, will be sufficient for ensuring a favorable testing outcome.
  • If you do not have supporting evidence of past recovery plan exercises, invite the auditor (typically a member of the internal audit team) to be a neutral observer during your next test. In addition, make every effort to ensure that documented evidence of test planning and execution, as well as test improvement through the application of lessons learned, is an integral part of future testing.
  • If you use services from one or more third parties to support your IT-DRM program, ensure that your provider's SAS 70 Type 2 testing results are sufficient and up to date to meet your needs, or work out arrangements with your provider to have your auditor perform controls testing at the provider's service delivery facility.
Analysis

Introduction

Managing an audit of IT-DRM constitutes a significant challenge for IT audit professionals and IT operations managers. In contrast to regulatory compliance, infrastructure security and application controls, little formal guidance and few best practices or automated audit tools are available. More often than not, it is difficult to even determine the scope of an audit in terms of time and effort. Much of this is because there are still very few professionals who possess the combination of disaster recovery and audit practice knowledge that is required to be effective.

Further complicating this challenge is the fact that several IT-DRM service options are available to an organization, including the use of shared data centers and equipment, dedicated colocation space, mobile recovery trailers, and recovery "in the cloud" (also referred to as recovery as a service [RaaS]). An organization's use of one or more of these services in support of its IT-DRM program will require the audit to cover both internally and externally managed supporting controls.

IT Audit 101

An audit is a process by which a set of management controls is verified as being in place, operating correctly and consistent with organizational risk management objectives. A management control is a process and/or technology that is implemented to prevent, detect and/or correct actions or events that could adversely impact operations risk. The verification process consists of individual tests that are designed to ensure that the controls work as advertised. Testing is performed by a qualified, neutral third party that is typically an internal auditor or a qualified external auditor. These are the parties that will define and document the formal attestation as to whether a control is effective or deficient.

There are two types of controls: manual and automated.
A manual control is either some form of document or an executable procedure that is performed by a person rather than executed by one or more pieces of software. In contrast, an automated control is one in which the control mechanism is entirely supported by software execution. Because software execution is deterministic, only one test iteration is normally required for an automated control, whereas the test frequency for the verification of manual controls must be statistically determined, because consistent, reliable execution of a manual control cannot be guaranteed.

A simple example of a manual control is a documented internal process for change control, the implementation of which is designed to ensure the proper qualification, testing and implementation of all infrastructure, application and data changes into production operations. In contrast, an example of an automated control is identity and access management (IAM) software-based end-user provisioning, which automatically permits or denies access to corporate applications and data based on an employee's identity credentials.

There normally are four different types of control testing performed by auditors. From weakest to strongest, these test types are:

  • Inquiry of a member of the audit subject's IT team regarding the existence and correct operation of the control (typically, this does not include related evidence review)
  • Examination of the supporting process definition and execution evidence documentation
  • Observation of a member of the audit subject's IT team executing the control and generating evidence documentation
  • Execution (also called reperforming) of the control and verification that control execution yields the required evidence

Regardless of the test type used by the auditor, an essential test result must be the generation of one or more forms of documented evidence of control operation. This evidence may include notes that are handwritten by the auditor, procedure definition documentation, report documentation, spreadsheets, control screenshots and software-generated output, which may include software-specific reports and logs.

In our change control example, the documented change management process alone is not sufficient evidence that the process is consistently followed. Supporting evidence, such as documented change control review meeting minutes, evidence of preimplementation testing and production turnover logs, must also exist to ensure that the control is being implemented and managed as intended. In the IAM example, auditable logs that record every setup and change made to individual user credentials and access rights are produced by the identity and access manager. These logs can be used as is by the audit team to ensure that management implementation aligns with documented security policy. The reason for this is that the operation and outputs of any form of management software are assumed to be highly reproducible and consistent, thereby eliminating the need for statistical, sample-based testing.

What IT-DRM Controls and Evidence Need to Be in Place?

In preparation for a disaster recovery audit, IT managers should ensure that management controls are in place for the definition and/or management of:

  1. The events that would have the most disruptive impact on production data center operations. The event impacts most frequently tested by Gartner clients include partial or complete data center equipment failure, loss of electrical power, hurricanes, floods and earthquakes, as well as the total loss of service from one or more third-party providers.
  2. The specific financial impact that these events could generate.
  3. A prioritized set of business processes, applications and data recovery tiers.
  4. Relationships between the recovery tier definitions and their related financial impact on the business.
  5. A documented recovery strategy that supports the prioritized implementation of the people, process and technology resources required to support each recovery tier.
  6. A testable recovery plan that includes:
       • The composition of the IT emergency response team
       • Detailed, team-member-specific tasks and deliverables (pretest, during the test and post-test)
       • Specific damage assessment and response escalation procedures
       • Disaster declaration procedure checklist
       • In-scope equipment, software and external services inventory
       • External third parties that must be contacted following a declaration, including key vendors and service providers
       • Prioritized software recovery and data activation procedures that occur at the recovery site
       • Failback procedures from recovery to normal production operations
       • Restoration procedures for returning to normal production operations once the disaster is over
  7. Documented test scripts that clearly exercise the recovery plan in addition to containing clear and measurable pass/fail criteria.
  8. Documented evidence of recovery exercising completion times consistent with formally defined recovery time objectives (RTOs) and recovery point objectives (RPOs).
  9. A testing postmortem process for assessing actual results, defining lessons learned, identifying steps for improvement and reporting of the same to key stakeholders, including senior management.

Remember that having the control documentation in place only addresses part of the auditor's requirements. Evidence of consistent control operation is far more important to ensuring a desirable audit outcome.

If you do not currently have all these controls in place and need to prioritize their definition and implementation, ensure that, at a minimum, items 6, 7 and 9 in the list above are in place. You should never assume that having a documented plan alone (regardless of how recently that plan was updated) will be sufficient. Ensuring that evidence of plan execution is in place should always be a critical audit prerequisite. If this evidence does not exist for whatever reason, you always have the option of inviting the auditor to examine your recovery scripts and observe your recovery plan exercising. Note that the auditor cannot be a participant in the recovery testing, because that would constitute an immediate and obvious conflict of interest.

When the Audit Scope Includes Third Parties: Let the Organization Beware

While challenging, an IT-DRM audit of a program that is self-managed (meaning that the recovery facilities, infrastructure and program management are managed by in-house resources) is far less complex in scope than an audit of a program that utilizes the services of one or more external providers (for further details on IT-DRM sourcing options, see "IT Disaster Recovery Sourcing Considerations"). Currently, Gartner estimates that at least 70% of client disaster recovery programs depend on one or more external service providers for recovery plan exercising and execution.
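
Whether the program is self-managed or relies on providers, the nine-item control list and the minimum set (items 6, 7 and 9) discussed earlier in this section can be turned into a simple readiness self-check. The following is a minimal sketch in Python, not part of any Gartner or audit-standard tooling. The control numbers follow the list in this note, but the short labels, the `ControlStatus` fields, the function names and the sample figures are all hypothetical. The sketch also assumes that a minimum control counts as met only when it is both documented and backed by evidence, which is one reading of the guidance rather than a stated rule.

```python
from dataclasses import dataclass
from typing import Dict, List

# Control numbers follow the nine-item list in this note; labels are paraphrased.
CONTROLS: Dict[int, str] = {
    1: "Disruptive events identified",
    2: "Financial impact of events",
    3: "Prioritized recovery tiers",
    4: "Tier-to-financial-impact relationships",
    5: "Documented recovery strategy",
    6: "Testable recovery plan",
    7: "Test scripts with pass/fail criteria",
    8: "Evidence of exercise times vs. RTO/RPO",
    9: "Testing postmortem process",
}

# Minimum set named in the text (items 6, 7 and 9).
MINIMUM_CONTROLS = {6, 7, 9}


@dataclass
class ControlStatus:
    documented: bool    # a written definition of the control exists
    has_evidence: bool  # evidence of the control actually operating exists


def audit_gaps(status: Dict[int, ControlStatus]) -> Dict[str, List[int]]:
    """Summarize undocumented controls, controls lacking evidence, and unmet minimums."""
    undocumented = sorted(
        n for n in CONTROLS if n not in status or not status[n].documented
    )
    no_evidence = sorted(
        n for n in CONTROLS
        if n in status and status[n].documented and not status[n].has_evidence
    )
    # Assumption: a minimum control is "met" only if documented AND evidenced.
    minimum_unmet = sorted(
        n for n in MINIMUM_CONTROLS if n in undocumented or n in no_evidence
    )
    return {
        "undocumented": undocumented,
        "no_evidence": no_evidence,
        "minimum_unmet": minimum_unmet,
    }


def rto_breaches(measured_hours: Dict[str, float],
                 rto_hours: Dict[str, float]) -> List[str]:
    """Return the recovery tiers whose exercised recovery time exceeded the RTO."""
    return sorted(
        tier for tier, hours in measured_hours.items() if hours > rto_hours[tier]
    )


# Hypothetical example data.
status = {
    6: ControlStatus(documented=True, has_evidence=True),
    7: ControlStatus(documented=True, has_evidence=False),
    9: ControlStatus(documented=False, has_evidence=False),
}
print(audit_gaps(status)["minimum_unmet"])  # prints [7, 9]
print(rto_breaches({"tier1": 3.5, "tier2": 30.0},
                   {"tier1": 4.0, "tier2": 24.0}))  # prints ['tier2']
```

Keeping "documented" and "has evidence" as separate flags mirrors the distinction the note draws between defining a control and demonstrating that it operates.
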
When third parties are included, the burden is on the provider customer to ensure that the upfront due diligence has been performed on the scope and quality of the service provider's operations management controls, especially as they relate to provider operations recoverability and data privacy management. This may require the provider customer to incur additional audit-related costs.

A service provider may claim that the level of customer due diligence recommended in this research is not necessary, because the provider has already achieved SAS 70 Type 2 certification via an external audit that addresses most, if not all, of these requirements. SAS 70 audits (Type 1 and Type 2) are typically based on standard control frameworks, such as COBIT and International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) 27001 and 27002. The audits are general in nature and attest to the overall effectiveness of provider management controls (Type 2 only). This may or may not mean that there is sound alignment between the policies and procedures managed by the provider and your organization's specific privacy, data breach management and operations recoverability requirements.

It is important to note that the management controls that are tested in an SAS 70 Type 2 audit have been defined by the service provider. These controls must be verified by an external, independent auditor using standard best practices. However, because these controls are defined by the provider, they are typically documented to reflect the provider's operation management controls that are in place, and not necessarily the set of management controls needed to support a customer's specific security, compliance and continuity management commitments. The result is that the burden of proof is now left to the customer to determine whether the provider recovery management procedures can effectively manage the customer's specific control requirements.
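
One way to make this comparison concrete is to list the controls your organization requires, grouped by the areas named above, and subtract the controls that the provider's SAS 70 Type 2 report actually tested. The sketch below is a minimal illustration in Python. The control identifiers, the area names used as dictionary keys and the `coverage_gaps` function are hypothetical; SAS 70 reports do not come in a machine-readable format, so the attested set would have to be transcribed by hand from the report.

```python
from typing import Dict, List, Set


def coverage_gaps(required: Dict[str, Set[str]],
                  provider_attested: Set[str]) -> Dict[str, List[str]]:
    """Per requirement area, list required controls the provider's report does not cover.

    Areas with no gaps are omitted from the result.
    """
    gaps: Dict[str, List[str]] = {}
    for area, controls in required.items():
        missing = sorted(controls - provider_attested)
        if missing:
            gaps[area] = missing
    return gaps


# Hypothetical customer requirements, grouped by area.
required = {
    "privacy": {"data-encryption-at-rest"},
    "data breach management": {"breach-notification-procedure"},
    "operations recoverability": {"annual-recovery-exercise", "documented-rto"},
}

# Hypothetical controls transcribed from the provider's SAS 70 Type 2 report.
provider_attested = {"data-encryption-at-rest", "documented-rto"}

for area, missing in coverage_gaps(required, provider_attested).items():
    print(area, "->", missing)
```

Any area that appears in the output marks a requirement the provider's own control definitions do not address, and therefore a candidate for additional testing by your auditor.
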
At this point, you may well need to broaden the audit scope and incur the costs of sending your IT-DRM auditor or audit team to the provider's facilities to ensure that all control bases are effectively covered.

Bottom Line

While an audit of your disaster recovery plan may show that the plan has no gaping holes, you should never assume that passing an audit is a guarantee that the plan will work as intended, or that it is even efficient. Use auditing to demonstrate plan integrity, but ensure that continuous improvement is fundamental to your test management strategy.

Recommended Reading

"Top Issues & Research Agenda, 2010-2011: IT Compliance, Audit & Legal Professionals"
"Critical Recovery Questions to Ask SaaS Providers"
"Gartner for IT Leaders: Overview: IT Compliance, Audit & Legal Professionals"
"IT Disaster Recovery Sourcing Considerations"
"Will Your Data Rain When the Cloud Bursts?"
"SAS 70 Is Not Proof of Security, Continuity or Privacy Compliance"

Strategic Planning Assumption(s)

By 2014, 60% of organizations will be required to undergo formal annual audits of their disaster recovery management readiness.
Through 2013, at least 60% of cloud service providers (including software-as-a-service providers) will fail to have an auditable disaster recovery management program in place.

© 2010 Gartner, Inc. and/or its Affiliates. All Rights Reserved. Reproduction and distribution of this publication in any form without prior written permission is forbidden. The information contained herein has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner's research may discuss legal issues related to the information technology business, Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner shall have no liability for errors, omissions or inadequacies in the information contained herein or for interpretations thereof. The opinions expressed herein are subject to change without notice.