Cissp Week 24
 

    Cissp Week 24 Presentation Transcript

    • Business Continuity and Disaster Recovery Planning Domain 8 CISSP Official CBK 3rd Edition Pages 1092-1155 Tim Jensen Jem Jensen StaridLabs
    • Key Terms ● Business Continuity Planning (BCP) ● Disaster Recovery Planning (DRP)
    • Project Initiation and Management ● First steps to building a BCP – Obtain senior management support to go forward with the project – Define a project scope, the objectives to be achieved, and the planning assumptions – Estimate the project resources needed to be successful, both human resources and financial resources – Define a timeline and major deliverables of the project ● Assign a project manager for the initial creation of a BCP and DRP
    • Senior Leadership Support ● Senior Leadership's major goals: – Execute the mission – Protect the organization ● Risks you can point out to get buy-in: – Financial – Reputational – Regulatory (lawsuits) – Senior Management could be held liable for not using due care to protect the corporation. ● BCP and DRP plans can take a year or more to complete. Management support is critical so the process doesn't get postponed halfway through.
    • Financial Risks ● Can be quantified ● Determines amount to spend on the recovery program ● P*M=C – Probability of harm (P): How likely is a damaging event to occur? – Magnitude of harm (M): What is the financial damage for a single event? – Cost of prevention (C): The cost of putting in place a countermeasure. ● The cost of the countermeasure should not be more than the cost of the event.
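The P*M=C rule on this slide can be sketched in a few lines. This is a minimal illustration, not a formal risk model: the function names and the dollar figures are hypothetical, and it assumes the "cost of the event" is read as the expected loss P*M.

```python
def expected_loss(probability: float, magnitude: float) -> float:
    """Return P * M: probability of harm times the damage of a single event."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * magnitude


def countermeasure_is_justified(probability: float, magnitude: float, cost: float) -> bool:
    """True when the countermeasure costs no more than the expected loss."""
    return cost <= expected_loss(probability, magnitude)


# Hypothetical figures: a 1-in-10-year event causing $500,000 of damage.
spending_ceiling = expected_loss(0.1, 500_000)
cheap_control_ok = countermeasure_is_justified(0.1, 500_000, 40_000)
pricey_control_ok = countermeasure_is_justified(0.1, 500_000, 75_000)
print(spending_ceiling, cheap_control_ok, pricey_control_ok)
```

Under these assumed numbers, the $40,000 control fits under the ceiling and the $75,000 control does not.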
    • Additional Benefits of Planning ● Locating single points of failure (SPOF) ● Process Improvements ● Dealing with technical incidents
    • Project Scope and Plan ● It's very important to gain firm agreement on the scope and goals of the DRP and BCP. – Technology only or include business processes? – Main office only or all offices? – Workforce impairment ● Pandemic, labor strike, transportation issues ● Project manager must agree with leadership on scope, timeline, and deliverables.
    • Legal and Regulatory Requirements ● Many industries have applicable regulations. ● Recent regulations: – Implementing Recommendations of the 9/11 Commission Act of 2007 (Public Law 110-53) ● Recommends that private sector organizations validate their recovery readiness by comparing their programs to an unnamed standard (NFPA 1600 has been proposed) ● US Government endorsed but is voluntary – British Standard BS25999
    • The Ten Professional Practice Areas NFPA 1600 ● Project Initiation and Management ● Risk Evaluation and Control ● Business Impact Analysis ● Developing Business Continuity Strategies ● Emergency Response and Operations ● Developing and Implementing Business Continuity Plans ● Awareness and Training Programs ● Maintaining and Exercising Business Continuity Plans ● Public Relations and Crisis Communications ● Coordination with Public Authorities
    • BS25999 ● Extension of PAS56 ● Intention is to create the ability to demonstrate compliance with the standard ● Stage 1: Audit including a desktop review – Must be completed before Stage 2 ● Stage 2: Conformance and certification audit where the planner must demonstrate implementation ● If implementation fails then corrective action must be agreed upon. ● If both stages complete then the organization can apply for BS25999 certification.
    • US Financial Regulations ● Federal Financial Institutions Examination Council (FFIEC) specifies that BCP is about maintaining, resuming, and recovering the organization, not just the technology. – The planning process must be conducted enterprise-wide. ● BCP and test results should be independently audited and reviewed by the board of directors. ● Company should be aware of the BCP activities of its 3rd party providers, key suppliers, and organization partners. ● If processes are outsourced then the service provider's BCP must be reviewed to ensure critical services can be restored within acceptable timeframes. ● Additional Regulations: – National Association of Insurance Commissioners (NAIC) – National Futures Association Compliance Rule 2-38 – Electronic Funds Transfer Act – Basel Committee
    • Other Regulations and Standards ● Australian Prudential Standard CPS 232 – July 2012 – Requires that an institution's BCM include: ● BCM Policy ● Business Impact Analysis (BIA) including risk assessment ● Recovery objectives and strategies ● Business Continuity Plan (BCP) including crisis management and recovery ● Review and testing of the BCP ● Training and awareness ● Monetary Authority of Singapore – June 2003 ● Standard for Business Continuity/Disaster Recovery Service Providers (SS507) – Singapore ● Sets stringent standards for DR service providers ● HIPAA – Requires data backup plan, disaster recovery plan, and emergency mode operations plan
    • Sarbanes Oxley Section 404 ● Applies to companies required to file an annual report under Section 13(a) or 15(d) of the Securities Exchange Act of 1934 (15 USC 78m or 78o(d)) ● The report must contain: – A statement of management's responsibility for establishing and maintaining an adequate internal control structure and procedures for financial reporting – An assessment, as of the end of the most recent fiscal year of the issuer, of the effectiveness of the internal control structure and procedures of the issuer – Internal Control Evaluation and Reporting ● BCP and contingency planning is not considered in scope
    • Legal Standards ● Blake vs Woodford Bank & Trust Co (1977) – Foreseeable workload: failure to prepare ● Sun Cattle Company, Inc vs Miners Bank (1974) – Computer System Failure: Foreseeable Computer Failure ● US vs Carroll Towing Company (1947) – Defined breach of duty of care where B < PL ● B = (cost) Burden of taking precautions ● P = Probability of Loss ● L = Gravity of Loss ● P * L must be greater than B to create a duty of due care for the defendant
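The B < PL test from US vs Carroll Towing is a single comparison, shown here as a small sketch. The function name and the sample figures are hypothetical and are only meant to show how the three quantities relate. This is an illustration of the arithmetic, not legal guidance.

```python
def duty_of_care_breached(burden: float, probability: float, loss: float) -> bool:
    """Apply the B < P * L comparison.

    burden      -- B, the cost of taking precautions
    probability -- P, the probability of loss (0 to 1)
    loss        -- L, the gravity (size) of the loss
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return burden < probability * loss


# Hypothetical figures: a $10,000 precaution against a 5% chance of a
# $1,000,000 loss. P * L is about 50,000, which exceeds B.
skipped_cheap_precaution = duty_of_care_breached(10_000, 0.05, 1_000_000)

# The same loss with an $80,000 precaution: B is larger than P * L.
skipped_costly_precaution = duty_of_care_breached(80_000, 0.05, 1_000_000)
print(skipped_cheap_precaution, skipped_costly_precaution)
```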
    • Legal Standard Continued ● Negligent Standard to Plan or Prepare (pandemic) 2003 – Canadian nurses filed suit saying the federal government was negligent in not preparing for the second wave after the disease was first identified.
    • Resource Requirements ● Require plan for both staff and finances ● Staff resources – Need staff from business operations and technology groups (IT). – Identify recovery priority – Identify required timeframes ● Once timeframes are identified, plan staffing to meet timeframes (if 24 hour recovery will be required, etc.) – The staff planning recovery must be the same team who executes the recovery in the event of an incident.
    • Financial Resources ● Finances may be required to: – Hire outside contractors/consultants – Travel to offsite locations – Purchase hardware, software, etc.
    • Emergency Notification Lists ● The BCP/DRP planner should build a contact list of critical staff and leadership. ● The list should include at a minimum: – Title, name, home phone, work phone, mobile phone – Tim recommends also home address ● Tim also recommends: Distribute the list and make sure everyone on the list has a physical copy offsite. Storing the list in a computer system housed onsite with no offline copies is stupid.
    • Vital Records ● All vital records needed to rebuild the organization must be stored offsite in a secure location that can be accessed following a disaster. ● This includes electronic data backups as well as paper record backups.
    • Common Vital Records ● Anything with a signature ● Customer Correspondence ● Customer Conversations ● Accounting Records ● Justification Proposals/Documents ● Transcripts/minutes of meetings with legal significance ● Paper with Value (stock certificates, bonds, commercial paper) ● Legal Documents (letters of incorporation, deeds, etc.)
    • Common Vital Records ● Databases and contact lists for employees, customers, vendors, partners, etc ● Business unit contingency plans ● Procedure/application manuals ● Backup files from production servers/applications ● Reference documents used regularly ● Calendar files or printouts ● Source Code
    • Risk and Business Analysis ● The planning team will make recommendations about which risks the organization should mitigate and which systems and processes the plan will recover and when.
    • Strategy Development ● The planner will review different strategies for business recovery based on required SLA for critical systems. ● Cost/Benefit analysis will be done to identify strategy viability.
    • Alternate Site Selection and Implementation ● The planner selects and builds out alternate sites used to recover the organization/technology. ● Shouldn't be susceptible to the same threats as the primary site. – Example: If Fargo is the main datacenter location, the backup site shouldn't be in Grand Forks. If one floods the other is likely to flood at the same time. ● Good resources: – www.prep4agthreats.org – www.switchlv.com/wpcontent/uploads/disaster_avoidance_2013/disastermap.html
    • Video Segue
    • Documenting the Plan ● All of the information is compiled into a plan document. ● Procedures are designed for each site and for each technology and/or application to be recovered.
    • Testing, Maintenance, and Updating ● The plan must be validated by testing recovery. ● A maintenance schedule must be established so the plan doesn't become obsolete.
    • Business Impact Analysis ● The purpose of a BIA is to decide what needs to be recovered and how quickly. ● Priority: – Critical – Essential – Supporting – Non-Essential ● Must determine maximum tolerable downtime (MTD). Also known as Recovery Time Objective (RTO)
    • Risk Assessments ● Three elements of risk: – Threats – Assets – Mitigating Factors ● Threats are measured as a probability. (May happen 1 in 10 years) ● Most common threat is power availability. ● Second most common is a water event. – Flooding, plumbing leak, broken pipe, leaky roof, water main break ● Other Common Threats: – Severe weather, cable cuts, fires, labor disputes, transportation mishaps, hardware failures.
    • Internal Threats ● Equipment fails prematurely: – Improper installation – Improper environment ● Equipment fails due to wear and tear: – Most equipment has a “mean time between failures” (MTBF) rating. – Running equipment beyond MTBF is risking failure.
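To put a rough number on the MTBF point, here is a small sketch. It assumes a constant failure rate (the exponential reliability model), which is a common simplification and not something the slides specify. The function name and the 50,000-hour rating are hypothetical.

```python
import math


def failure_probability(hours_in_service: float, mtbf_hours: float) -> float:
    """Chance that a unit has failed by `hours_in_service`.

    Assumes a constant failure rate of 1 / MTBF (exponential model).
    Real equipment with wear-out behaviour will deviate from this.
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if hours_in_service < 0:
        raise ValueError("hours_in_service must not be negative")
    return 1.0 - math.exp(-hours_in_service / mtbf_hours)


# Hypothetical drive rated at 50,000 hours MTBF.
at_half_rating = failure_probability(25_000, 50_000)
at_full_rating = failure_probability(50_000, 50_000)
print(round(at_half_rating, 3), round(at_full_rating, 3))
```

Under this assumed model, a unit that has run for its full MTBF rating has already failed with probability 1 - 1/e, a little under two in three.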
    • Assets ● If the organization doesn't own anything then it won't be concerned about risks because it has little or nothing to lose. (Gotta love IT Security consulting!!!) ● Assets include: – Information – Financial – Physical – Human
    • Mitigating Factors ● Controls or safeguards that will be put in place to reduce the impact of a threat. ● Example: UPS devices can save production systems from hard crashes which could lead to data loss and long recovery times. ● When a risk is identified the planner must accept the risk, transfer the risk, avoid the risk, or mitigate the risk.
    • Mitigation Strategies ● Accept – The risk is so unlikely to occur or the impact is so small, it'd cost more to mitigate. ● Transfer – Insurance ● Avoidance – Have compensating controls so risk is completely removed. Example is having 2 call centers in very different climates. In the event of inclement weather in one, the other is still operational. ● Mitigation – Controls implemented to avoid the risk or to lessen the impact.
    • Define Recovery Objectives ● Identify all the resources necessary to perform each recovery function
    • Recovery Strategy ● Strategies are driven by the recovery timeframes
    • Surviving Site Strategy ● A surviving site strategy is implemented so that while service levels may drop, a function never ceases to be performed because it operates in at least two geographically dispersed buildings that are fully equipped and staffed.
    • Self Service Strategy ● An organization can transfer work to another of its own locations, which has available facilities and/or staff to manage the time sensitive workload until the interruption is over.
    • Internal Arrangement Strategy ● Training rooms, cafeterias, and conference rooms may be equipped to support organizational functions while staff from the impacted site travel to another site and resume operations.
    • Mutual Aid Agreement Strategies ● Other similar organizations may be able to accommodate those affected.
    • Dedicated Alternate Site Strategy ● Built by the company to accommodate organization function or technology recovery.
    • Work from Home Strategy ● Workers can remote in
    • External Supplier Strategy ● Pay an external company for disaster recovery. ● These companies provide data centers, alternate site spaces, mobile units, and temporary staff.
    • Backup Storage Strategy ● Data should be backed up one or more times a day and a copy sent offsite. ● The offsite storage should be far enough away from your primary site to be safe and close enough to your recovery site to allow timely recovery operations to start. ● Systems should be prioritized to make sure resources are available for the most critical systems and data. ● A full backup is normally taken and then incremental backups occur every few hours or every day.
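One consequence of the full-plus-incremental scheme is that a restore needs the most recent full backup and every incremental taken after it. The sketch below picks that chain from a backup catalog. The `Backup` record, the `restore_chain` function, and the sample timestamps are all hypothetical, not part of any backup product.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    taken_at: datetime
    kind: str  # "full" or "incremental"


def restore_chain(backups, target):
    """Return the backups needed to restore to `target`, oldest first.

    That is the latest full backup at or before `target`, followed by
    every incremental taken after it and at or before `target`.
    """
    eligible = sorted(
        (b for b in backups if b.taken_at <= target),
        key=lambda b: b.taken_at,
    )
    full_positions = [i for i, b in enumerate(eligible) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup exists at or before the target time")
    return eligible[full_positions[-1]:]


# Hypothetical catalog: nightly fulls with incrementals every six hours.
catalog = [
    Backup(datetime(2024, 1, 1, 0), "full"),
    Backup(datetime(2024, 1, 1, 6), "incremental"),
    Backup(datetime(2024, 1, 2, 0), "full"),
    Backup(datetime(2024, 1, 2, 6), "incremental"),
    Backup(datetime(2024, 1, 2, 12), "incremental"),
    Backup(datetime(2024, 1, 2, 18), "incremental"),
]
chain = restore_chain(catalog, datetime(2024, 1, 2, 13))
print([f"{b.kind} @ {b.taken_at:%Y-%m-%d %H:%M}" for b in chain])
```

If any backup in the chain is missing from offsite storage, the restore point falls back to the last intact link, which is one reason to verify offsite copies regularly.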
    • Recovery Site Strategies
    • Dual Data Center ● Applications are load balanced or hot swapped between two data centers so downtime is minimized. ● Each data center should be able to operate at full load.
    • Internal Hot Site ● Site is standby ready with all technology and equipment necessary already in place. ● Often used as dev/test until recovery is needed, at which time dev/test is removed and production is implemented. ● Should be exactly the same hardware, software, etc.
    • External Hot Site ● Equipment is installed and waiting, but the environment must be rebuilt for recovery. ● Often contracted through a recovery service provider. ● Equipment and software should be kept as close to identical as possible to speed recovery.
    • Warm Site ● A leased or rented facility which is partially configured with some equipment, but not the actual computers. ● Generally has cooling, cabling, and networking in place. ● Servers are delivered to the site at the time of the disaster.
    • Cold Site ● Empty data center space with no technology. ● All technology must be acquired at the time of a disaster.
    • Mobile Sites ● Mobile house or sea cargo trailer with a data center in it which can be dropped, hooked up, and is ready to go.
    • Processing Agreements ● Organizations can contract with other organizations for data processing.
    • Reciprocal Agreements ● Similar organizations can share the risk of an outage by hosting the data and processing of the other organization in the event of a disaster. ● Has a lot of contractual, legal, and compliance issues depending on what data you process.
    • Outsourcing ● Business processes can be outsourced entirely.
    • Multiple Processing Sites ● Multiple sites inside the organization can be used for processing. ● Useful if the company is spread throughout the country or world. ● Runs into bandwidth and latency issues.
    • Disaster Recovery Process ● When things are going bad, people get stressed and make bad decisions ● Document the plan! – Clear instructions on who will do what and when – Consistent regardless of the event – Define communication strategy – Distribute to everyone who has a role in recovery – Test/verify the plan
    • Disaster Recovery Process ● Response – Assessment team: evaluates the event and escalates to the appropriate people if needed – Escalation team: contacted by assessment team ● Consists of event owner, responders, stakeholders ● Emergency notification lists – Response teams must be reachable 24x7 – Must be reachable by everyone in the organization – Should be used for every event, from plumbing leaks to Godzilla attacks
    • Disaster Recovery Process ● Emergency Management Team – Provide management (short-term tactical command) – Assess damage, keep executives in the loop, initiate and organize response ● Executive Team – Senior executives – Respond to issues that need direction – Handle PR – Provide leadership (long-term strategic direction)
    • Disaster Recovery Process ● Emergency Response Team – Execute the recovery process – Retrieve recovery info (potentially offsite) – Communicate with command center – Work with alternate site personnel – Identify/Install replacement equipment or software ● Command Center – Should have copy of the plan so they can ensure it is being followed correctly – Should keep track of what's being done and costs
    • Disaster Recovery Process
    • Communications ● Important to keep everyone informed – Emergency notification list ● Team members/managers who disseminate notifications – Contingency line (ex: phone number printed on badges) ● Single number to call to get the latest info – PR ● Important that everyone tells the same story ● Keep things short and honest – Multiple communication channels ● Could actually reduce confusion (techs on their own conference bridge since some jargon sounds scary)
    • Assessment ● Process to rate severity of events ● Tiered categories like: – Non-incident: limited or no disruption – Incident: causes downtime for a facility or service ● Trigger disaster recovery plan, report to senior mgmt – Severe incident: significant destruction or disruption ● Trigger DR, contact senior management and crisis mgmt
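The tiers on this slide map naturally onto a lookup from severity to required actions. The sketch below is one possible encoding. The two yes/no inputs to `classify` are a hypothetical simplification, since a real assessment team would apply judgment and more detailed criteria.

```python
from enum import Enum


class Severity(Enum):
    NON_INCIDENT = "non-incident"
    INCIDENT = "incident"
    SEVERE_INCIDENT = "severe incident"


# Actions taken from the slide; the wording of each entry is illustrative.
ACTIONS = {
    Severity.NON_INCIDENT: [],
    Severity.INCIDENT: [
        "trigger disaster recovery plan",
        "report to senior management",
    ],
    Severity.SEVERE_INCIDENT: [
        "trigger disaster recovery plan",
        "contact senior management",
        "contact crisis management",
    ],
}


def classify(causes_downtime: bool, significant_destruction: bool) -> Severity:
    """Rate an event using two hypothetical yes/no questions."""
    if significant_destruction:
        return Severity.SEVERE_INCIDENT
    if causes_downtime:
        return Severity.INCIDENT
    return Severity.NON_INCIDENT


# Example: a plumbing leak that takes one service offline.
tier = classify(causes_downtime=True, significant_destruction=False)
print(tier.value, ACTIONS[tier])
```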
    • Restoration ● Planned event after recovery – Interim plans (example) ● Part of DR plan was to set up alternative site ● Work from alt site until original site is restored ● Slowly transition back to original site ● After everything is back at the original site, dismantle the alternate site
    • Training ● Awareness program – Make sure everyone knows the plan before they need to use it – Train all employees on how to raise issues to the evaluation team – Train stakeholders on their role in case of an event ● Conduct exercises to practice – Reassure customers that a plan is in place so the organization will always be there – New hire training
    • Exercises ● “Exercise” instead of “test” – Test makes people think it's pass/fail ● Call exercise: activate the call tree – Verify numbers are correct – What percentage were unavailable ● Walkthrough exercise – Talk through a scenario with everyone – Make sure everyone has actually read the plan – Find weaknesses
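The "what percentage were unavailable" figure from a call exercise is simple to compute once the results are recorded. A minimal sketch follows, using a hypothetical results dictionary keyed by role, where True means the person was reached.

```python
def percent_unavailable(results: dict) -> float:
    """Return the percentage of call-tree contacts who could not be reached.

    `results` maps a contact label to True (reached) or False (not reached).
    """
    if not results:
        raise ValueError("the call exercise produced no results")
    unreachable = sum(1 for reached in results.values() if not reached)
    return 100.0 * unreachable / len(results)


# Hypothetical results from one run of the call tree.
call_results = {
    "event owner": True,
    "network lead": True,
    "facilities manager": False,
    "PR contact": True,
}
missed = [name for name, reached in call_results.items() if not reached]
print(percent_unavailable(call_results), missed)
```

Tracking the list of missed contacts alongside the percentage shows which numbers need correcting before the next exercise.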
    • Exercises ● Simulation – Never create a disaster by testing for one – Validate alternative site readiness – Considered successful if everything worked out to get the resources needed to recover – Also successful if it didn't since you learn what to fix ● Compact exercise – Start with call exercise and run right into a simulation – Fake injuries, pretend reporters, fire drills
    • Maintaining the Plan ● Should be reviewed regularly and updated – Review every 3 months – Formal audit yearly ● Version control – Ensures everyone is using the latest version – Keeps a history of what changed and why ● Store the latest plan offsite so it's available in a real disaster
    • Disaster Recovery Program ● Probably will start as a project – Projects have an end; DR must be ongoing ● Transition into an ongoing process – Repeat the steps regularly – Use the program to spin off smaller projects like yearly audits and quarterly reviews ● Emergency Management Organization (EMO) – Department or group responsible ● Emergency Operations Center (EOC) – Provides a location and resources for recovery
    • Other Risk Areas ● Business continuity is closely related to other areas of risk – A good DR plan doesn't matter if records management policy is so poor that offsite backups don't exist or aren't maintained – Good firewall policy doesn't help if alternate site has so little physical security that people could enter it and access the data directly ● Need to address all risk areas for complete coverage