Designing a Modern Disaster Recovery Environment


This presentation covers EAGLE's ideas on designing a modern disaster recovery environment. Key concepts include balancing cost, risk and complexity in DR strategies. Most notably, we'll cover recovery objectives, common DR technologies (that allow you to back up and pre-position data), and the importance of viewing DR as an insurance policy.

Published in: Technology, Business
  • Common DR Technologies that allow you to pre-place data.
  • There are many reasons for this spike in disaster declarations; one is undoubtedly our growing sensitivity to the impact of environmental factors on national infrastructure. Computer technology has become critical to the way businesses manage their supply chains, market and sell to customers, and communicate.
  • Don’t turn a deaf ear… do you know if your organization is actually protected?
    - Took over DC -> no VM backups… SQL jobs were inconsistent… Primary SAN as target… tape library next to it

  • Laggard: a person who makes slow progress and falls behind others.

    The cost of downtime is greatly increasing! There has been a 38% increase in the cost per hour of downtime. Interestingly enough, 42% of “best-in-class” reported no outages in the last year. Not all companies can afford, or need, the infrastructure that is required for “best-in-class” performance, but even moving to the middle of the pack can save your business a lot of cash.
  • DR is a focus, but the fiscal backing is not there to do much about it


    Research shows that improving disaster recovery is a top focus for most companies.
  • Business continuity is the activity performed by an organization to ensure that critical business functions will be available to customers, suppliers, regulators, and other entities that must have access to those functions. These activities include many daily chores such as project management, system backups, change control, and help desk. Business continuity is not something implemented at the time of a disaster; Business Continuity refers to those activities performed daily to maintain service, consistency, and recoverability.

    The foundation of business continuity is the set of standards, program development, and supporting policies, guidelines, and procedures needed to ensure that a firm can continue without stoppage, irrespective of adverse circumstances or events. All system design, implementation, support, and maintenance must be based on this foundation in order to have any hope of achieving business continuity, disaster recovery, or in some cases, system support. Business continuity is sometimes confused with disaster recovery, but they are separate entities: disaster recovery is a small subset of business continuity. It is also sometimes confused with Work Area Recovery (needed when the physical building in which the business is conducted is lost), which is but one part of business continuity.

    The term Business Continuity describes a mentality or methodology of conducting day-to-day business, whereas business continuity planning is an activity of determining what that methodology should be. The business continuity plan may be thought of as the incarnation of a methodology that is followed by everyone in an organization on a daily basis to ensure normal operations.
  • We’ll be discussing backup and disaster recovery specifically today. BC planning is very complicated, and has as much to do with people as it does with technology.


  • RPO = Snapshots
    RTO = Replication

    RPO/RTO are separate…
    - each application can have different RPOs & RTOs

    Webfarm = high RPO / low RTO (must be online 24/7)
    PO App = low RPO (cannot lose data) / higher RTO (paper form until servers are up)
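
As a rough illustration of the point that each application carries its own RPO and RTO, here is a minimal Python sketch. The class name, the function name, and every target value are assumptions chosen for illustration; they are not figures from the presentation.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjective:
    """Per-application targets; the names and values used below are illustrative."""
    app: str
    rpo: timedelta  # maximum tolerable data loss
    rto: timedelta  # maximum tolerable time to come back online


def meets_objective(obj, data_loss, downtime):
    """True when a tested recovery stays within both the RPO and the RTO."""
    return data_loss <= obj.rpo and downtime <= obj.rto


# Hypothetical targets echoing the two examples in the notes above.
web_farm = RecoveryObjective("web farm", rpo=timedelta(hours=24), rto=timedelta(minutes=15))
po_app = RecoveryObjective("PO app", rpo=timedelta(minutes=5), rto=timedelta(hours=8))

# A DR test that lost 1 hour of data and took 2 hours to bring services back:
test_loss, test_downtime = timedelta(hours=1), timedelta(hours=2)
print(meets_objective(web_farm, test_loss, test_downtime))  # False: 2 h exceeds the 15 min RTO
print(meets_objective(po_app, test_loss, test_downtime))    # False: 1 h exceeds the 5 min RPO
```

The same test result fails each application for a different reason, which is why the two objectives are best tracked separately per application.
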
  • (Joking) At Eagle, we love it when you don’t do your homework & have a gut reaction to wanting a 0/0 b/c it costs a lot of $$... Only joking; we would actually take the time to help educate you to make the best decision for your environment.
  • **add defined lines between technology solutions, add numbers to them, describe what you are about to do

    Disaster Recovery is not just for storage; you typically need to factor in servers, networks, workstations, and mobile devices. For this presentation, however, we are going to focus on storage.
  • Who is doing this?

  • Who’s doing this?

    Higher level of intelligence about your environment --> individual VM restores / Workflow automation

    More layers to the onion to troubleshoot
  • Who’s doing this?

  • Who’s doing this?

    Not really adequate for modern growing datasets
  • Who’s doing this?

    Transitioned from Backup to Archive
    - cost effective
    - easiest to get offsite (trunk of car – “offsite” is actually “onsite” from 8-5… Iron mtn)

    RTO is killed (restore times are infeasible)
  • Who’s doing this?

    No different than tape (for restoring purposes)
    - do the math or put a TB in the cloud & try to recover it
    - Dataset / internet pipe = recovery time
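
The "dataset / internet pipe = recovery time" math above can be sketched in a few lines of Python. The function name and the 80% link-efficiency default are assumptions for illustration; real restores are also limited by provider throttling, disk speed, and protocol overhead.

```python
def restore_hours(dataset_tb, link_mbps, efficiency=0.8):
    """Estimate the hours needed to pull a dataset back over a network link.

    dataset_tb: dataset size in decimal terabytes (1 TB = 10**12 bytes)
    link_mbps:  nominal link speed in megabits per second
    efficiency: assumed fraction of the link usable in practice (illustrative)
    """
    bits = dataset_tb * 10**12 * 8
    usable_bits_per_second = link_mbps * 10**6 * efficiency
    return bits / usable_bits_per_second / 3600


print(round(restore_hours(1, 100), 1))    # 27.8 hours for 1 TB over 100 Mbps
print(round(restore_hours(10, 1000), 1))  # 27.8 hours for 10 TB over 1 Gbps
```

Under these assumptions, even a single terabyte takes more than a day to recover over a 100 Mbps connection.
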
  • So many options! What is right for you depends heavily on your budget, your data, and your service-level agreements.
  • Stick on this a little bit longer so folks can consume the content.

    Ask for questions here.
  • Your site may dictate which DR technologies you can leverage.
  • Mention the 3-2-1
  • Mention the 3-2-1
  • Until your organization understands the size and scope of any potential impact, it cannot define a DR plan that makes good business sense. Investing a hundred thousand dollars into your DR plan only makes sense if the cost of a business interruption is measured in hundreds of thousands of dollars as well.

    What is the cost of being out of business for 4 hours? Your DR solution needs to make business sense, otherwise you are just spending money and playing with technology.

    Don’t fall into the trap of disconnecting DR solutions from what makes good business sense. “Try not to get enamored with tech and a 35 second RTO”.
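
One way to make the "does it make business sense?" question concrete is to compare the DR investment with the loss it would avoid in a single incident. The sketch below is a simplification under stated assumptions: the function names are hypothetical, every dollar figure is invented, and it ignores incident probability and hard-to-price effects such as reputation.

```python
def downtime_cost(cost_per_hour, outage_hours):
    """Direct cost of one outage at a flat hourly rate."""
    return cost_per_hour * outage_hours


def avoided_loss(cost_per_hour, hours_without_dr, hours_with_dr):
    """Loss avoided per incident when DR shortens the outage."""
    return (downtime_cost(cost_per_hour, hours_without_dr)
            - downtime_cost(cost_per_hour, hours_with_dr))


def dr_spend_is_justified(dr_cost, cost_per_hour, hours_without_dr, hours_with_dr):
    """True when the loss avoided in a single incident exceeds the DR investment."""
    return avoided_loss(cost_per_hour, hours_without_dr, hours_with_dr) > dr_cost


# Hypothetical: a $100,000 DR investment that cuts a 72-hour outage to 4 hours.
print(avoided_loss(50_000, 72, 4))                    # 3400000
print(dr_spend_is_justified(100_000, 50_000, 72, 4))  # True
print(dr_spend_is_justified(100_000, 1_000, 72, 4))   # False: only $68,000 avoided
```

The same investment is easy to justify at $50,000 per hour of downtime and hard to justify at $1,000 per hour.
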
  • Pick the right technology! A spare-no-expense approach does not make good business sense. Don’t fall victim to becoming enamored with technology and near-zero recovery objectives.

  • Balancing cost against the time to recover (RTO) requires an in-depth understanding of your application’s RTO/RPO requirements and the technologies needed to meet them.

    My point is not to encourage folks to cheap out on DR; there are justifiable use cases for architecting around zero-time recovery objectives. What we do at EAGLE is attempt to understand your environment so we can architect something that makes good business sense.

    Mention my async blunder; ask if anyone else has screwed-up like that.
  • Ask for questions here.
  • Operational efficiency is typically where large cost savings are realized.
  • Successful companies embrace a multi-tiered catalog of recovery technologies connected by a unified management platform. This approach enables IT departments to continuously balance cost vs. risk and protect data accordingly.
  • Let me know if you want a copy of the presentation!
  • Reiterate: The way we see it, our job is to take the complex and make it simple… that means adding real value by leveraging our expertise and integration capabilities to help you find a solution that makes sense for your business. And now, with our new solutions, if that means customer-managed \ on-prem, Eagle-managed \ Eagle-prem, or somewhere between, we have you covered.
  • Designing a Modern Disaster Recovery Environment

    1. Designing a Modern Disaster Recovery Environment: Balancing cost, risk and complexity in your DR strategy. Brian Anderson, Senior Systems Engineer. Eagle Technologies, Inc. © Copyright 2015
    2. Overview
       • The importance of DR
       • Business Continuity ≠ Disaster Recovery
       • Recovery Objectives
       • Common DR Technologies
       • DR Strategies & the 3-2-1
       • DR as an Insurance Policy
       • Recovery Service Catalog
       • DR Design Tips & Takeaways
    3. Disasters are on the rise. Is your data center protected?
       • Natural disasters cause the most catastrophic data losses:
         – Hurricane Katrina and Sandy
         – Tornado Alley, Moore OK
         – 1,274 in US each year
         – Fires, floods, earthquakes
         – Only way to recover is to store a copy of data in a physically separate location
       • Hardware failure, human error and software corruption still make up 82% of data losses
    4. Recovery Site Strategies
    5. Why DR?
       • The nature of disasters is that they are usually unexpected.
       • An organization which fails to provide a minimum level of service to customers following a disaster may not have a business to recover.
         – Lost revenue
         – Damage to brand/reputation
         – Reduced customer satisfaction
         – Lost market presence
         – Walking papers
       • SANsplosion 2012!
    6. The Cost of Downtime
       Source: Datacenter Downtime: How Much Does It Really Cost?, Aberdeen Group, March 2012
    7. Data Protection Priorities & Spending
       Source: CIO Magazine

       Top 10 IT Priorities for 2014
       Priority                             | Percent
       Server Virtualization                | 54%
       DR / Business Continuity             | 46%
       Smartphones                          | 41%
       Tablets / PCs                        | 41%
       Business Intelligence / Analytics    | 39%
       Mobility                             | 39%
       Network management / monitoring      | 38%
       Network-based Security               | 37%
       Virtual Server Backup                | 37%
       Encryption                           | 36%

       IT Budget Growth for 2014
       Budget change                        | Percent
       Increase by more than 10%            | 22%
       Increase by 5% to 10%                | 28%
       Increase by less than 5%             | 10%
       Same                                 | 28%
       Decrease by less than 5%             | 3%
       Decrease by 5% to 10%                | 5%
       Decrease by more than 10%            | 4%
    8. Business Continuity ≠ Disaster Recovery
       • Business Continuity Planning - A business continuity plan specifies how a company plans to restore core business operations when disasters occur.
       • Disaster Recovery - Disaster recovery looks specifically at the technical aspects of how a company can get back into operation using backup facilities. The IT department, with its disaster recovery plan, is one element of a larger business continuity scenario.
    9. 9. Eagle Technologies, Inc. © Copyright 2015
    10. RPO vs. RTO
        • RECOVERY POINT OBJECTIVE
          – Amount of data loss acceptable
          – The point to which data must be restored
        • RECOVERY TIME OBJECTIVE
          – Amount of time it takes to come back online
          – The time by which data must be restored
        [Figure: timeline centered on the Interruption, with a Recovery Point scale and a Recovery Time scale, each marked Wks, Days, Hrs, Mins, Secs]
    11. RPO vs. RTO
        [Figure: the Recovery Point and Recovery Time scales (Wks, Days, Hrs, Mins, Secs), annotated with Costs]
    12. Common DR Technologies
        Technology                          | Application Business Criticality | Disaster RPO                        | Disaster RTO      | Relative Cost
        Synchronous/Semi-Sync Replication   | Business Critical Applications   | 0-15 minutes                        | Less than 4 hours | $$$$
        Software-defined                    | Business Sensitive Applications  | 0-60 minutes                        | Less than 4 hours | $$$
        Traditional Snapshot Replication    | Business Sensitive Applications  | Snap frequency based, 4-24 hrs      | 4-24 hours        | $$$
        Traditional Disk-based Backups      | Business Sensitive Applications  | Backup frequency based, 6-24 hours  | 12-72 hours       | $$
        Restores from Tape or Public Cloud  | Non-Critical Applications        | 36 hours                            | Best Effort       | $
    13. Synchronous/Semi-Sync Replication
        • Typically array-based
        • The most aggressive RPO capable solutions
        • Highest cost and bandwidth requirements
        • Still requires traditional backup for day-to-day recovery
        Technology: Array-Based Replication
        Application Business Criticality: Business Critical Applications
        Disaster RPO: 0-15 minutes
        Disaster RTO: Less than 4 hours
        Relative Cost: $$$$
    14. Software-defined Replication
        • Allows you to decouple replication process from hardware.
        • Solution may include DR workflow automation
        • May add additional costs and complexity.
        • Does not provide true technology gap.
        Technology: vSphere or Software-based Replication
        Application Business Criticality: Business Sensitive Applications
        Disaster RPO: 0-60 minutes
        Disaster RTO: Less than 4 hours
        Relative Cost: $$$
    15. Traditional Snapshot-based Replication
        • Cost effective alternative to sync/async array-based replication.
        • Low recovery time and minimal data loss.
        • Snapshots can also provide operational recovery capabilities that can integrate with backup processes.
        • Typically WAN efficient, this type of replication may include dedupe capabilities
        Technology: Replicated Snapshots
        Application Business Criticality: Business Sensitive Applications
        Disaster RPO: Snap frequency based, 4-24 hrs
        Disaster RTO: 4-24 hours
        Relative Cost: $$$
    16. Traditional Backups (Disk)
        • Provides the most flexibility and value
        • Fulfill the “technology gap” requirement
        • A cost effective method for maintaining a large number of recovery points and high retention
        Technology: Disk-based Backups
        Application Business Criticality: Business Sensitive Applications
        Disaster RPO: Backup frequency based, 6-24 hours
        Disaster RTO: 12-72 hours
        Relative Cost: $$
    17. Traditional Backups (Tape)
        • Tape is not dead!
        • May still have legitimate place in your data protection scheme
        • Excellent way to get data offsite
        • Certain situations may dictate tape
          – Poor WAN connectivity/bandwidth
          – Lack of a warm/hot site
        • Keep a copy onsite, take another copy offsite
        • Recovering large datasets may take a significant amount of time.
        Technology: Restores from Tape
        Application Business Criticality: Non-Critical Applications
        Disaster RPO: 36 hours
        Disaster RTO: Best Effort
        Relative Cost: $
    18. Public Cloud Backup
        • Bandwidth is a huge factor in the success of cloud backup & restore.
          – Recovering large datasets may take a significant amount of time.
        • Performance & privacy issues remain.
        • Potential hidden costs compared to alternatives (not necessarily the lowest cost)
        Technology: Restores from Cloud
        Application Business Criticality: Non-Critical Applications
        Disaster RPO: 36 hours
        Disaster RTO: Best Effort
        Relative Cost: $
    19. Disaster Recovery Strategies
        Lots of options!
        • Tapes onsite
        • Tapes in the boss's garage (or Iron Mountain)
        • Replicate data to your DR site
        • Replicate data to a third-party data center (CoLo, IaaS)
        • Back up data to the cloud
        • Software replication between dissimilar SAN hardware
    20. Determining Your Recovery Needs
        [Figure: technologies placed along the Recovery Point and Recovery Time scales (Wks, Days, Hrs, Mins, Secs). Labels: Tape/Cloud Backup, Disk-based Backups, Periodic Snapshots & Replication, Sync/Semi-Sync Replication, Snapshots, Clustering, Disk-based Backups, Tape/Cloud Restore]
    21. The “3–2–1” Strategy
        • Keep your data in 3 places
        • Keep your data on 2 disparate types of hardware/software/firmware
          – “Technology Gap”
        • Keep 1 copy of your data offsite
    22. The “3–2–1” Strategy
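
The 3-2-1 rule is simple enough to check mechanically. The following Python sketch assumes one particular reading of the rule (three distinct places, two distinct platforms, at least one offsite copy); the `Copy` class, its field names, and the sample inventory are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a dataset; all field values below are illustrative."""
    place: str     # where the copy lives (device or site)
    platform: str  # hardware/software/firmware family holding it
    offsite: bool  # True when stored away from the primary site


def satisfies_3_2_1(copies):
    """3 places, 2 disparate platforms (the "technology gap"), 1 copy offsite."""
    places = {c.place for c in copies}
    platforms = {c.platform for c in copies}
    return len(places) >= 3 and len(platforms) >= 2 and any(c.offsite for c in copies)


protected = [
    Copy("production SAN", "block array", offsite=False),
    Copy("backup appliance", "disk backup software", offsite=False),
    Copy("DR site", "disk backup software", offsite=True),
]
# Tape library sitting next to the primary SAN: two places, nothing offsite.
exposed = [
    Copy("production SAN", "block array", offsite=False),
    Copy("tape library", "tape", offsite=False),
]

print(satisfies_3_2_1(protected))  # True
print(satisfies_3_2_1(exposed))    # False
```
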
    23. 23. Eagle Technologies, Inc. © Copyright 2015
    24. DR is an Insurance Policy
        • Risk Management
          – Avoid
          – Mitigate
          – Transfer
          – Accept
        • What is the cost of being out of business (30 minutes, 4 hours, 72 hours)?
          – Calculate the cost of downtime in your organization.
        • DR solutions need to make business sense.
          – Would you spend 1 million dollars on an insurance policy that pays 1 million dollars?
        • Don’t flush your IT budget down the toilet!
    25. Cost of DR
        [Figure: cost curve with one axis running from Low Downtime to No Recovery and the other from Low Cost to High Cost. Callout: Minimize Cost!]
    26. Defining Recovery Objectives
        • 80% of surveyed environments indicated that data needs to be restored within 72 hours
        • 34% of surveyed environments indicated that data needs to be restored within 4 hours
        • Balancing cost against RTO requires understanding of your application’s RTO/RPO requirements
        2012 Gartner, Survey Analysis: IT Disaster Recovery Management Spending and Testing Activities Expand in 2012
    27. Recovery Service Catalog
        Align your business needs with your backup and DR solutions.
        • Define your business critical applications & services.
          – What services are core to your business (customer facing, revenue generating)?
          – Lost revenue = critical systems
        • Work with stakeholders to define:
          – How long of a service disruption can you suffer through?
          – How much data could you afford to lose?
        • Define tiers for your services (RTO & RPO)
          – What effect would an outage have on your business (1 hour, 4 hours, 72 hours)?
    28. Recovery Service Catalog (cont.)
        – Strive to have as few tiers as possible
          • Time and money always accompany complexity
        – Document your risks and plan.
          • Get approval from up the chain.
        – Maintain your plan
          • Assign a senior-level position
          • Regular tests and reviews are important
        – A solid recovery service catalog enables other components of DR planning
          • Forces discussions about Business Continuity
          • Recovery Priorities
          • Run Book
    29. Recovery Service Catalog (cont.)
        1. Identify Services
        2. Analyze Business Impact
        3. Evaluate & Choose Recovery Objectives
        4. Document
        5. Get Approval
        6. Implement
        7. Maintain
        Test, Test, Test!
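
A recovery service catalog can be prototyped as a small lookup: given a service's required RPO and RTO, pick the lowest-cost tier that meets both. The Python sketch below is only an illustration. It assumes three tiers whose limits are taken from the upper bounds in the Common DR Technologies table (slide 12), it leaves out tape and cloud restores because their RTO is "Best Effort", and the `Tier` class and `cheapest_tier` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One catalog tier; limits are worst-case values in hours."""
    name: str
    rpo_hours: float
    rto_hours: float
    relative_cost: int  # number of "$" signs in the slide-12 table


# Assumed tiers, using the upper bounds listed on slide 12.
TIERS = [
    Tier("Sync/semi-sync replication", rpo_hours=0.25, rto_hours=4, relative_cost=4),
    Tier("Replicated snapshots", rpo_hours=24, rto_hours=24, relative_cost=3),
    Tier("Disk-based backups", rpo_hours=24, rto_hours=72, relative_cost=2),
]


def cheapest_tier(required_rpo_hours, required_rto_hours, tiers=TIERS):
    """Return the lowest-cost tier meeting both objectives, or None if none does."""
    candidates = [
        t for t in tiers
        if t.rpo_hours <= required_rpo_hours and t.rto_hours <= required_rto_hours
    ]
    return min(candidates, key=lambda t: t.relative_cost) if candidates else None


print(cheapest_tier(24, 72).name)  # Disk-based backups
print(cheapest_tier(24, 24).name)  # Replicated snapshots
print(cheapest_tier(1, 8).name)    # Sync/semi-sync replication
print(cheapest_tier(0.1, 1))       # None: no tier in this catalog is fast enough
```

Keeping the tier list short mirrors the advice to strive for as few tiers as possible; a `None` result is the cue to revisit the requirement with stakeholders rather than to add a tier by default.
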
    30. DR Design Tips
        • Unified Backup/Management
          – Islands of disparate technology are an operational obstacle for maintaining and managing DR
          – Point solutions create complexity and uncertainty
        • Reduced storage consumption with shared indexes and deduplication databases
          – Enable disk-based backups with long-term retention
        • Management tool must be flexible to business change
          – Be as prepared as you can be to support new applications and services
    31. DR Design Tips (cont.)
        • Pre-position data
          – Datasets are getting too large for alternate methods
        • Automate DR workflows when possible
          – If not, create a run book documenting steps
        • Align with a technology partner that is invested in your DR success
          – I have one I can recommend ;)
    32. Takeaways
        • If you work in IT long enough, you will encounter a large-scale outage.
          – Be prepared!
        • Business Continuity is not the same as DR
          – Encourage your organization at large to build a business continuity plan.
        • Understand recovery objectives and the methods/technologies that enable your business to meet them.
        • Simplify whenever and wherever you can!
          – Unify the backup & restore process, this includes the management of those technologies that enable DR.
          – Workflow, automation, run books… these are your friends.
    33. Takeaways (cont.)
        • Consider the danger of incremental decisions over time
          – Build a blueprint and attempt to stick to it.
        • Don’t forget the 3-2-1!
          – Data in 3 places,
          – on two disparate pieces of hardware/software,
          – with at least one copy off-site.
        • DR is an insurance policy!
          – You need to spend wisely.
          – Calculate the cost of downtime for your organization.
          – Building a recovery service catalog will enable you to make wise decisions.
    34. Got questions? We’ve got answers.
    35. Thank you! Brian Anderson, Senior Systems Engineer, brian.anderson@eagleinc.com
    36. We’re making technology easy.
