Optimizing for performance and reducing latency is a hard problem. Examples include choosing a different algorithm or data structure, improving SQL queries, adding a cache, serving requests asynchronously, or low-level optimizations that require a deep understanding of the OS, kernel, compiler, or network stack. The engineering effort is usually nontrivial, and only if you're lucky will you see tangible results.
That said, some performance optimization techniques take only a few lines of code (some even ship in the standard library) yet can produce surprisingly noticeable results. One of them is to "fail fast, retry soon". These techniques are often neglected or taken for granted.
In distributed systems, a service or a database consists of a fleet of nodes that functions as one unit. It is not uncommon for some nodes to go down, usually for a short time. When this occurs, failures surface on the client side and can lead to an outage. To build resilient systems and reduce the probability of failure, we're going to explore three topics: timeouts, backoff, and jitter. We'll talk about what timeout to set, the pitfalls of retries, how backoff improves resource utilization, and how jitter reduces congestion. Furthermore, we'll look at an adaptive mechanism to dynamically adjust these configurations.
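To make the discussion concrete, here is a minimal retry loop with exponential backoff and full jitter. The function and parameter names are my own; the delay formula, a uniform sleep in [0, min(cap, base * 2^attempt)], is the "full jitter" variant discussed in the AWS writings referenced below:

```python
import random
import time

def call_with_retries(operation, max_attempts=3, base_delay=0.1, cap=2.0):
    """Retry a flaky operation, backing off exponentially with full jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the failure to the caller
            # Full jitter: sleep a uniform amount in [0, min(cap, base * 2^attempt)],
            # so simultaneous clients don't all retry at the same instant.
            time.sleep(random.uniform(0, min(cap, base_delay * 2 ** attempt)))
```

The randomness is the point: without jitter, every client that failed at the same moment retries at the same moment, producing synchronized waves of load.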
This is inspired by a real production use case where DynamoDB p99 and max latency went down from over 10s to ~500ms after employing these three techniques: timeouts, backoff, and jitter. AWS articles, specifically Marc Brooker's writings, and the SDKs' code have been great resources for diving into these techniques:
- Timeouts, retries and backoff with jitter in the AWS Builder's Library, 2019 (https://aws.amazon.com/builders-library/timeouts-retries-and-backoff-with-jitter/)
- Exponential Backoff and Jitter on the AWS Architecture Blog, 2016 (https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/)
- Fixing retries with token buckets and circuit breakers, Marc's Blog, 2022 (https://brooker.co.za/blog/2022/02/28/retries.html)
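The token-bucket idea from the last reference can be sketched in a few lines: each retry spends a token from a client-wide bucket that refills slowly, so retries dry up when the backend is persistently failing instead of amplifying the load. The class and parameter names here are my own, not Brooker's:

```python
import time

class RetryTokenBucket:
    """A client-wide retry budget (a sketch of the token-bucket approach).

    Each retry costs one token; tokens refill at a slow, fixed rate. When the
    bucket is empty, retries are skipped so a struggling backend isn't hammered.
    """

    def __init__(self, capacity=10.0, refill_per_sec=0.5):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def try_acquire(self, cost=1.0):
        """Return True (and spend a token) if a retry is allowed right now."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In steady state a few transient failures are absorbed by the budget, while a sustained outage exhausts it and turns retries off until the refill catches up.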
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
7. Minutes Matter
• The average cost of data center downtime across industries: approximately $5,600 per minute.
• For a partial data center outage, averaging 59 minutes in length, average costs were approximately $258,000.
• For total data center outages, which had an average recovery time of 134 minutes, average hourly costs were approximately $680,000.
• 93% of companies that lost their data for 10 days or more filed for bankruptcy within one year of the disaster, and 50% filed for bankruptcy immediately.
8. Humans Make Mistakes
Through 2015, 80% of outages impacting mission-critical services will be caused by people and process issues, and more than 50% of those outages will be caused by change/configuration/release integration and hand-off issues.
– Gartner Research
10. Do You Have A Plan?
41% of SMBs surveyed said that putting together a Disaster Recovery plan never occurred to them.
Less than half of SMBs back up their data weekly or more frequently, and only 23% back up daily.
Backups are not enough! The goal of a backup is to enable data restoration. A DR plan helps quickly restore operations.
DR is a holistic strategy for restoring the IT systems that power business operations, and it includes people, process, policies, and technology.
11. Downtime Perspective
How resilient is your DR plan? Time to recovery runs from minutes to weeks, depending on the scale of the failure:
– Device failure
– Cabinet failure
– Facility failure
12. Cost of Downtime Scenarios
Annual Revenue
Annual revenue: $15,000,000
Percentage of revenue from online: 90%
Average shopping hours per day: 12
Annual total revenue hours: 4,380
Cost of downtime per hour: $3,082
Sales Lost
Duration of event (days): 4
Hours of event: 96
Expected visits generated: 500,000
Conversion rate (visits to purchase): 6%
Average revenue per purchase: $500
Revenue per event: $15,000,000
Cost of downtime per hour: $156,250
App/Productivity
Annual revenue: $75,000,000
Number of employees: 400
Annual revenue per employee: $187,500
Work hours per year (2,000 hours/employee): 500,000
Employee revenue per hour: $150
Hours of downtime: 10
Percentage of employees affected by downtime: 20%
Cost of downtime per hour: $375,000
Event Revenue
Expected event revenue: $100,000
Event duration (days): 3
Event hours: 72
Cost of downtime per hour: $1,389
If you don’t know your actual cost of downtime, you are wasting time.
13. Cost of Downtime Scenarios: Annual Revenue Basis
Annual revenue: $15,000,000
Percentage of revenue from online: 90%
Average shopping hours per day: 12
Annual total revenue hours: 4,380
Cost of downtime per hour: $3,082
14. Cost of Downtime Scenarios: Single Event Revenue
Expected event revenue: $1,000,000
Event duration (days): 3
Event hours: 72
Cost of downtime per hour: $13,889
15. Cost of Downtime Scenarios: Sales Lost
Duration of event (days): 4
Hours of event: 96
Expected visits generated: 500,000
Conversion rate (visits to purchase): 6%
Average revenue per purchase: $500
Revenue per event: $15,000,000
Cost of downtime per hour: $156,250
16. Cost of Downtime Scenarios: Productivity Basis (App/Productivity)
Annual revenue: $75,000,000
Number of employees: 400
Work hours per year (2,000 hours/employee): 500,000
Employee revenue per hour: $150
Annual revenue per employee: $187,500
Hours of downtime: 10
Percentage of employees affected by downtime: 20%
Cost of downtime (10-hour outage): $120,000
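The four scenarios above boil down to simple arithmetic. Here is a minimal sketch (the function names are mine; the inputs are the example figures from these slides):

```python
# Cost-of-downtime sketches for the four slide scenarios.
# Inputs below are the example figures from the slides.

def annual_revenue_basis(annual_revenue, online_pct, revenue_hours):
    """Lost revenue per peak revenue hour."""
    return annual_revenue * online_pct / revenue_hours

def event_revenue_basis(expected_revenue, event_hours):
    """Lost revenue per hour of a fixed-length online event."""
    return expected_revenue / event_hours

def sales_lost_basis(visits, conversion_rate, avg_purchase, event_hours):
    """Lost sales per hour: expected purchases times average ticket."""
    return visits * conversion_rate * avg_purchase / event_hours

def productivity_basis(annual_revenue, work_hours, employees,
                       affected_pct, downtime_hours):
    """Total productivity loss for the outage (not per hour)."""
    revenue_per_employee_hour = annual_revenue / work_hours   # $150 here
    hourly_loss = employees * affected_pct * revenue_per_employee_hour
    return hourly_loss * downtime_hours

print(round(annual_revenue_basis(15_000_000, 0.90, 4380)))            # 3082
print(round(event_revenue_basis(1_000_000, 72)))                      # 13889
print(round(sales_lost_basis(500_000, 0.06, 500, 96)))                # 156250
print(round(productivity_basis(75_000_000, 500_000, 400, 0.20, 10)))  # 120000
```

The outputs reproduce the per-hour figures on slides 13 through 15 and the $120,000 productivity loss for the 10-hour outage on slide 16.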
24. RPO / RTO
Recovery Point Objective: how much data is lost.
Recovery Time Objective: how long to recover.
The timeline runs backward from the outage for RPO (weeks, days, hours, minutes, seconds) and forward for RTO (seconds, minutes, hours, days, weeks).
25. RPO / RTO
Recovery Point Objective (how much data is lost), from weeks down to seconds: tape, periodic replication, snapshots, replication.
Recovery Time Objective (how long to recover), from seconds out to weeks: clustering, snapshots, tape restore.
26. RPO / RTO: Cost Impact
The same spectrum as the previous slide (tape, periodic replication, snapshots, replication on the RPO side; clustering, snapshots, tape restore on the RTO side), with a cost-impact axis overlaid: the shorter the RPO and RTO, the higher the cost.
27. RTO/RPO Cost Expectations
A pricing matrix: tiers 1 through 4 run from HOT ($$$) through WARM ($$) to COLD ($); the RTO/RPO hour ranges shown include 0-2, 2-6, 0-24, 4-24+, and 24-48+.
Technologies plotted: GSLB / DNS failover, array-based replication, host-based replication, DB replication (transactional), DB replication (log shipping), VM replication, MBU (disk), MBU (tape), MBU (offsite).
Note: these are elements of DR, not an end-to-end solution; the missing pieces are process, policies, and procedures.
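One way to read the matrix: given a target RTO, the tier (and price band) falls out. An illustrative sketch; the 2-hour and 24-hour boundaries are my assumption loosely based on the ranges above, not an exact reading of the chart:

```python
# Illustrative only: map a required RTO (in hours) to a hot/warm/cold DR tier.
# Tier boundaries (2 h and 24 h) are assumptions, not from the slide itself.
def dr_tier(rto_hours: float) -> str:
    if rto_hours <= 2:
        return "hot"    # $$$: GSLB/DNS failover, array/host-based replication
    if rto_hours <= 24:
        return "warm"   # $$: log shipping, VM replication, disk backup
    return "cold"       # $: tape or offsite backup

print(dr_tier(1))    # hot
print(dr_tier(8))    # warm
print(dr_tier(48))   # cold
```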
40. Leverage Expertise
Common questions from customers:
Who owns the overall DR strategy?
Who will design it?
Who is going to manage and monitor it?
Who will perform the failover?
41. Designing Your DR Strategy
Businesses own the strategy.
Vendors enable the strategy.
The strategy is unique to your needs.
Testing matters.
42. Prioritizing Content/Apps
How do you prioritize?
What are you protecting?
– Business Operations
– Revenue
– Data
– Customers
– All of the above
43. Roles & Responsibilities
Role: Responsibility
– DR plan, failover plan / run book: Business
– “Pushing the failover button”: Business
– Failover process: Partner
– Replication of applications: Partner
– Virtual machine: Partner
– Database: Partner
– Guest OS: Partner
– Hypervisor: Partner
– Server: Partner
– Storage: Partner
– Network: Partner
– Data center: Partner
44. Testing
Companies don’t test their failover plan enough.
Some replication services charge per test: expensive.
The failover/failback process can be risky in production.
Risk dictates extensive planning around every test.
45. So, How Much Should You Spend On DR?
How much revenue will you lose?
How much else will you lose?
How much can you afford?
Base business decisions on fact, not emotion.
46. Summary
DR is Your Responsibility
Know Your Cost of Downtime
Prioritize Your Apps
Select The Right Tools
Select The Right Partner
Good afternoon and welcome. My name is Paul Croteau; I’m currently in my 9th year at Rackspace, the first 7 of those spent as an Enterprise Solution Engineer. These days I work as a member of our product team, helping to create technical content for our customers and the market as a whole. At the end of this session you will hopefully have a better understanding of how to assess your cost of downtime, have more clarity on the role a hosting provider plays in protecting your business, and see where EMC fits into the equation. A QUICK SHOW OF HANDS: raise your hand if you are here representing a small or medium-sized business. <WAIT> OK. My content today is very high level; I’m not going to dive into specific EMC storage management apps or hardware configuration settings. My goal today is to get you thinking about how to properly assess your cost of downtime and give you some tools and scenarios to help set expectations; your homework will be to take what you’ve learned and use that information to frame your DR solution architecture discussions back at the office. Let’s get started.
Here’s our agenda.
We know that data center outages and unplanned downtime are inevitable. IT downtime is like traffic: it’s not a matter of if it will happen, but when. There have been numerous public examples; we see it all the time in the news. Netflix suffered a very public and painful multi-hour outage last Christmas Eve. A variety of Amazon cloud outages have hit some very large and very popular web and social media properties. And thanks to social media, we can learn about and track the status of these outages and their recovery (or lack thereof) in real time. InformationWeek published a study showing that IT downtime costs us $26.5 billion in lost revenue every year. In another study, by Dun & Bradstreet, 59% of Fortune 500 companies experience a minimum of 1.6 hours of downtime PER WEEK. That works out to more than 6 hours per month per company. LET’S LOOK AT THE NUMBERS MORE CLOSELY.
Sources: Ronni J. Colville and George Spafford, “Configuration Management for Virtual and Cloud Infrastructures” (Gartner); Alan Arnold, “Assessing The Financial Impact Of Downtime,” BusinessComputingWorld, April 2010; Chandler Harris, “IT Downtime Costs $26.5 Billion In Lost Revenue,” InformationWeek, May 2011; Network World, “How much will you spend on application downtime this year?”, Aug. 2009; Bloomberg/Businessweek on Target’s website crash, Oct. 2011; The Australian, “Navitaire booking glitch earns Virgin $20m in compo,” April 2011.
Here is some 2012 financial data for the Fortune 500. Total revenue was almost $12T; total profit almost $1T. <NEXT> Here are the averages and medians. (ELABORATE) These are large numbers, so let’s break them down into smaller, more digestible chunks. <NEXT> These are the HOURLY downtime averages and medians for the F500. (ELABORATE) If we go back to that BusinessWeek study I mentioned on the previous slide, 6 hours of downtime per month at the median costs more than $7M in revenue loss, or more than $450k in lost profit per month. LET’S LOOK AT SOME MORE NUMBERS. <NEXT> (Working: dividing by the 8,760 hours in a year; $1.2M/hr × 295 companies (59% of 500) = $354M/week × 52 = $18.4B/year.)
<NEXT> According to InformationWeek, the average cost of unplanned downtime is about $5,600 per minute. Let’s take a look at a couple of outage scenarios and the average cost associated with each. <NEXT> For a partial DC outage, the average downtime is about an hour and costs approximately $260,000. <NEXT> For a total DC outage, the average recovery time is over two hours and costs, on average, $680,000. For larger companies, or companies with an ecommerce business model, that number could easily go much higher. What would that cost look like if you were down for a few days? Here’s a sobering factoid: <NEXT> “93% of companies that lost their data for 10 days or more filed for bankruptcy within one year of the disaster, and 50% filed for bankruptcy immediately.” What’s the main takeaway here? Your company’s survival depends on quantifying the impact of downtime. SO, DEPLOYING MORE TECHNOLOGY IS THE FIX FOR THIS? NOT NECESSARILY. <NEXT> (Source: National Archives & Records Administration in Washington.)
We also know that humans make mistakes, and Gartner predicts that over the next several years the vast majority of outages impacting mission-critical services will be caused by people and process issues, with more than half of those caused by change/configuration/release integration and hand-off issues. You can deploy wonderfully resilient technology and still suffer downtime. What I want to do today is help you understand how much your inevitable downtime might cost your company, so that you can determine how much to spend based on your business needs, financial needs, and your tolerance for risk. At the end of the day, the cost of any DR solution is analogous to an insurance policy: we may not like paying for it, but when you need it you are very happy you did so. <NEXT>
OK, four slides of data; is anyone nervous yet? ;-) I’m building a case to help you understand the potential impact of downtime in real terms. We all understand that downtime is a problem, but the data I’m presenting here should help us get past the emotional aspects of the topic and quantify the impact of downtime, so that you can go to your technical and financial stakeholders with a compelling case to take action to protect your business and your customers. In my many years as an engineer/architect I’ve talked to thousands of customers of all sizes. One thing that has remained constant in these interactions is that so many of the people I’ve talked to had been so busy running their businesses that they never pulled the trigger on a DR plan. And sadly, some of those conversations took place after disasters, when customers were trying to save what they could after the fact. TIME FOR SOME MORE NUMBERS. <NEXT>
Here is some data from a 2011 Symantec SMB Disaster Preparedness Survey; when I first read this I was surprised. <NEXT> 41% of SMBs never thought about putting together a DR plan. <NEXT> Less than half back up weekly; less than a quarter back up daily. Granted, this may not be a surprise to some of you, as you know that depending on the amount of data you’re backing up, restore times can take many hours or even days. <NEXT> Backups are NOT enough. Your data may be protected, but this approach doesn’t address downtime well. For some perspective, it’s difficult to say how long a restore will take because it depends on what kind of data you’re restoring, but on average, on a 1 Gbps network, a restore runs at roughly 60 GB per hour. <NEXT> Disaster recovery is a holistic strategy comprised of people, process, policies, and technology. Its focus is to restore the IT systems critical to business function. In other words, it helps keep the business running after a major disruption. Remember, a disaster could be Mother Nature’s wrath or a guy named Bob who installed a patch that broke a critical application. A DR plan helps keep the lights on and the company open for business. SO LET’S TALK ABOUT RECOVERY PERSPECTIVE. <NEXT>
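That 60 GB/hour figure makes restore-time estimates easy to sketch. A back-of-envelope helper; the rate is the rough average from the notes, an assumption rather than a benchmark:

```python
# Back-of-envelope restore time at an assumed average restore rate.
def restore_hours(data_gb: float, rate_gb_per_hour: float = 60.0) -> float:
    """Estimated hours to restore data_gb at the given rate."""
    return data_gb / rate_gb_per_hour

print(restore_hours(600))    # 10.0  -> a 600 GB restore takes ~10 hours
print(restore_hours(6000))   # 100.0 -> roughly 4 days for 6 TB
```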
You don’t get to choose the type of disaster that hits your business. <NEXT> You might suffer a localized failure like a single device, or a cabinet or two of damage due to a burst pipe or fire or electrical surge. <NEXT> Or, you might face a facility wide failure like a flood, cooling failure (melting servers), or massive explosion (bomb, plane, volcano). So, depending on the scale of the event, your IT team or outsourcing provider faces the task of replacing thousands of devices, or maybe even tens of thousands. ELABORATE on channel issues, manpower issues, MRR stack ranking, etc.AND, if you don’t have your data in more than a single data center, I don’t want to see you out there on Twitter griping about downtime. You can RAID this and cluster that, but when downtime hits, a single data center is still a single point of failure. Alright. We’ve seen how much downtime can cost, and we’ve seen that downtime is unpredictable. So WHAT THIS ALL BOILS DOWN TO: <NEXT>
If you don’t know your actual cost of downtime, you are wasting time talking about or designing a DR solution. And you may be wasting money (maybe lots of money) if you spend too much on DR. Let’s look at some specific business scenarios with real dollars tied to them to help gain even more perspective. <NEXT>
Here we have a company with annual revenue of $15M. Let’s assume this is an ecommerce site with limited retail space, where the vast majority of revenue comes from online sales. Since their market is mainly in the US, most of the committed transactions take place during business/daylight hours. So, assuming 12 hours of shopping every day, 365 days per year, that gives us 4,380 peak shopping hours and a cost of downtime of just over $3,000 per hour (that’s $15M × 90%, divided by the total number of hours, 4,380). A 12-hour outage would mean more than $36k in lost revenue. AND don’t forget to include the damage to your brand name, or lost future transactions from customers that went to a different vendor, not just for this purchase but for future purchases. LET’S LOOK AT ANOTHER SCENARIO. <NEXT>
Here’s some math for a single online event. This could be a weekend charity fundraiser or an annual pledge drive. Assuming a goal of $1M, every hour of this 72-hour event should generate an average of more than $13k. And with an event this short, you had better have a quickly scalable solution, something that lets you move fast in more than one data center location. <NEXT>
Here’s a different view on a single event, this time from the perspective of sales lost instead of pure revenue generation on the previous slide. Assume a four day online event, perhaps something over a holiday weekend. Lots of advertising dollars spent, print, television, online, etc. In this scenario we are using numbers that any good retail business should have readily available: things like historical web traffic stats, conversion rate percentages from click-through traffic, etc. Here we expect to see half a million visitors hitting our web property. We know in the past that we’ve had a great conversion rate of 6%. If the average price of our goods is $500 (maybe one of those fancy purses, or a wildly popular electronic device), we expect to generate $15M in sales over this four day period. And the math works out to more than $150k of downtime per hour. <NEXT>
OK, last one. This one looks at downtime on a productivity basis. Instead of focusing on sales or ecommerce, let’s assume we are talking about an outsourced back-office application (financial, CRM, email, etc.). Take your annual revenue and divide it by total employee work hours; this gives you an average revenue per employee per hour. Now multiply the hours of downtime by the number of employees affected by the outage and their hourly revenue, and you get $120k for the 10-hour outage in this example. Now, these examples have been very general; you can poke all sorts of holes or throw exceptions out there. They aren’t meant to be specific examples; they are meant to show averages and, more importantly, different ways of thinking about this topic. Now, wouldn’t it be great if you had a worksheet or app that you could play with to enter your own numbers and see how much downtime might cost your business? <NEXT>
This little web-based tool is available right now for you to try out. It’s a simple calculator with three of the business scenarios we just walked through. This link gets you to our DR Planning page; the link to the calculator is down the page just a bit. Give it a try, see how things look, and feel free to use it to help get your point across to the decision makers back home. <NEXT>
I’ve spent a lot of time talking about financial numbers, now let’s look at things from a process perspective. We all agree that downtime needs to be avoided and that it can get expensive very fast. Therefore, businesses need to determine how fast they want to get back online after an outage, and how far back in time they need to go to recover data and resume normal operations. <NEXT>
Here’s a common DR timeline of a generic business. Every week this company performs full data backups. <NEXT> Then an outage hits. Since this company isn’t using something really cool like VMware virtualization with Site Recovery Manager, they have to rely on recovering data from backup tapes. Maybe the tapes are still on site and were not damaged. Or, perhaps the tapes are off-site. <NEXT> The company has a documented goal of resuming online operations within 60 hours of an outage. Plenty of time to get your tapes delivered and re-load all of your data. <NEXT> Unless your tape machine was destroyed by the disaster, making your tapes useless until you replace that hardware. So, <NEXT> this company missed its desired RTO by 18 hours. <NEXT>
So, when measuring how far back you need to go to get useful data, and how fast you want to resume business operations after a disaster, different technologies get you there in different amounts of time. <NEXT> Tape is at one end of the spectrum, on the outer limits of this timeline, while things like replication and clustering are closer to the center. <NEXT> And as you might expect, the faster you want to recover, the more money you will need to spend. And, the longer it takes you to recover, the deeper the potential financial impact. <NEXT>
Here’s a graphic to help set cost expectations around certain technologies. Pricing ranges from hot to cold; the tier rankings are just a way to group things. The RTO and RPO numbers here are telling, because while DR recovery scenarios are unique, you generally find that each Objective falls into one of three timeframes. <ELABORATE ON RTO/RPO ROWS> Array-based = replicates at the storage layer / vSphere VM replication = at the hypervisor layer. Array = can replicate physical servers / vSphere cannot. Array = configured per LUN or volume / Host = configured per VM. Array = managed by storage engineers / VM = managed by sys admins.
OK, we’ve covered lots of numbers so far. Let’s take a look at some architecture drawings. <NEXT>
Walk the audience through the basics of redundancy throughout the typical hosting tiers, while pointing out EMC products in the mix. BUT, this focuses on a single data center, which is still a single point of failure.
This slide expands the config to a second data center, points out a smaller DR config, plus the EMC technologies in play.
Earlier I mentioned how we don’t get to choose our disasters. You might suffer a localized failure, like a cabinet or two of damage due to a burst pipe, fire, or electrical surge. Or, you might face a facility-wide failure like a flood, a cooling failure (melting servers), or a massive explosion (bomb, plane, volcano). Now your provider faces replacing thousands of devices, or even tens of thousands: channel issues, manpower issues, MRR stack ranking, etc. Don’t complain about downtime if you are running out of a single data center.
Showing how you can deploy Dev/Staging in a second DC and then use it as your DR site. Get more bang for your hosting dollar.
If you need to depend on your DR location longer than expected, you can make it more robust.
Here are the areas where Rackspace can lend a helping hand, and the areas that the customer must own. The top level is the holistic DR strategy. This is owned by the customer. Remember when we defined disaster recovery? It encompasses more than just the technology: also the policies, people, and process. The customer is responsible for creating the DR plan, training the appropriate employees, creating the failover run book, testing the failover often, making the “go-time” decision to fail over after a disruption occurs, and then deciding to fail back once the primary DC comes online. Rackspace is responsible for failing over to the target DC once the authorization has been given by the customer. Rackspace also monitors the VM Replication virtual appliance and alerts the customer when a replication fails to complete. As part of the Managed Virtualization service, Rackspace also manages the VM, guest OS, and hypervisor layer. In addition to the software layers, the dedicated hardware, network, and the DC are also covered. Failover is the customer’s responsibility, but we assist and are on-call during the process.