The High Availability Mantra - How DCIM Can Help


This is the last of a 3-part series "DCIM for High Availability" presented by GreenField Software. It first defines "high availability" and then gives instances of some recent high profile Data Center failures in spite of their robustness and extreme in-built redundancies. The business impact of Data Center failures is highlighted.

Data Center topology has changed in the last two decades as a result of the High Availability Mantra, and new tools are required to manage the Modern Data Center effectively. DCIM Software has matured to a level where it is no longer optional. Data Centers of all sizes need to implement DCIM, not just to reduce the risk of Data Center failures, but also to arrest increasing capital costs and operating expenses.
GFS Crane DCIM Software is a great example, as the two DCIM Case Studies in this presentation show.
The following GFS Crane capabilities have been included in this presentation:
- Improved Availability through Predictability, Visibility and Change Tracking.
- Controlling Capex Costs through better visibility of under-utilized capacities, thereby deferring expensive capital expenditures and minimizing stranded capacities.
- Reducing Operating Expenses: Real-time monitoring and multi-level PUE help to reduce power costs; automation of processes improves productivity; and rationalization of assets reduces AMC and space rentals.
The presentation concludes with two GFS Crane DCIM Case Studies: in Financial Services and Telecom verticals.

GreenField Software’s mission is to help Data Centers control capital expenditures, reduce operating expenses, and mitigate the risks of Data Center failures. Besides DCIM Software, GFS offers Data Center Advisory Services in the areas of best practices, capacity planning, energy efficiency, and business continuity of data centers.



  1. High Availability Mantra: How DCIM Can Help
  2. Today’s Topics
     - High Availability Mantra Revisited
     - Anatomy of a DCIM Software: GFS Crane
     - How GFS Crane DCIM Delivers Higher Availability
     - How GFS Crane DCIM Helps to Reduce Costs
     - GFS Crane DCIM Case Studies
  3. The High Availability Mantra Revisited
     Amazon Data Centers (built to Tier 4 standards and with an expected availability of 99.995%) had two outages in 2012 – each over 3 hours!
     - Tier 3/Tier 4 just defined by hardware redundancies
     - Glaring gaps in operating procedures to prevent fatal human errors
     - Lack of purpose-built BCP software to predict failures
     - Lack of chain of custody to detect root cause

     | Availability % | Downtime per year | Downtime per month* | Downtime per week |
     |---|---|---|---|
     | 99% ("two nines") | 3.65 days | 7.20 hours | 1.68 hours |
     | 99.5% | 1.83 days | 3.60 hours | 50.4 minutes |
     | 99.8% | 17.52 hours | 86.23 minutes | 20.16 minutes |
     | 99.9% ("three nines") | 8.76 hours | 43.8 minutes | 10.1 minutes |
     | 99.95% | 4.38 hours | 21.56 minutes | 5.04 minutes |
     | 99.99% ("four nines") | 52.56 minutes | 4.32 minutes | 1.01 minutes |
     | 99.999% ("five nines") | 5.26 minutes | 25.9 seconds | 6.05 seconds |
     | 99.9999% ("six nines") | 31.5 seconds | 2.59 seconds | 0.605 seconds |
     | 99.99999% ("seven nines") | 3.15 seconds | 0.259 seconds | 0.0605 seconds |
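The downtime figures on this slide follow from simple arithmetic, which can be sketched as below. This is an illustration only, not part of the original deck: the function name `downtime_hours` and the period lengths (365-day year, 30-day month, 7-day week) are assumptions, so a few results may differ slightly from the slide's entries.

```python
# Illustrative sketch: convert an availability percentage into allowed downtime.
# Period lengths are assumptions (365-day year, 30-day month, 7-day week).
HOURS_PER_PERIOD = {
    "year": 365 * 24,
    "month": 30 * 24,
    "week": 7 * 24,
}


def downtime_hours(availability_pct: float, period: str = "year") -> float:
    """Return the hours of downtime implied by an availability percentage."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return (1.0 - availability_pct / 100.0) * HOURS_PER_PERIOD[period]


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.995, 99.999):
        minutes = downtime_hours(pct) * 60
        print(f"{pct}% availability allows {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, a 99.995% target allows roughly 26 minutes of downtime per year, which puts the two outages of over 3 hours each in perspective.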
  4. Did You Know? 90% of DC Failures Are From Common Preventable Causes
  5. Did You Know? Average Failure of an Online System: 36 hours per annum. That’s only 99.6% Uptime
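The 99.6% figure can be checked by going in the other direction, from downtime hours to an uptime percentage. The helper name `uptime_pct` and the 365-day year are assumptions made for this sketch.

```python
# Illustrative sketch: derive an uptime percentage from annual downtime hours.
# Assumes a 365-day year (8,760 hours).
def uptime_pct(downtime_hours_per_year: float, hours_in_year: float = 365 * 24) -> float:
    """Return uptime as a percentage of the year."""
    if not 0.0 <= downtime_hours_per_year <= hours_in_year:
        raise ValueError("downtime must be between 0 and the hours in a year")
    return (1.0 - downtime_hours_per_year / hours_in_year) * 100.0


if __name__ == "__main__":
    print(f"36 hours of downtime per year is {uptime_pct(36):.2f}% uptime")
```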
  6. Did You Know? 75% of Businesses Without a BC Plan Fail Within 3 Years after a Major Disruption in their IT Systems
  7. Anatomy of a DCIM Software: GFS Crane
  8. Improves Availability: Predictability, Visibility & Change Tracking
     - Predictive Analytics
       - Advanced Alarm Management and analytics helps in failure predictability, faster turn-around-time, improved availability and SLA
       - Consolidation of alarms from different facilities helps in centralized monitoring
     - Visibility from Power Chain
       - Improved visibility of the power chain and the relationships among critical components of the infrastructure helps in better impact analysis of device malfunction or failure and doing RCA
     - Change Tracking
       - Change Tracking in the data center environment helps in doing impact analysis of any change and root cause analysis of any outage occurring due to a change
  9. Improves Availability: Predictability from Proactive Alarms
     - Proactive Real-time alarms
       - Alarms on power, PUE and environmental conditions like temperature, humidity, smoke, fire, WLD, door-open and motion
       - Alarms can be sent on e-mail & SMS
     - Alarm Dashboard
       - Alarms from multiple data centers are consolidated on a dashboard
       - Analysis on alarms based on severity, type, source, duration etc.
     - Advanced Alarm Management helps in failure predictability, faster turn-around-time, improved availability & SLA compliance
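Consolidating alarms from several data centers and analysing them by severity, type, or source might look like the sketch below. It is not GFS Crane code: the `Alarm` fields, the severity names, and the helpers `consolidate` and `summarize` are all hypothetical.

```python
# Illustrative sketch only; the data model and severity levels are assumptions.
from collections import Counter
from dataclasses import dataclass

SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}  # hypothetical levels


@dataclass(frozen=True)
class Alarm:
    site: str            # data center the alarm came from
    source: str          # device or sensor that raised it
    kind: str            # e.g. "power", "temperature", "door-open"
    severity: str        # one of SEVERITY_ORDER, or anything else (sorted last)
    duration_min: float  # how long the alarm has been active


def consolidate(*feeds):
    """Merge per-site alarm lists into one list, most severe and longest first."""
    merged = [alarm for feed in feeds for alarm in feed]
    return sorted(
        merged,
        key=lambda a: (SEVERITY_ORDER.get(a.severity, len(SEVERITY_ORDER)), -a.duration_min),
    )


def summarize(alarms, field):
    """Count alarms by one attribute, such as "severity", "kind" or "site"."""
    return Counter(getattr(alarm, field) for alarm in alarms)
```

A dashboard could call `consolidate` on each site's feed and then `summarize(alarms, "severity")` to populate its counters.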
  10. Improves Availability: Visibility from Power Chain
      - Asset Relationship Mapping: maps relationships among critical components of electrical infrastructure
        - Create power chain for electrical infrastructure
        - Map asset relationships and redundancies starting from power source to customers and applications
      - Improved visibility of the power chain and relationships among critical components of the infrastructure help in better impact analysis of device malfunction or failure and doing root cause analysis
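One way to picture impact analysis over a power chain is as a graph walk in which an asset goes down only when every one of its upstream feeds is down. The sketch below assumes that rule; the asset names, the `EXAMPLE_CHAIN` mapping, and the `impacted` function are hypothetical and do not describe how GFS Crane models the chain.

```python
# Illustrative sketch only; asset names and the redundancy rule are assumptions.
# Each key feeds power to the assets in its list.
EXAMPLE_CHAIN = {
    "utility": ["ups_a", "ups_b"],
    "ups_a": ["pdu_a"],
    "ups_b": ["pdu_b"],
    "pdu_a": ["rack_1", "rack_2"],  # rack_1 is dual-fed, rack_2 is single-fed
    "pdu_b": ["rack_1"],
}


def impacted(feeds, failed):
    """Return the assets that lose power when `failed` goes down.

    An asset is treated as down only when all of its upstream feeds are down,
    which is how redundancy is represented in this sketch.
    """
    upstream = {}
    for parent, children in feeds.items():
        for child in children:
            upstream.setdefault(child, set()).add(parent)

    down = {failed}
    changed = True
    while changed:  # repeat until no further asset loses all of its feeds
        changed = False
        for asset, parents in upstream.items():
            if asset not in down and parents <= down:
                down.add(asset)
                changed = True
    return down - {failed}


if __name__ == "__main__":
    for device in ("pdu_a", "ups_a", "utility"):
        print(device, "->", sorted(impacted(EXAMPLE_CHAIN, device)))
```

In this example, losing `pdu_a` takes down only the single-fed `rack_2`, while the dual-fed `rack_1` stays up on `pdu_b`.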
  11. Improves Availability: Change Tracking
      - Audit Trail of DC Configuration Changes
        - Maintains an audit trail for all Installation/Move/Add/Change activity in the data center
        - Integration with existing ITSM tool enables running the tracked changes through a workflow system for change approvals
      - Tracking changes in the data center environment helps in doing impact analysis of any change and root cause analysis of any outage occurring due to a change
  12. Reduces Cost: Capex & Opex
      - Reduces Capex
        - Better visibility helps discovering under-utilized computing capacities -> defers capex purchases
        - Better visibility helps avoiding stranded capacities on rack space & power use: maximizes utilization of available capacities
      - Reduces Opex
        - Better monitoring & analytics reduces operating cost on power
        - Automation of processes like Asset Tracking, Provisioning & Monitoring improves productivity
        - Rationalizing asset base helps in lower maintenance costs like equipment AMC
  13. Reduces CapEx: Monitoring IT Utilization
      - Hidden Computing Capacity: visibility of hidden compute capacity
        - Calculates the average utilization of all computing devices in the data center
        - Identifies the unused compute capacity
      - Repurpose Hardware: under-utilized servers can be repurposed
        - Based on power consumption & utilization patterns, hardware specs and age, ‘Repurpose Candidates’ are identified that helps in deferring new server hardware purchase
      - Discovery of hidden compute capacity defers capital investment on new server hardware and software licenses
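The idea of averaging utilization and flagging repurpose candidates can be sketched as follows. This is a simplification that looks only at utilization and age, whereas the slide also mentions power consumption and hardware specs; the `Server` fields, the thresholds, and the function names are assumptions.

```python
# Illustrative sketch only; fields and thresholds are hypothetical.
from dataclasses import dataclass
from statistics import mean


@dataclass
class Server:
    name: str
    avg_cpu_pct: float  # average CPU utilization over the observation window
    age_years: float


def average_utilization(servers):
    """Mean CPU utilization across all servers, in percent."""
    return mean(server.avg_cpu_pct for server in servers)


def repurpose_candidates(servers, max_cpu_pct=10.0, max_age_years=4.0):
    """Names of lightly used servers that are still young enough to reuse."""
    return [
        server.name
        for server in servers
        if server.avg_cpu_pct <= max_cpu_pct and server.age_years <= max_age_years
    ]


if __name__ == "__main__":
    fleet = [Server("web-01", 5.0, 2.0), Server("db-01", 65.0, 3.0), Server("old-01", 8.0, 7.0)]
    print(f"Average utilization: {average_utilization(fleet):.1f}%")
    print("Repurpose candidates:", repurpose_candidates(fleet))
```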
  14. Reduces Capex: Minimizing Stranded Capacities
      - Hidden Power Capacity: visibility of consumed power against max capacity in a rack
        - Provides real-time information on actual IT load in a rack
        - Provides maximum power capacity
        - Provides available power capacity
      - Hidden Space Capacity: visibility of occupied rack space against max available space
        - Provides real-time information on occupied space in the rack in RU
        - Provides maximum space capacity
        - Provides available space capacity
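Available capacity is the maximum minus what is in use, and capacity becomes stranded when one resource is free but the other has run out. The sketch below works under that assumed definition; the `Rack` fields and the minimum-usable thresholds are hypothetical.

```python
# Illustrative sketch only; fields, thresholds and the stranding rule are assumptions.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rack:
    name: str
    max_power_kw: float
    used_power_kw: float
    max_space_ru: int
    used_space_ru: int

    @property
    def available_power_kw(self) -> float:
        return self.max_power_kw - self.used_power_kw

    @property
    def available_space_ru(self) -> int:
        return self.max_space_ru - self.used_space_ru


def stranded(rack: Rack, min_power_kw: float = 0.5, min_space_ru: int = 1) -> Optional[str]:
    """Return which resource is stranded ("space" or "power"), or None.

    Space is stranded when rack units are free but too little power is left
    to use them; power is stranded in the opposite case.
    """
    has_power = rack.available_power_kw >= min_power_kw
    has_space = rack.available_space_ru >= min_space_ru
    if has_space and not has_power:
        return "space"
    if has_power and not has_space:
        return "power"
    return None
```

For example, a full 42 RU rack drawing 2 kW of a 6 kW budget would report 4 kW of stranded power.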
  15. Reduces OpEx: Power Costs
      - Multi-level PUE Comparison
        - Compares PUE calculated at multiple levels and identifies power distribution losses that can be rectified to improve efficiency and reduce OpEx on Power
      - Detect Power Distribution Loss
        - L1 PUE: UPS Output
        - L2 PUE: PDU Output
        - L3 PUE: Device-level reading
      - Detection of power distribution losses in the electrical infrastructure helps in improving energy efficiency of the data center and reduce operating cost on power
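PUE is total facility power divided by IT load, so measuring the IT load at the three points named on the slide gives three PUE values, and the gaps between the measurement points are the distribution losses. The sketch below assumes all readings are simultaneous and in kW; the function names and the sample numbers are hypothetical.

```python
# Illustrative sketch only; sample readings are made up for the example.
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness: total facility power over IT load."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_load_kw


def multi_level_pue(total_facility_kw, ups_output_kw, pdu_output_kw, device_kw):
    """PUE with the IT load taken at the UPS output (L1), PDU output (L2) and devices (L3)."""
    return {
        "L1": pue(total_facility_kw, ups_output_kw),
        "L2": pue(total_facility_kw, pdu_output_kw),
        "L3": pue(total_facility_kw, device_kw),
    }


def distribution_losses(ups_output_kw, pdu_output_kw, device_kw):
    """Power lost between successive measurement points, in kW."""
    return {
        "ups_to_pdu": ups_output_kw - pdu_output_kw,
        "pdu_to_device": pdu_output_kw - device_kw,
    }


if __name__ == "__main__":
    print(multi_level_pue(180.0, 100.0, 96.0, 90.0))
    print(distribution_losses(100.0, 96.0, 90.0))
```

Because each downstream reading excludes the losses before it, the L3 value is the highest of the three, and a large jump between levels points to where the distribution loss sits.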
  16. Reduces Opex: Process Automation & Improved Productivity
      - Advanced Asset Management: automated discovery and inventory of both IT and infrastructure assets
        - Intelligent assets are automatically discovered using SNMP/IPMI
        - Manufacturer Repository contains information on static attributes of assets
        - Assets data imported from spreadsheets or asset management tool
        - Single management console to manage IT and non-IT assets
        - Maintenance management for assets done using plug-ins that sends scheduler based proactive alerts
        - Workflow-based auto-provisioning improves speed and reduces errors
  17. Reduces Opex: Asset Rationalization
      - Asset Management module tracks & maintains inventory of all assets (IT & non-IT) in the data Centre.
      - Helps identify legacy servers and replacement candidates
      - Reduces AMC, space rentals
      - [Diagram labels: Asset Rationalization, Server Virtualization, Capacity Planning, Data Center Consolidation, GFS Crane DC DCIM, Legacy Data Center, Server & Rack Consolidation, Multiple Data Centers]
  18. How GFS Crane DCIM Helps
      - Helps Data Center Manager avoid unnecessary over-provisioning
      - Helps plan investments and new capacity
      - Helps reduce the capital costs
      - Helps reduce power use and other operating costs
      - Helps reduce risk of failures through critical alerts
      - Helps adapting to technical and business change more easily
      - Helps improvement plans through real-time metrics & dashboard
  19. GFS Crane DCIM Case Study 1: Financial Services
      - Industry: Project Financing & Mutual Funds
      - Data Center Location: India
      - Data Center Details: Tier III certified by 451 Research, Energy Efficient ‘green’ Data Center certified by TÜV Rheinland
      - DCIM Implementation date: January, 2012
      - Business requirement driving DCIM implementation:
        - Improve energy efficiency through better energy management
        - Comply with Green Grid recommendations and adopt best practices in data center operations
        - Improve data center availability and meet business SLA through better monitoring, failure prediction and faster turn-around-time
      - Integration Touch Points:
        - Power Systems: LT transformer panels, UPS, PDUs and Distribution Panels, BUSBAR panels, Multifunction Energy Meters.
        - Environmental Systems: PAC units, temperature and humidity probes
        - Servers, Network devices, Storage devices
        - Siemens Building Management System
  20. GFS Crane DCIM Case Study 2: Telecom
      - Industry: Mobile Operator
      - Data Center Location: South Asia
      - Data Center Details: Multiple data centers spread across 4 locations, covering 8,500 sq.ft. of whitespace and housing 320 racks
      - DCIM Implementation Date: Ongoing
      - Business requirement driving DCIM implementation:
        - Improve data center efficiency through better energy management
        - Improve operational efficiency through better asset management, capacity planning and converged infrastructure monitoring capability
        - Improve data center availability and meet business SLA through better monitoring, failure prediction and faster turn-around-time
      - Integration Touch Points:
        - Power Systems: LT transformer panels, UPS, A/C & D/C PDUs and Distribution Panels, BUSBAR panels, Multifunction Energy Meters.
        - Environmental Systems: PAC units, temperature and humidity probes
        - Diesel generator, flow and level sensors
        - IBM Netcool (ITSM), VESDA, ACS and IP Surveillance
  21. Thank You
      - http://www.greenfieldsoft.com
      - Email: sales@greenfieldsoft.com
      - See other two in this series:
        - The Modern Data Center Topology: The High Availability Mantra
        - Data Center Infrastructure Management: ERP for the Data Center Manager
