Maximize Availability and Uptime by Clustering Your Physical Data Centers Within Metro Distances

As IT infrastructures continue to be virtualized, data center architects are looking for ways to increase the mobility and high availability of virtual machines beyond a single data center. Expanding data centers across multiple locations has become an increasingly common strategy to address high-availability and disaster recovery needs for businesses with high uptime requirements. View this Webinar and learn how you can: Accelerate tier-1 virtualization adoption by providing best-in-class SLAs. Dynamically move workloads within and across data centers to avoid contention, and support utility-on-demand models. Provide automated recovery of applications with high return on investment. For more information on the virtualized data center please read our white paper: http://www.hds.com/assets/pdf/solution-profile-enabling-the-virtual-data-center-hitachi-virtual-storage-platform-for-vmware.pdf

  1. MAXIMIZE AVAILABILITY AND UPTIME BY CLUSTERING PHYSICAL DATA CENTERS WITHIN METRO DISTANCES
     MICHAEL NAKAMURA, SENIOR SOLUTIONS ARCHITECT
     HENRY CHU, SENIOR SOLUTIONS ARCHITECT
     OCTOBER 2012
  2. WEBTECH EDUCATIONAL SERIES
     Maximize Availability and Uptime by Clustering Your Physical Data Centers within Metro Distances
     As IT infrastructures continue to be virtualized, data center architects are looking for ways to increase the mobility and high availability of virtual machines beyond a single data center. Expanding data centers across multiple locations has become an increasingly common strategy to address high-availability and disaster recovery needs for businesses with high uptime requirements.
     Join Hitachi Data Systems for this Webinar and learn how you can:
     • Accelerate tier-1 virtualization adoption by providing best-in-class SLAs
     • Dynamically move workloads within and across data centers to avoid contention, and support utility-on-demand models
     • Provide automated recovery of applications with high return on investment
  3. UPCOMING WEBTECHS
     November
     • Comprehensive and Simplified Management for VMware vSphere environments, November 14, 11 a.m. PT, 2 p.m. ET
     • Microsoft SQL Server 2012 Data Warehouse solutions on Hitachi converged platform, November 27, 9 a.m. PT, 12 p.m. ET
     Check www.hds.com/webtech for
     • Links to the recording, the presentation and Q&A (available next week)
     • Schedule and registration for upcoming WebTech sessions
  4. AGENDA
     • Customer challenges
     • VMware Metro Storage Cluster overview
     • Hitachi Storage Cluster for VMware vSphere technical review
     • Best practices
     © Hitachi Data Systems Corporation and Brocade Communications Systems, Inc. 2012. All Rights Reserved.
  5. CUSTOMER CHALLENGES
     • Downtime
       ‒ Key component(s) failure in single data center
       ‒ Planned maintenance
       ‒ No disaster recovery without downtime
     • Reluctance to migrate mission-critical apps
       ‒ Fear of performance degradation
       ‒ Data recovery is an issue; inability to meet recovery time objectives (RTO) and recovery point objectives (RPO)
     • Lack of a single point of management across data centers
     • No ability to pool resources across data centers limits application deployment flexibility
  6. VMWARE METRO STORAGE CLUSTER OVERVIEW
     WHAT IS A METRO STORAGE CLUSTER?
     • VMware vSphere Metro Storage Cluster (vMSC) is a new certified configuration in which a storage device spans multiple geographical storage systems
     • Hitachi Storage Cluster certification is complete – on VMware Hardware Compatibility List
     • Implemented for disaster and downtime avoidance
  7. WHY USE A METRO STORAGE CLUSTER?
     • Maximize availability and uptime by clustering physical data centers within metro distances
     • Leverage VMware infrastructure high-availability benefits with storage-based synchronous replication awareness
     • Stretched storage clusters provide new architectures that enable
       ‒ Nondisruptive workload mobility
       ‒ Cross-site load balancing of resources
       ‒ Avoidance of disaster and downtime
     • Uniform host access model – provides a single view of a datastore across sites
     • Data consistency across 2 sites in the case of failure
  8. HITACHI STORAGE CLUSTER FOR VMWARE VSPHERE: INFRASTRUCTURE OVERVIEW
  9. HITACHI STORAGE CLUSTER FOR VMWARE VSPHERE: MANAGEMENT OVERVIEW
     • vCenter Server contains these management components:
       ‒ vCenter
       ‒ Hitachi Dynamic Link Manager (HDLM) command
       ‒ vSphere CLI
       ‒ CCI Raid Manager
     • Cmd Dev presented from both Hitachi Virtual Storage Platform (VSP) systems
     • Best practice: Place vCenter at a 3rd site to ensure virtual infrastructure management is not affected from any 1 site during a sitewide failure
  10. HITACHI STORAGE CLUSTER FOR VMWARE VSPHERE: ARCHITECTURE OVERVIEW
      • Hitachi High Availability Manager (HAM) installed on each VSP
      • P-VOL and S-VOL seen as a single volume
        ‒ RCU takes MCU serial number upon failover
      • Write data transferred from MCU to RCU cache via synchronous Hitachi TrueCopy®
        ‒ Supports external storage and Hitachi Dynamic Provisioning volumes
      • Quorum disk on external storage
        ‒ Used by both MCU and RCU
        ‒ Unique quorum disk for each MCU-RCU relationship
        ‒ Allows verification of data integrity before failover
        ‒ Denotes location of most recent host data
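
Because write data reaches the RCU cache synchronously, each host write has to wait for at least one round trip over the inter-site link. The sketch below estimates that floor from fiber distance alone. The figure of roughly 200,000 km/s for light in optical fiber is a general physical approximation and does not come from the slides; the function name and the sample distances are illustrative assumptions. Real latency will be higher once switches, protocol handshakes and array processing are included.

```python
# Rough lower bound on the latency that a synchronous remote write adds.
# Assumption (not from the slides): light covers about 200 km per millisecond
# in optical fiber, i.e. roughly two thirds of its speed in a vacuum.
FIBER_KM_PER_MS = 200.0


def min_added_write_latency_ms(distance_km: float, round_trips: int = 1) -> float:
    """Return the propagation delay of `round_trips` site-to-site round trips."""
    if distance_km < 0 or round_trips < 1:
        raise ValueError("distance_km must be >= 0 and round_trips >= 1")
    one_way_ms = distance_km / FIBER_KM_PER_MS
    return 2 * one_way_ms * round_trips


if __name__ == "__main__":
    for km in (10, 50, 100):  # hypothetical metro distances
        print(f"{km:>4} km: at least {min_added_write_latency_ms(km):.2f} ms per write")
```

The `round_trips` parameter is there because a replication protocol may need more than one exchange per write; how many exchanges TrueCopy actually uses is not stated in the deck.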
  11. HITACHI DYNAMIC LINK MANAGER (HDLM) WITH HIGH AVAILABILITY MANAGER (HAM): INTRODUCTION
      • Virtual storage represents P-VOL and S-VOL as a single volume
        ‒ P-VOL and S-VOL have same VOL ID in SCSI inquiry
      • HDLM in ESX manages path selection
        ‒ Active I/O sent to P-VOL
        ‒ S-VOL in standby state in normal operation
        ‒ Load balancing algorithm
          • Extended round robin
          • Extended least I/O
          • Extended least blocks
      • HAM uses synchronous TrueCopy to replicate from P-VOL to S-VOL
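
The path-selection behaviour on this slide can be illustrated with a small model: I/O goes to online P-VOL paths, S-VOL paths stay in standby, and among the usable paths the one with the fewest pending requests is chosen. This is only a sketch of a plain "least I/O" policy. It is not HDLM's implementation, it does not model what makes the HDLM algorithms "extended", and the `Path` class, field names and path names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    volume: str              # "P-VOL" or "S-VOL"
    online: bool = True
    outstanding_io: int = 0  # requests currently in flight on this path


def select_path(paths: list[Path]) -> Path:
    """Prefer online P-VOL paths; use S-VOL paths only when no P-VOL path is left."""
    for volume in ("P-VOL", "S-VOL"):
        candidates = [p for p in paths if p.online and p.volume == volume]
        if candidates:
            # Plain "least I/O" choice: the path with the fewest pending requests.
            return min(candidates, key=lambda p: p.outstanding_io)
    raise RuntimeError("no online path to either volume")


if __name__ == "__main__":
    paths = [
        Path("hba0-pvol", "P-VOL", outstanding_io=4),
        Path("hba1-pvol", "P-VOL", outstanding_io=1),
        Path("hba0-svol", "S-VOL"),
    ]
    print(select_path(paths).name)  # least-loaded P-VOL path
    for p in paths[:2]:
        p.online = False            # simulate losing every P-VOL path
    print(select_path(paths).name)  # standby S-VOL path takes over
```

In the real product the move to S-VOL also involves a storage-level failover; the model above only shows the ordering of preferences.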
  12. HDLM WITH HAM: VMOTION AND DYNAMIC RESOURCE SCHEDULER
      • vMotioned VMs
        ‒ Hosts within the cluster will use active paths to P-VOLs
  13. HDLM WITH HAM: VMWARE HIGH AVAILABILITY (HA)
      • VMware HA failover
        ‒ VMs failover to existing ESX nodes in HA cluster
        ‒ I/O continues to active P-VOL paths
  14. HDLM WITH HAM: PATH FAILOVER
      • When paths to P-VOL fail, HDLM PSP handles the path failover
  15. HDLM WITH HAM: STORAGE FAILOVER
      • When all paths to P-VOL or MCU fail
        ‒ Paths to S-VOL become active
        ‒ Verify data integrity with quorum disk before failover
        ‒ RCU splits S-VOL with write-enabled status
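
The ordering on this slide (check the quorum disk first, then split the pair and activate the S-VOL paths) can be written as a tiny state transition. This is a simplified model and not HAM's logic: the `PairState` class, its fields and the decision to raise an error when the quorum check fails are assumptions made for illustration, since the deck does not say what happens in that case.

```python
from dataclasses import dataclass


@dataclass
class PairState:
    pvol_paths_online: int
    active_volume: str = "P-VOL"
    svol_write_enabled: bool = False


def fail_over_if_needed(state: PairState, quorum_confirms_svol_current: bool) -> PairState:
    """Promote the S-VOL only when every P-VOL path is gone and the quorum check passes."""
    if state.pvol_paths_online > 0:
        # Some P-VOL path survives, so ordinary path failover is enough.
        return state
    if not quorum_confirms_svol_current:
        # Assumed behaviour: refuse to promote a copy that may be stale.
        raise RuntimeError("quorum check failed; S-VOL not promoted")
    state.svol_write_enabled = True  # pair is split, S-VOL accepts writes
    state.active_volume = "S-VOL"    # paths to S-VOL become active
    return state


if __name__ == "__main__":
    state = PairState(pvol_paths_online=0)
    print(fail_over_if_needed(state, quorum_confirms_svol_current=True))
```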
  16. HDLM WITH HAM: PATH RECOVERY
      • Storage recovery will require reverse sync
        ‒ pairresync -swaps / -swapp
      • When storage recovers and paths to P-VOL recover
        ‒ Paths to S-VOL become standby
        ‒ P-VOL paths become active
  17. QUORUM FAILURE
      • Remote mirroring between P-VOL and S-VOL stops
      • P-VOL continues to process host I/O
  18. REPLICATION LINK FAILURE
      • P-VOL continues to process host I/O
  19. HDLM WITH HAM: SITE FAILURE
      • VM failover handled by VMware HA
      • Storage failover handled by HAM
      • Path failover to replicated storage handled by HDLM
  20. WAN LINK FAILURE (UNDER REVIEW)
      • Link for replication and remote site has failed but links to local site are active
        ‒ P-VOL cannot process host I/O
        ‒ HDLM switches the I/O path to S-VOL
        ‒ Site 1: I/O paths to S-VOL also cannot be used, so Site 1 cannot continue to access both P-VOL and S-VOL
        ‒ Site 2: S-VOL continues to process host I/O
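
The four failure scenarios on slides 17 to 20 are easier to compare side by side. The lookup table below simply transcribes what each slide states; the dictionary keys, the `Outcome` type and the `describe` helper are hypothetical names, and the site-failure entry leaves the serving volume empty because the slide does not say which site is lost.

```python
from typing import NamedTuple, Optional


class Outcome(NamedTuple):
    serves_host_io: Optional[str]  # volume that keeps serving I/O, if the slide says
    note: str


# Transcribed from slides 17-20 of the deck; nothing here is measured or tested.
OUTCOMES = {
    "quorum_failure": Outcome("P-VOL", "remote mirroring between P-VOL and S-VOL stops"),
    "replication_link_failure": Outcome("P-VOL", "P-VOL continues to process host I/O"),
    "site_failure": Outcome(
        None, "VMware HA fails over VMs, HAM fails over storage, HDLM fails over paths"
    ),
    "wan_link_failure": Outcome(
        "S-VOL", "Site 1 loses access to both volumes; marked 'under review' in the deck"
    ),
}


def describe(scenario: str) -> str:
    """Return a one-line summary for a scenario key in OUTCOMES."""
    outcome = OUTCOMES[scenario]
    volume = outcome.serves_host_io or "not stated"
    return f"{scenario}: host I/O served by {volume}; {outcome.note}"


if __name__ == "__main__":
    for name in OUTCOMES:
        print(describe(name))
```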
  21. BEST PRACTICE DESIGN RECOMMENDATIONS
      • Performance bottleneck dependent on WAN latency and bandwidth
        ‒ Optionally use VMware HA with N+1 settings with combination of DRS affinity rules to keep VMs on same site where the active volume resides
      • Quorum disk should be located at 3rd site to ensure quorum access is not affected from any 1 site during sitewide failure.
      • vCenter should be located at 3rd site to ensure virtual infrastructure management is not affected from any 1 site during sitewide failure.
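
The affinity recommendation amounts to a placement check: every VM should run at the site that holds the active volume of its datastore, so that its I/O does not cross the WAN. The sketch below runs that check over plain dictionaries. It does not call the vSphere API or create DRS rules, and the inventory layout, function name and sample VM, datastore and site names are all assumptions.

```python
def misplaced_vms(
    vm_site: dict[str, str],
    vm_datastore: dict[str, str],
    active_site: dict[str, str],
) -> list[str]:
    """Return VMs running at a different site from their datastore's active volume."""
    return sorted(
        vm
        for vm, site in vm_site.items()
        if active_site[vm_datastore[vm]] != site
    )


if __name__ == "__main__":
    # Hypothetical inventory: two VMs share a datastore whose P-VOL is at site1.
    vm_site = {"vm1": "site1", "vm2": "site2"}
    vm_datastore = {"vm1": "ds1", "vm2": "ds1"}
    active_site = {"ds1": "site1"}
    print(misplaced_vms(vm_site, vm_datastore, active_site))
```

A VM reported by this check would be a candidate for an affinity rule tying it to hosts at the active volume's site.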
  22. BEST PRACTICE DESIGN RECOMMENDATIONS
      • Perform storage failback during scheduled downtime
        ‒ Perform a clean and controlled storage failback by migrating high-uptime virtual machines to a single host via VMware vMotion and then performing storage failback
      • Avoid single points of failure by architecting with redundancy in mind
  23. QUESTIONS
  24. UPCOMING WEBTECHS
      November
      • Comprehensive and Simplified Management for VMware vSphere environments, November 14, 11 a.m. PT, 2 p.m. ET
      • Microsoft SQL Server 2012 Data Warehouse solutions on Hitachi converged platform, November 27, 9 a.m. PT, 12 p.m. ET
      Check www.hds.com/webtech for
      • Links to the recording, the presentation and Q&A (available next week)
      • Schedule and registration for upcoming WebTech sessions
  25. THANK YOU
      MICHAEL NAKAMURA
      HENRY CHU
      michael.nakamura@hds.com, henry.chu@hds.com