VMworld 2013: Operating and Architecting a vSphere Metro Storage Cluster based infrastructure
Lee Dilworth, VMware
Duncan Epping, VMware
  1. Operating and Architecting a vSphere Metro Storage Cluster based infrastructure. Lee Dilworth, VMware; Duncan Epping, VMware. BCO4872 #BCO4872
  2. Interact!  If you use Twitter, feel free to tweet about this session and use hashtag #BCO4872  Feel free to take pictures, shoot video, and share it on Twitter / Facebook  Blog about it • We would love to read your thoughts, opinions, and design decisions!
  3. Agenda for Today  Availability Basics  vSphere Metro Storage Cluster Basics  Architecting and Operating  Failure Scenarios  Wrapping Up
  4. Availability Basics
  5. Disaster Avoidance  Avoidance NOT Recovery • Two sites, one vSphere Cluster • One vCenter manages BOTH sites • One site effectively put into maintenance mode • Hot VM mobility solution: intra-cluster vMotion
  6. Disaster Recovery  Recovery NOT Avoidance • Two sites, typically two vSphere Clusters • Each site usually managed by its own vCenter • vMSC solutions CAN support disaster recovery via HA restarts • Cold VM mobility solutions (SRM or vMSC “Federated HA”) (Diagram: replication between the two sites.)
  7. vSphere High Availability – Setting the Baseline  vSphere HA minimizes unplanned downtime  Provides automatic VM recovery in minutes  Protects against various types of failures • Host failure • Host network isolation • Permanent loss of a datastore • VM crashes (including VMX) • Guest OS / application crashes / hangs  Does not require complex configuration changes  Is operating system and application independent
  8. vSphere 5.0+ Architecture  HA Agent • Called the Fault Domain Manager (FDM) • Provides all the HA on-host functionality  Operation • vCenter Server manages the cluster • Failover is not dependent on vCenter  Communicates over • Management network • Datastores
  9. Master and Slave Roles  Any host can be master, selected by election • All others assume the role of slaves  The Master • Monitors hosts and VMs • Manages VM restarts after failures • Reports cluster state to vCenter Server  The Slaves • Forward critical state changes to the Master • Restart VMs when directed by the Master • Elect a new Master
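A conceptual sketch (plain Python, not VMware code) of the election preference usually described for FDM: the host with access to the most datastores is preferred as master, with ties broken by the lexically highest host identifier. The host names and datastore counts below are invented for illustration.

```python
def elect_master(datastore_counts):
    """Pick the HA master: the host seeing the most datastores wins;
    ties break on the lexically highest host identifier."""
    return max(datastore_counts, key=lambda h: (datastore_counts[h], h))

# Invented example cluster: host IDs mapped to reachable-datastore counts.
cluster = {"host-10": 6, "host-22": 6, "host-31": 4}
master = elect_master(cluster)
slaves = sorted(h for h in cluster if h != master)
```

Here "host-22" wins the tie with "host-10" because it sorts lexically higher; all remaining hosts become slaves.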
  10. Network Used for Communication  Network is the default communication method • Used for selecting a Master • Used for heartbeating • Used for reporting state to vCenter Server  Network Heartbeating • Used by the Master to monitor the state of a Slave • When the Master receives no heartbeats it will ping the Slave • When a Slave receives no heartbeats from the Master it will ping the isolation address
  11. Datastores Used for Communication  Datastores are used when the management network is not available • Used to determine host state (isolated vs failed) • Only when a failure has occurred! • vCenter selects two for each host  Files used on the datastores • host-<id>-hb • Heartbeat file! • host-<id>-poweron • Contains the power state of VMs and is used to communicate isolation • First line is either “0” or “1”, where “1” means isolated • protectedlist • Owned by the Master; its view of the world
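A simplified sketch of reading the isolation flag from a host-<id>-poweron file, matching only what the slide describes (first line "1" means isolated, the rest lists powered-on VMs); the real file format has additional fields, and the file contents below are an invented example.

```python
def parse_poweron_file(contents):
    """Parse a host-<id>-poweron file per the slide's description:
    first line "1" means the host considers itself isolated,
    remaining lines list the powered-on VMs (simplified)."""
    lines = contents.splitlines()
    isolated = lines[0].strip() == "1"
    powered_on = [l.strip() for l in lines[1:] if l.strip()]
    return isolated, powered_on

# Invented file contents for illustration.
isolated, vms = parse_poweron_file("1\nvm-101\nvm-102\n")
```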
  12. vSphere Metro Storage Cluster: the Basics (well, sort of)
  13. What is a vSphere Metro Storage Cluster?  Stretched cluster solution, not a feature!  Requires: • a storage system that “stretches” across sites • a stretched network across sites  Hardware Compatibility List (HCL) – Certified vMSC • “iSCSI Metro Cluster Storage” • “FC Metro Cluster Storage” • “NFS Metro Cluster Storage”
  14. vSphere Metro Storage Cluster – Growing Ecosystem
  15. vMSC Certified Storage – Typical vSphere vMSC Setup (Diagram: vCenter, stretched network, vSphere HA Cluster, network and storage layers.)
  16. Latency Support Requirements  ESXi management network max supported latency is 10 milliseconds Round Trip Time (RTT) • Note: 10 ms is supported with Enterprise Plus licenses only (Metro vMotion); the default limit is 5 ms  Max supported latency for the synchronous storage replication link is 5 milliseconds RTT • Note: some storage vendors have different support requirements!
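The two limits on this slide can be expressed as simple validation checks; this is a sketch of the stated rules only (10 ms vMotion RTT with Enterprise Plus / Metro vMotion, otherwise 5 ms; 5 ms replication RTT unless the vendor says otherwise), with the function names invented here.

```python
def vmotion_rtt_supported(rtt_ms, enterprise_plus):
    """10 ms RTT requires Enterprise Plus (Metro vMotion); default is 5 ms."""
    return rtt_ms <= (10.0 if enterprise_plus else 5.0)

def replication_rtt_supported(rtt_ms, vendor_max_ms=5.0):
    """Synchronous replication default limit is 5 ms RTT; some vendors differ."""
    return rtt_ms <= vendor_max_ms
```

For example, an 8 ms inter-site RTT is within support only with Enterprise Plus licensing.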
  17. When to Use Stretched vSphere Clusters?  Campus / nearby sites • Sites within synchronous replication distance • Two buildings on a common campus • Two datacenters within a city  Planned migration important • Long-distance vMotion for planned maintenance, disaster avoidance, or load balancing  DR features less critical • No testing, orchestration, or automation • VMware HA typically not sufficient for automation – requires scripting / manual process due to VM placement with primary / secondary arrays • RTOs typically longer
  18. Two Architectures: Uniform Host Access Configuration (1/2) (Diagram: one stretched cluster across Site A and Site B; Storage A presents the LUN read/write, Storage B read-only; both sites connected over FC / IP fabrics.)
  19. Two Architectures: Non-Uniform Host Access Configuration (2/2) (Diagram: one stretched cluster across Site A and Site B; Storage A and Storage B each present the LUN read/write over a distributed FC / IP fabric.)
  20. Defining Some Failure Terminology  All Paths Down (APD) – Aaahhhh, where has that device gone? • Incorrect storage removal, i.e. yanked! • Sudden storage failure • No time for the storage to tell us anything  Permanent Device Loss (PDL) – Aaahhhh, the device has gone; OK, I understand • Much nicer than APD: graceful handling of the state change • Storage notifies of the device state change via a SCSI sense code • Allows HA to fail over VMs  Split Brain – Hmmm, the other half has disappeared; now what? • Election of a second HA master • Check of the heartbeat datastore region • Restart of VMs (if needed)
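The APD/PDL distinction above can be sketched as a classifier: with PDL the array returns a SCSI sense code, with APD nothing arrives at all. The sense code used below (key 5h, ASC 25h, ASCQ 0h, "LOGICAL UNIT NOT SUPPORTED") is the commonly documented PDL example; treat it as an assumption here rather than an exhaustive list.

```python
# Assumed canonical PDL sense code: (sense key, ASC, ASCQ).
PDL_SENSE = (0x5, 0x25, 0x0)

def classify_device_loss(sense_code):
    """sense_code: (key, asc, ascq) tuple returned by the array,
    or None if the device simply stopped responding."""
    if sense_code is None:
        return "APD"      # no time for the storage to tell us anything
    if sense_code == PDL_SENSE:
        return "PDL"      # graceful: the device is gone, and we know it
    return "other"
```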
  21. Architecting and Operating vSphere Metro Storage Cluster
  22. Will Use Our Environment to Illustrate…  Two sites  Four hosts in total  Stretched network  Stretched storage  One vCenter Server  One vSphere HA Cluster (Diagram: Site A and Site B; Storage A and Storage B each with a LUN (R/W); distributed FC / IP fabric; stretched management network.)
  23. HA & DRS – Site Awareness  What they think…  What you’ve actually got… (Diagram: HA and DRS see one flat cluster and network; the two sites are invisible to them.)
  24. Why Should I Care About Site Awareness?  Operational Simplicity • Group dependent workloads • Increase HA predictability • Reduce the impact of a full cluster partition • Orchestrate allocation of workloads to “sites” • Even distribution & consumption of cluster resources  Alignment with Storage • Locate VMs above the read/write device • Remove unnecessary east/west IO traffic • For access-anywhere devices, align with the partition winner per device
  25. DRS Design Considerations – Affinity Rules (1/2)  DRS Host Group per site  DRS VM Group per site  Align dependent VM workloads
  26. DRS Design Considerations – Affinity Rules (2/2)  Use the “should” rules • HA does not violate “must” rules; therefore avoid “must” for these configurations
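The site-affinity bookkeeping from the two slides above can be sketched as building one VM group and one host group per site, to be joined by "should run on" rules. All inventory names (VM and host identifiers, site labels) below are invented.

```python
def build_site_affinity_rules(vm_site, host_site):
    """vm_site: vm -> site; host_site: host -> site.
    Returns site -> (vm_group, host_group) pairs for 'should run on' rules."""
    rules = {}
    for site in sorted(set(host_site.values())):
        rules[site] = (
            sorted(v for v, s in vm_site.items() if s == site),
            sorted(h for h, s in host_site.items() if s == site),
        )
    return rules

# Invented inventory: two sites, with VMs aligned to their storage site.
rules = build_site_affinity_rules(
    {"vm1": "A", "vm2": "A", "vm3": "B", "vm4": "B"},
    {"esx01": "A", "esx02": "A", "esx03": "B", "esx04": "B"},
)
```

Validating that this mapping still matches the storage layout is exactly the periodic check the "Maintaining the Configuration" slide later calls for.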
  27. Storage DRS Design Considerations  Cluster datastores based on “site affinity”  Avoid unnecessary site-to-site migrations  Set Storage DRS to “Manual” and take control; a migration *could* impact availability  Align VMs with the storage / site boundary  Group *similar* devices!
  28. Network Design Considerations  Network teams usually don’t like the words “Stretch” and “Cluster”  Site-to-site vMotion – handle carefully  Ingress point to the network? Load balanced / redundant?  Consider application users – site affinity affects data flow too!  Network options are changing (OTV, EoMPLS)  L3 routing impacts (and options, LISP?)  Co-locate multi-VM applications  Consider east-west traffic
  29. HA Design Considerations – Admission Control  What about Admission Control? • We typically recommend setting it to 50%, to allow full site fail-over • Admission control is not a resource management tool • Only guarantees power-on
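The 50% recommendation follows from reserving the capacity of the largest failure domain, so that one whole site can fail and every protected VM can still be powered on. A sketch of that arithmetic (function name invented):

```python
def failover_capacity_pct(hosts_per_site):
    """Percentage of cluster resources to reserve so the largest site
    can fail entirely. With two equal sites this yields the 50% above."""
    total = sum(hosts_per_site)
    return round(100 * max(hosts_per_site) / total)
```

With two sites of two hosts each this gives 50%; an asymmetric 3+2 layout would need 60% reserved to survive the loss of the larger site.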
  30. HA Design Considerations – Isolation Response  Isolation response • Configure it based on your infrastructure! • We cannot make this decision for you, however…
  31. HA Design Considerations – Isolation Addresses  Isolation addresses • Specify two, one at each site, using the advanced setting “das.isolationaddress” • Note that the “default gateway” is an isolation address already! (Diagram: isolation address 01 and isolation address 02, one per site.)
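The isolation test the slide relies on can be sketched as: a host declares itself isolated only when it receives no HA traffic and cannot reach any configured isolation address. The addresses and setting indices below are invented examples of the per-site pair the slide recommends.

```python
# Invented example: one isolation address per site, as the slide suggests.
ISOLATION_ADDRESSES = {"das.isolationaddress0": "10.1.0.1",   # Site A (example)
                       "das.isolationaddress1": "10.2.0.1"}   # Site B (example)

def declares_isolation(sees_ha_traffic, ping_ok):
    """A host declares isolation only if it sees no HA heartbeats AND
    cannot reach any isolation address (ping_ok: address -> bool)."""
    return not sees_ha_traffic and not any(ping_ok.values())
```

With one address per site, a host that merely lost the inter-site link still reaches its local address and does not trigger the isolation response.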
  32. HA Design Considerations – Heartbeat Datastores  Each site needs a heartbeat datastore defined to ensure each site can update the heartbeat region on storage local to that site  With multiple storage systems, consider increasing the default from 2 to 4 (2 per site)
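A sketch of the selection policy this slide implies: pick heartbeat datastores evenly per site (two per site, four total) so each site always has a local heartbeat region. Datastore names and site tags are invented.

```python
def pick_heartbeat_datastores(datastore_site, per_site=2):
    """datastore_site: datastore -> site. Pick per_site heartbeat
    datastores from each site (2 per site => 4 total for two sites)."""
    chosen = []
    for site in sorted(set(datastore_site.values())):
        local = sorted(d for d, s in datastore_site.items() if s == site)
        chosen.extend(local[:per_site])
    return chosen
```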
  33. HA Design Considerations – Restart Order  You can use “restart priority” to determine the restart order  This applies even when there is no contention  It is only about the order in which restarts occur, not about when a VM finishes booting
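The distinction on this slide is worth making concrete: restart priority orders when restarts are *initiated*; HA does not wait for one tier to finish booting before starting the next. A sketch of that ordering (names and tiers invented):

```python
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}

def restart_initiation_order(vms):
    """vms: list of (name, priority). Returns names in the order HA
    would *initiate* restarts; boot completion is not awaited."""
    return [name for name, prio in sorted(vms, key=lambda v: PRIORITY_RANK[v[1]])]
```

So a database VM marked "high" is merely started first; the "medium" app tier may begin restarting before the database has finished booting, which is why in-guest dependencies still need their own handling.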
  34. Operations – Maintaining the Configuration  Storage Device <-> DRS Affinity Group mappings  Validate DRS affinity regularly  Are there VM dependencies? Co-locate!  Remember HA doesn’t speak vApp (won’t respect restart order)  …automate if you can!  Some vendors offer tools
  35. Failure Scenarios
  36. Face Your Fears!  Understand the possibilities  Test them  Test them again and keep going until they feel normal! (Diagram: VM mobility across a site partition.)
  37. Scenario – Single Host Failure (Non-Uniform)  A normal HA event  No network or datastore heartbeats  Host will be declared dead  All VMs will be restarted  Could violate affinity rules (Diagram: Site A and Site B, Storage A and Storage B each with a LUN (R/W), distributed FC / IP fabric; one host marked failed.)
  38. Scenario – Full Compute Failure in One Site (Non-Uniform)  A normal HA event  No datastore or network heartbeats  All virtual machines will be restarted  Note: max 32 concurrent restarts per host  “Sequencing” the start-up order!  Will violate affinity rules! (should rule) (Diagram: all hosts in one site marked failed.)
  39. Scenario – Storage Partition (Uniform)  Virtual machines remained running with no impact!  Will virtual machines be restarted on the other site? • No  Network heartbeats! (Diagram: inter-site storage link severed; Storage A LUN (R/W), Storage B LUN (R/O).)
  40. Scenario – Storage Partition (Non-Uniform)  Virtual machines remained running with no impact!  Will virtual machines be restarted on the other site? • Yes  PDL sense code issued • VM will be killed • HA will detect and restart! (Diagram: inter-site storage link severed; the non-preferred site sees a PDL on its LUN.)
  41. Permanent Device Loss (PDL) Requirements (1/2)  Ensure PDL enhancements are configured • Cluster Advanced Option • Set “das.maskCleanShutdownEnabled” to “true” in the advanced settings • Set to “false” by default in 5.0, change it! • Set to “true” by default in 5.1 and up
  42. Permanent Device Loss (PDL) Requirements (2/2)  Ensure PDL enhancements are configured • ESXi host-level changes • 5.1 and earlier: set “disk.terminateVMOnPDLDefault” to “true” in “/etc/vmware/settings” • 5.5 and up: set the advanced setting “VMkernel.Boot.terminateVMOnPDL”
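For 5.1 and earlier, the host-level change above amounts to one line in /etc/vmware/settings; a sketch of what that entry looks like (confirm the exact option name and casing against the VMware documentation for your release):

```
disk.terminateVMOnPDLDefault = "TRUE"
```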
  43. Scenario – Datacenter Partition (Uniform) (1/3)  Virtual machines remained running with no impact!  Remember the affinity rules  Without affinity rules this would result in an APD condition… (Diagram: both inter-site links severed; Storage A LUN (R/W), Storage B LUN (R/O).)
  44. Scenario – Datacenter Partition (Uniform) (2/3)  An affinity rule was violated  The same VM was restarted in Site A  Results in APD for Site B  Same VM  Same IP address  Same name  Yes, this could result in weird behavior!
  45. Scenario – Datacenter Partition (Uniform) (3/3) • VM restarted in the site with “storage site-affinity” • Now you have two active instances of the same VM! • When the partition is lifted, the VM will be killed!
  46. Scenario – Loss of a Full Datacenter (Non-Uniform)  All virtual machines will be restarted  Note: in many cases this requires manual intervention from a storage perspective!  HA will retry 5 times and has a compatibility list  Run DRS when the site returns, to apply affinity rules and balance load!
  47. Wrapping Up
  48. Key Takeaways  Design a cluster that meets your needs; don’t forget operations!  Understand that HA / DRS play a key part in your vMSC success  Testing is critical; don’t just test the easy stuff!  Document process changes and gain operational acceptance  Do not assume it is “Next > Next > Finish”  Ongoing maintenance / checks will be required  Automate as much as you can!
  49. Questions?
  50. Other VMware Activities Related to This Session  Group Discussions: BCO1001-GD, “Stretched Clusters for Availability” with Lee Dilworth
  51. THANK YOU
  52. Operating and Architecting a vSphere Metro Storage Cluster based infrastructure. Lee Dilworth, VMware; Duncan Epping, VMware. BCO4872 #BCO4872