STO7534 VSAN Day 2 Operations (VMworld 2016)


VMworld 2016 - Virtual SAN (VSAN) Day 2 Operations - STO7534

Published in: Technology

  1. Virtual SAN - Day 2 Operations
     Cormac Hogan, VMware, Inc
     Paudie O'Riordan, VMware, Inc
     STO7534 #STO7534
  2. Disclaimer
     • This presentation may contain product features that are currently under development.
     • This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
     • Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
     • Technical feasibility and market demand will affect final delivery.
     • Pricing and packaging for any new technologies or features discussed or presented have not been determined.
     CONFIDENTIAL
  3. This Session…
     • Virtual SAN has been available since March 2014, almost 2.5 years.
     • To date, we have over 5,000 VSAN customers.
     • VMware recognises that dealing with Virtual SAN operations on a day-to-day basis requires more than 2 clicks.
     • Since the launch of Virtual SAN, additional tools for managing, monitoring and troubleshooting Virtual SAN have become available.
     • This session discusses approaches to common problems that Virtual SAN administrators actually face.
     • We will discuss how these tools and approaches can help you manage your data now that the VMware consultant has left the building.
  4. Agenda
     1. Introduction to Session
     2. Monitor – Getting The Basics Right
     3. Alerting – What Are My Options?
     4. Virtual SAN Upgrade
     5. Bring it all together – Handling a Failure (Demo)
  5. Monitoring – Get the Basics Right
     • vSphere Logging
     • Virtual SAN Trace Files
     • ESXi Core Files
  6. Persistent Logging Challenges with ESXi Boot Devices
     • vSphere hosts can be deployed on several types of boot media, each with drawbacks and advantages
       – SCSI, SSD, USB, SATADOM
     • If you are already in production, consider how logging gets laid out
       – SCSI / SAS / SATA / SSD: VMFS automatically added; scratch located on VMFS
       – SATADOM: VMFS automatically added; scratch located on VMFS
     • USB / SD (any capacity)
       – No VMFS
       – No persistent scratch area
       – 512 MB RAMDISK instead
     • Partition layout (diagram): /bootbank, /altbootbank, system, vmkDiagnostic, /store, VMFS/scratch (RAMDISK)
     • VMware strongly recommends setting up syslog in all cases
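The boot-media rules on this slide can be sketched as a small lookup. This is only an illustration of the slide's content: the table name, function names, and advice strings are hypothetical and do not come from any VMware tool.

```python
# Hypothetical summary of the slide: does this boot medium get a
# VMFS-backed (persistent) scratch area?
PERSISTENT_SCRATCH = {
    "scsi": True,
    "sas": True,
    "sata": True,
    "ssd": True,
    "satadom": True,
    "usb": False,  # no VMFS, scratch lives on a 512 MB RAMDISK
    "sd": False,   # same as USB, regardless of capacity
}


def scratch_is_persistent(boot_media: str) -> bool:
    """Return True when scratch is located on an automatically added VMFS."""
    key = boot_media.strip().lower()
    if key not in PERSISTENT_SCRATCH:
        raise ValueError(f"unknown boot media: {boot_media!r}")
    return PERSISTENT_SCRATCH[key]


def logging_advice(boot_media: str) -> str:
    """Summarise the slide's guidance; syslog is recommended in all cases."""
    if scratch_is_persistent(boot_media):
        return "scratch on VMFS; configure remote syslog as well"
    return "scratch on 512 MB RAMDISK (not persistent); configure remote syslog"
```

For example, `logging_advice("USB")` reports the RAMDISK case, which is where a remote syslog target matters most.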
  7. Virtual SAN Trace Files
     • Provide extremely low-level logging for VSAN
       – VSAN traces require ~500 MB of disk space
       – Majority of traces are in binary format
     • Persisted to VMFS or NFS if available
       – VSAN datastore does not support log redirection at this time
     • Stored on RAMDISK if no persistent storage is available
     • In case of reboot, the most recent/important VSAN traces are persisted to the "store" partition
     • In case of crash, VSAN traces are persisted to the diagnostic partition
     • Since Virtual SAN 6.2, "urgent trace files" can be redirected to a syslog target
     • Partition layout (diagram): /bootbank, /altbootbank, system, vmkDiagnostic, /store, VMFS/scratch (RAMDISK)
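As a rough sketch of where the traces end up under the rules listed on this slide, the decision can be written as one function. The event names ("normal", "reboot", "crash") and the returned strings are assumptions made for illustration only.

```python
def trace_location(event: str, has_vmfs_or_nfs: bool) -> str:
    """Return where VSAN traces are kept for a given event, per the slide.

    event is one of "normal", "reboot" or "crash" (hypothetical labels).
    """
    if event == "reboot":
        # Only the most recent/important traces survive a reboot.
        return "store partition"
    if event == "crash":
        return "vmkDiagnostic partition"
    if event == "normal":
        return "VMFS or NFS" if has_vmfs_or_nfs else "RAMDISK"
    raise ValueError(f"unknown event: {event!r}")
```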
  8. ESXi Core Dump Partition
     • Special partition used in case of a diagnostic crash
       – 2.2 GB of space set aside for the memory dump
     • Ensures the full memory dump gets written to persistent media
     • ESXi hosts with 512 GB of physical memory or less
       – The memory dump can safely fit on USB/SD
     • ESXi hosts with more than 512 GB of physical memory
       – Use SAS/SATA or SATADOM
     • vSphere ESXi Network Dump Collector
       – Use if no suitable persistent media is available
     • Partition layout (diagram): /bootbank, /altbootbank, system, vmkDiagnostic, /store, /scratch (RAMDISK)
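The sizing guidance above reduces to a threshold check. The sketch below assumes the 512 GB figure from the slide; the function name, media labels, and return strings are hypothetical.

```python
USB_SD_MEMORY_LIMIT_GB = 512  # threshold quoted on the slide


def core_dump_target(physical_memory_gb: int, boot_media: str) -> str:
    """Suggest where a full memory dump should go, following the slide."""
    media = boot_media.strip().lower()
    if media in ("sas", "sata", "satadom"):
        return f"vmkDiagnostic partition on {media}"
    if media in ("usb", "sd"):
        if physical_memory_gb <= USB_SD_MEMORY_LIMIT_GB:
            return f"vmkDiagnostic partition on {media}"
        # Too much memory for USB/SD and no other local media was named.
        return "network dump collector (or move to SAS/SATA/SATADOM)"
    # Anything else is treated as "no suitable persistent media".
    return "network dump collector"
```

A host with 768 GB of memory booting from SD therefore lands on the Network Dump Collector unless suitable local media is added.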
  9. Alerting – What Are My Options?
     • vSphere Built-In
     • vRealize Operations
     • vRealize Log Insight
  10. vSphere Built-in
      • vSphere native alerting
        – 70+ Virtual SAN Health alarms
        – Many more vSphere alarms
        – Alert via SNMP / SMTP
      • Create custom alarms
        – Use VMware ESXi VOBs or Observation IDs for VSAN
      • Virtual SAN Management API 6.2 interface for a bespoke solution
  11. vRealize Operations + Log Insight
      • Virtual SAN awareness with the Storage Management Pack
        – Virtual SAN dashboards and heat maps
        – Host and device statistics
        – Health alerts
      • Log Insight also has Virtual SAN awareness
        – Virtual SAN content pack
        – Log aggregation from Virtual SAN nodes
        – Integration with vROps alerting
  12. Virtual SAN Upgrade
      • Prerequisites
      • Workflow
      • Monitoring
      • Gotchas
  13. Upgrade Overview
      • Virtual SAN 6.2 has a new on-disk format for disk groups and exposes new data services
      • Upgrades are performed in multiple phases
        – Phase 1: Upgrade to vSphere 6.0 U2
        – Phase 2: Object and disk format conversion (DFC)
      • But before you begin, Phase 0: validate your current environment
      • (Diagram: Virtual SAN 6.2 cluster in manual mode; Phase 1, then Phase 2 driven from the rvc prompt)
  14. Phase 0 – Please Read Before You Start
      • Virtual SAN 6.2 Release Notes
      • VMware Product Interoperability
      • VMware Virtual SAN Hardware
        – Server, controller, SSD, disk on HCL
        – Controller firmware, disk firmware, controller driver, enclosure firmware
  15. Phase 1 - Upgrading from Virtual SAN 5.5
      • You can upgrade from VSAN 5.5 to VSAN 6.X
      • However… patching is critical
      • During upgrade, some older releases of vSphere 5.5 may cause Virtual SAN data unavailability and instability
      • Make sure all critical patches are installed prior to upgrade
      • Not an issue between VSAN 6.0 and VSAN 6.X
      • More details: please read VMware KB 2113024 and VMware KB 2139969
  16. Phase 1 – VSAN Disk Format Conversion Table

      | Virtual SAN starting version | Virtual SAN target version | Post-upgrade on-disk format upgrade required? | On-disk format version |
      | Virtual SAN 5.5 U1           | Virtual SAN 5.5 Update X   | No                                            | -                      |
      | Virtual SAN 5.5 Update X     | Virtual SAN 6.X            | Yes                                           | 1.0 to 2.0 / 3.0       |
      | Virtual SAN 6.0              | Virtual SAN 6.1            | No                                            | -                      |
      | Virtual SAN 6.0 or 6.1       | Virtual SAN 6.2            | Yes                                           | 2.0 to 3.0             |
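The conversion table can also be read as a lookup keyed on the starting and target release. A minimal sketch, where the dictionary and function names are hypothetical and only the four rows from the slide are encoded:

```python
from typing import NamedTuple, Optional


class FormatChange(NamedTuple):
    required: bool           # is a post-upgrade on-disk format upgrade needed?
    versions: Optional[str]  # on-disk format transition, if any


# Rows copied from the slide; the "6.0 or 6.1" row is expanded into two keys.
CONVERSION_TABLE = {
    ("5.5 U1", "5.5 Update X"): FormatChange(False, None),
    ("5.5 Update X", "6.X"): FormatChange(True, "1.0 to 2.0 / 3.0"),
    ("6.0", "6.1"): FormatChange(False, None),
    ("6.0", "6.2"): FormatChange(True, "2.0 to 3.0"),
    ("6.1", "6.2"): FormatChange(True, "2.0 to 3.0"),
}


def on_disk_format_change(start: str, target: str) -> FormatChange:
    """Look up an upgrade path; raises KeyError for paths not on the slide."""
    return CONVERSION_TABLE[(start, target)]
```

Paths that the slide does not list raise `KeyError` rather than guessing an answer.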
  17. Phase 1 – vSphere Software Upgrade
      • Step 1 – Upgrade vCenter Server to 6.0 U2
      • Step 2 – Upgrade ESXi hosts to 6.0 U2
      • Maintenance mode?
        – Ensure accessibility: fast, but with risk
        – Full data migration: slower, but no risk
  18. Phase 1 – vSphere Software Health Check GOTCHA
      • vCenter 6.0 Update 2 installed
        – Health check will not work when the ESXi version is earlier than 6.0 U2
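One way to spot this gotcha ahead of time is to compare each host's release against 6.0 U2. The sketch below assumes version strings written as "major.minor" with an optional "U<n>" suffix (for example "6.0 U2"); that format and the function names are assumptions for illustration, not something a VMware tool emits.

```python
from typing import Dict, List, Tuple

HEALTH_CHECK_MINIMUM = (6, 0, 2)  # ESXi 6.0 U2, per the slide


def parse_version(text: str) -> Tuple[int, int, int]:
    """Turn '6.0 U2' into (6, 0, 2) and '5.5' into (5, 5, 0)."""
    parts = text.upper().split()
    major, minor = (int(p) for p in parts[0].split("."))
    update = int(parts[1].lstrip("U")) if len(parts) > 1 else 0
    return (major, minor, update)


def hosts_blocking_health_check(host_versions: Dict[str, str]) -> List[str]:
    """Return the names of hosts still below ESXi 6.0 U2, sorted by name."""
    return sorted(
        name
        for name, version in host_versions.items()
        if parse_version(version) < HEALTH_CHECK_MINIMUM
    )
```

An empty result means every host is at 6.0 U2 or later, so the health check should be usable again.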
  19. Phase 1 – vSphere Software Health Check
      • Software upgraded?
        – Check your Virtual SAN Health
        – Update your HCL database files
        – Make sure it's all green
      • Address any failed tests BEFORE proceeding to the on-disk format upgrade!
  20. Phase 2 – Disk Upgrade Prechecks…
      – All hosts in the cluster are connected to vCenter Server
      – All hosts upgraded to ESXi 6.0 U2
      – No network partitions in the VSAN cluster
      – No hosts with auto-claim storage
      – No hosts in Maintenance Mode
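The prechecks lend themselves to a simple checklist function. This is a minimal sketch over a hand-built inventory: the `HostState` fields and messages are hypothetical, and `required_version` defaults to 6.0 U2, the release that Phase 1 upgrades hosts to.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostState:
    name: str
    connected: bool            # connected to vCenter Server
    esxi_version: str          # e.g. "6.0 U2"
    auto_claim: bool           # host auto-claims storage
    in_maintenance_mode: bool


def disk_upgrade_prechecks(
    hosts: List[HostState],
    network_partitioned: bool,
    required_version: str = "6.0 U2",
) -> List[str]:
    """Return a list of problems; an empty list means the prechecks pass."""
    problems = []
    if network_partitioned:
        problems.append("cluster: network partition detected")
    for host in hosts:
        if not host.connected:
            problems.append(f"{host.name}: not connected to vCenter Server")
        if host.esxi_version != required_version:
            problems.append(f"{host.name}: ESXi {host.esxi_version}, expected {required_version}")
        if host.auto_claim:
            problems.append(f"{host.name}: auto-claim storage is enabled")
        if host.in_maintenance_mode:
            problems.append(f"{host.name}: in Maintenance Mode")
    return problems
```

Returning every problem at once, rather than stopping at the first, makes it easier to fix the cluster in one pass before starting the conversion.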
  21. Phase 2 – Are You Sure?
  22. Phase 2 – Virtual SAN Object and Disk Format Conversion
      • Two conversion steps
        – Objects
        – On-disk format
      • (Diagram: Version <= 2 → object conversion step → Version 2.5 → disk format conversion step → Version 3)
  23. Phase 2 – Upgrade Process
      • 1 MB alignment of existing objects from releases earlier than Virtual SAN 6.0
      • Realigns vsanSparse objects onto 4K boundaries for Virtual SAN 6.2 data services
      • During the Virtual SAN on-disk format phase, a disk group evacuation is performed
        – Data is evacuated
        – The disk group is removed
        – The disk group is re-added
        – Rinse and repeat
      • (Diagram: each disk group is evacuated in turn and comes back as Version 3)
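The evacuate, remove, re-add loop can be pictured with a toy simulation. This only models the ordering of steps described on the slide; it does not talk to a cluster, and the function name and data shapes are hypothetical.

```python
from typing import Dict, List, Tuple


def rolling_format_upgrade(
    disk_groups: Dict[str, int], target_version: int = 3
) -> Tuple[Dict[str, int], List[str]]:
    """Simulate the one-disk-group-at-a-time upgrade.

    disk_groups maps a disk group name to its current on-disk format version.
    Returns the resulting versions and a log of the steps taken.
    """
    state = dict(disk_groups)  # work on a copy, leave the input untouched
    log = []
    for name, version in disk_groups.items():
        if version >= target_version:
            continue  # already on the target format, nothing to do
        log.append(f"evacuate {name}")
        log.append(f"remove {name}")
        state[name] = target_version
        log.append(f"re-add {name} at version {target_version}")
    return state, log
```

Because only one disk group is out of service at a time, the rest of the cluster needs enough free space to absorb the evacuated data, which is the constraint behind the small-cluster gotcha on the following slide.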
  24. Phase 2 – Disk Format Upgrade Gotcha
      • For two-node or three-node clusters, the upgrade will fail
      • Virtual SAN allows upgrades to be performed in "reduced redundancy mode"
      • Caveats
        – You are now in "unprotected" mode
        – Any failures during the upgrade process may lead to data unavailability
        – Workaround available with Ruby vSphere Console (RVC)
      • vsan.ondisk_upgrade -h
          hosts_and_clusters: Path to all HostSystems of cluster or ClusterComputeResource
          -a, --allow-reduced-redundancy  Removes the need for one disk group worth of free space, by allowing reduced redundancy during disk upgrade
          -f, --force  Automatically answer all confirmation questions with 'proceed'
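A small helper can make the reduced-redundancy decision explicit before anyone types the RVC command. This sketch only assembles a command string from the flags shown in the help output above; the helper name, its parameters, and the choice to refuse small clusters without an explicit opt-in are assumptions, not RVC behaviour.

```python
def ondisk_upgrade_command(
    cluster_path: str,
    host_count: int,
    accept_reduced_redundancy: bool = False,
    force: bool = False,
) -> str:
    """Build the RVC command line for the on-disk format upgrade."""
    parts = ["vsan.ondisk_upgrade", cluster_path]
    if host_count <= 3:
        # Two-node and three-node clusters cannot evacuate a whole disk group.
        if not accept_reduced_redundancy:
            raise ValueError(
                "clusters with three or fewer hosts need reduced redundancy; "
                "data is unprotected while the upgrade runs"
            )
        parts.append("--allow-reduced-redundancy")
    if force:
        parts.append("--force")
    return " ".join(parts)
```

Requiring the opt-in keeps the "unprotected mode" caveat visible: the flag is never added silently.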
  25. Phase 2 – Disk Format Upgrade Gotchas
      • Mismatched disk group versions
      • After vSphere is upgraded to 6.0 U2, any new disk groups will be formatted with the latest version
      • This means there will be incompatible disk group versions if you have not yet upgraded the on-disk format
      • The workaround is to reset the disk group version of the new disk group to match what is in the cluster
      • More details: please read VMware KB 2146221
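Mismatched disk groups are easy to spot once the versions are collected in one place. The sketch below treats the most common on-disk format version as "what is in the cluster", which is an assumption for illustration; the function name and input shape are hypothetical as well.

```python
from collections import Counter
from typing import Dict


def mismatched_disk_groups(versions: Dict[str, float]) -> Dict[str, float]:
    """Return the disk groups whose version differs from the most common one.

    versions maps a disk group name to its on-disk format version.
    """
    if not versions:
        return {}
    cluster_version, _ = Counter(versions.values()).most_common(1)[0]
    return {
        name: version
        for name, version in versions.items()
        if version != cluster_version
    }
```

A disk group added after the 6.0 U2 software upgrade, but before the format upgrade, would show up here with a newer version than its neighbours.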
  26. Virtual SAN Stretched Cluster Upgrade Gotchas
      • Witness Appliance considerations
        – Stretched Cluster Witness Appliances must be treated like ESXi hosts
        – Avoid rip and replace of the Witness Appliance, as this will lead to on-disk format mismatches as discussed earlier
        – Health Check is unavailable until the Witness Appliance is upgraded
  27. Monitoring Disk Format Upgrades
      • Object conversion and disk format upgrades can be monitored using Ruby vSphere Console
          vsan.upgrade_status Datacenter/computers/VSAN-Cluster -r 60
      • Disk format upgrades can be monitored using RVC and/or vSphere Web Client
          vsan.resync_dashboard Datacenter/computers/VSAN-Cluster -r 60
  28. Demo time…
  29. Finally: Get Somebody “Competent” to Replace Your Hardware!
  30. Q & A