vSAN Resiliency and
Performance @ Scale
Sumit Lahiri Product Line Manager
Eric Knauft Staff Engineer
#vmworld #HCI2427BU
Agenda
2©2018 VMware, Inc.
​vSAN quick deep dive
​I/O flow
​Resynchronization
​Availability
​Performance Deep Dive : Eric
Disk layout in a single vSAN server
disk group, disk group, disk group, disk group, disk group
Disk groups contribute to single vSAN datastore in vSphere cluster
Cache
Capacity
vSAN Datastore
• Max 64 nodes
• Min 2 nodes (ROBO)
• Max 5 disk groups per host
• 2 tiers per disk group
vSAN very quick Overview
vSAN Datastore
• Pools local storage into a single resource pool
• Delivers enterprise-grade scale & performance
• Managed through policies
• Integrates compute & storage management into a single pane
vSAN Component Layout
VMDK (512GB)
HFT = 2, FTM = RAID-1, Stripe Width = 2
R1 (RAID-1)
  R0 (RAID-0): C1, C2 (components)
  R0 (RAID-0): C1, C2 (components)
  R0 (RAID-0): C1, C2 (components)
(SIZE = 256GB)
Note: No blocks are allocated at this time
Witness components not shown
Each replica on different Fault Domain (e.g. host)
VMDK (512GB)
HFT = 2, FTM = RAID-1, Stripe Width = 2
R1 (RAID-1)
  R0 (RAID-0): C1, C2 (components)
  R0 (RAID-0): C1, C2 (components)
  R0 (RAID-0): C1, C2 (components)
(SIZE = 256GB)
Witness components not shown
(BLOCKS:4MB)
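The layout arithmetic on this slide can be sketched in a few lines. This is only an illustration, assuming that a RAID-1 policy keeps one more replica than the number of failures it tolerates and that each RAID-0 stripe holds an equal slice of the VMDK; `MirrorPolicy` and `data_components` are hypothetical names, not vSAN APIs, and witness components are ignored just as on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MirrorPolicy:
    """Hypothetical holder for the policy values shown on the slide."""
    failures_to_tolerate: int  # "HFT" on the slide
    stripe_width: int


def data_components(vmdk_gb: int, policy: MirrorPolicy) -> list[tuple[str, int]]:
    """Return (name, size_gb) for every data component; witnesses are ignored."""
    # Assumption: RAID-1 keeps one extra replica per tolerated failure.
    replicas = policy.failures_to_tolerate + 1
    # Assumption: each RAID-0 stripe holds an equal slice of the VMDK.
    size_gb = vmdk_gb // policy.stripe_width
    return [
        (f"replica-{r}/C{s}", size_gb)
        for r in range(1, replicas + 1)
        for s in range(1, policy.stripe_width + 1)
    ]


layout = data_components(512, MirrorPolicy(failures_to_tolerate=2, stripe_width=2))
```

For the 512GB VMDK this gives six data components of 256GB each, which matches the three R0 branches with C1 and C2 shown in the diagram.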
CMMDS: Maintains inventory of all things vSAN
C: Cluster
M: Membership
M: Monitoring
D: Directory
S: Service
• Distributed Directory Service
• In-memory
• Persisted on disk
• Elects Object Owner
vSAN Objects and Placement
Storage Policies
RAID Configurations
Cluster Membership
Master receives updates from all other nodes
Master Node
Backup Node
Agent Node
vSAN Objects and Placement
Storage Policies
RAID Configurations
Cluster Membership
Receives updates from all hosts in the cluster
Other nodes subscribe for object-specific updates
CLOM: ensures each object has a configuration that matches the policy
CLOM: Cluster Level Object Manager
• One per node
• Finds a placement configuration that will meet the policy
• Needs to be aware of the placement of all objects on the node
• Communicates with the CMMDS service running on the same node
[Diagram: cluster nodes, one marked Master]
DOM: manages I/O flow from the VM
DOM: Distributed Object Manager
• One per object
• Implements the placement configuration prescribed by CLOM
• Ensures object consistency (creation, rebuild & reconfiguration)
• Implements distributed RAID logic
[Diagram: cluster nodes, one marked Master; one DOM owner per object]
Schematic Representation of single VMDK deployment
Steady state layout
[Diagram: components C1, C2, C1, C2 and witness W; Master; DOM: distributed object owner]
Schematic Representation of single VMDK deployment
Each partition elects its CMMDS master
[Diagram: components C1, C2, C1, C2 and witness W split across Partition-01 and Partition-02; Master; DOM: distributed object owner]
Each partition elects its Master
Object has quorum and availability
DOM owner created
Inaccessible state
Schematic Representation of single VMDK deployment
Each partition elects its CMMDS master when there is a network partition
[Diagram: components C1, C2, C1, C2 and witness W split across Partition-01 and Partition-02; Master; DOM: distributed object owner]
Partition meets the liveness criteria for object
VM HA to the partition that meets the liveness criteria
Each partition elects its Master
Object has Quorum
& Availability
Agenda
​vSAN quick deep dive
​I/O flow
​Resync
​Availability
All Flash I/O flow: architectural layout
H1 H2 H3
VMDK
Cache Tier
Capacity Tier
Replica-1 Replica-2
Capacity Tier
DOM: Distributed object owner
All Flash I/O flow: DOM and LSOM
H1 H2 H3
VMDK
Cache Tier
Capacity Tier
Replica-1 Replica-2
DOM: Distributed object owner
LSOM: Log structured object manager
All Flash I/O flow: I/O issued by VM
H1 H1
VMDK vSAN Object
1 VM issues write
DOM: one per object
VM DOM LSOM
All Flash I/O flow: DOM checks for free space
H1 H1
VMDK vSAN Object
1 VM issues write
2 Check for conflicting I/Os on the same I/O range and serialize the request
VM DOM LSOM
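The conflict check in step 2 can be illustrated with a small sketch. The slide only says that DOM checks for conflicting I/Os on the same range and serializes them, so the data structure here is an assumption: `Write` and `RangeSerializer` are hypothetical names, and a simple list scan stands in for whatever DOM actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    offset: int
    length: int

    @property
    def end(self) -> int:
        return self.offset + self.length


def conflicts(a: Write, b: Write) -> bool:
    """Two writes conflict when their byte ranges overlap."""
    return a.offset < b.end and b.offset < a.end


class RangeSerializer:
    """Dispatch non-overlapping writes at once; queue overlapping ones in arrival order."""

    def __init__(self) -> None:
        self.in_flight: list[Write] = []
        self.waiting: list[Write] = []

    def submit(self, write: Write) -> bool:
        """Return True if the write was dispatched, False if it had to wait."""
        if any(conflicts(write, other) for other in self.in_flight + self.waiting):
            self.waiting.append(write)
            return False
        self.in_flight.append(write)
        return True

    def complete(self, write: Write) -> list[Write]:
        """Finish a write and release queued writes that are no longer blocked."""
        self.in_flight.remove(write)
        released, still_waiting = [], []
        for candidate in self.waiting:
            blockers = self.in_flight + still_waiting
            if any(conflicts(candidate, other) for other in blockers):
                still_waiting.append(candidate)
            else:
                self.in_flight.append(candidate)
                released.append(candidate)
        self.waiting = still_waiting
        return released
```

A write to an untouched range goes straight through, while a write overlapping one that is still in flight waits until that earlier write completes.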
All Flash I/O flow: DOM sends prepare request to LSOM
H1 H1
VMDK vSAN Object
1 VM issues write
2 Check for conflicting I/Os
3 Send prepare request to LSOM
VM DOM LSOM
All Flash I/O flow: LSOM commits to cache
H1 H1
VMDK vSAN Object
1 VM issues write
2 Check for conflicting I/Os
3 Send prepare request to LSOM
4 LSOM commits to cache (no dedupe)
VM DOM LSOM
All Flash I/O flow: CMMDS master is not on the I/O path
H1 H1
VMDK vSAN Object
1 VM issues write
2 Check for conflicting I/Os
3 Send prepare request to LSOM
4 LSOM commits to cache
VM DOM LSOM
I/O flow doesn’t go through the CMMDS master
All Flash I/O flow: I/O ack propagated back to VM
H1 H1
VMDK vSAN Object
1 VM issues write
2 Check for conflicting I/Os
3 Send prepare request to LSOM
4 LSOM commits to cache
5 Sends ack back to DOM
6 Sends ack back to VM
VM DOM LSOM
All Flash I/O flow: DOM sends ack back to LSOM
H1 H1
VMDK vSAN Object
1 VM issues write
2 Check for conflicting I/Os
3 Send prepare request to LSOM
4 LSOM commits to cache
5 Sends ack back to DOM
6 Sends ack back to VM
7 DOM sends ack back to LSOM
VM DOM LSOM
All Flash I/O flow: Elevator de-stages to capacity
VMDK vSAN Object
1 Block allocation: is the block allocated?
  Yes: overwrite block
  No: allocate logical block at 4MB chunk
2 Dedupe, compress, encrypt
3 Write to media @ 4KB chunk
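The allocation decision in step 1 and the write granularity in step 3 can be sketched as below. This is a simplified illustration: the set of allocated chunk indexes is a hypothetical stand-in for the real allocation metadata, and step 2 (dedupe, compress, encrypt) is not modeled, so the 4KB write count is for the untransformed data.

```python
CHUNK_BYTES = 4 * 1024 * 1024  # allocation granularity named on the slide (4MB)
BLOCK_BYTES = 4 * 1024         # media write granularity named on the slide (4KB)


def destage(offset: int, length: int, allocated_chunks: set[int]):
    """Return (per-chunk actions, number of 4KB media writes) for one de-staged extent."""
    first_chunk = offset // CHUNK_BYTES
    last_chunk = (offset + length - 1) // CHUNK_BYTES

    actions = []
    for chunk in range(first_chunk, last_chunk + 1):
        if chunk in allocated_chunks:
            actions.append((chunk, "overwrite"))   # "Is Allocated?" -> Yes
        else:
            allocated_chunks.add(chunk)            # "Is Allocated?" -> No
            actions.append((chunk, "allocate"))

    first_block = offset // BLOCK_BYTES
    last_block = (offset + length - 1) // BLOCK_BYTES
    return actions, last_block - first_block + 1
```

The first write into a 4MB region allocates it; later writes into the same region are overwrites, and an extent that straddles a chunk boundary touches both chunks.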
Agenda
​vSAN quick deep dive
​I/O flow
​Resync
Availability
Schematic representation of how Resync works
Example of full Resync
[Diagram: RAID tree R1 over two R0 branches with their components and a witness component (W); degraded state marked]
Schematic representation of how Resync works
Full Resync is initiated
[Diagram: RAID tree R1 over two R0 branches with their components and a witness component (W); degraded state marked]
A Begin Resync: R0 (C1, C2)
Schematic representation of how Resync works
Full Resync completes and degraded component is marked for deletion
[Diagram: RAID tree R1 over two R0 branches with a witness component (W); resynced R0 (C1, C2); degraded state marked for deletion]
B Resync completes & degraded components are marked for deletion
Schematic representation of how Resync works
By contrast partial rebuilds have fewer blocks to resync
[Diagram: RAID tree R1 over two R0 branches with a witness component (W); degraded state; Partial Repair]
Examples of partial rebuild
[Diagram: Partial Repair of a degraded R0 compared with resync (A) to R0 (C1, C2)]
Partial Rebuild Full Rebuild
Host comes out of maintenance mode
Recovery from transient failure
Partial or full
reconstruction
of RAID tree
v Block level copy
v No RAID tree construction
Examples of rebuilds
[Diagram: Partial Repair of a degraded R0 compared with resync (A) to R0 (C1, C2)]
Partial Rebuild Full Rebuild
Host comes out of maintenance mode
Recovery from transient failure
Permanent disk or host failure
Disk Rebalancing
Delta Writes
Finally, changing storage config is a full rebuild
[Diagram: Partial Repair of a degraded R0 compared with resync (A) to R0 (C1, C2)]
Partial Rebuild Full Rebuild
Host comes out of maintenance mode
Recovery from transient failure
Permanent disk or host failure
Disk Rebalancing
Delta Writes
Storage policy change
Agenda
​vSAN quick deep dive
​I/O flow
​Resync
Availability
First permanent failure initiates rebuild
Replica-1 Replica-2 Replica-3
1 Event 1: The first host is down
2 vSAN begins full rebuild
Intuition on planning for Availability
Probability of Availability Impact is:
Joint probability of:
v First failure followed by
v at least 2 more failures before rebuild
completes
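The joint probability above can be put into numbers with a rough model. The slide does not give a formula, so this sketch assumes independent hosts with exponentially distributed failures (rate 1/MTBF), which makes the count of further failures during the rebuild window Poisson distributed. All names and figures are hypothetical.

```python
import math


def p_at_least_k(mu: float, k: int) -> float:
    """P(N >= k) for a Poisson-distributed count N with mean mu."""
    return 1.0 - sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k))


def p_availability_impact(
    p_first_failure: float,
    hosts: int,
    mtbf_hours: float,
    rebuild_hours: float,
    extra_failures: int = 2,
) -> float:
    """P(first failure) * P(at least `extra_failures` more before the rebuild completes)."""
    # Expected number of further failures among the remaining hosts during the rebuild.
    mu = (hosts - 1) * rebuild_hours / mtbf_hours
    return p_first_failure * p_at_least_k(mu, extra_failures)


# Illustrative figures only: halving the rebuild time shrinks the exposure window.
slow = p_availability_impact(0.05, hosts=16, mtbf_hours=100_000, rebuild_hours=24)
fast = p_availability_impact(0.05, hosts=16, mtbf_hours=100_000, rebuild_hours=12)
```

Under this model the second factor depends on the rebuild duration, which is why a shorter resync lowers the chance of an availability impact.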
Factors affecting Availability
Probability of component failure
• Type of failure: disk, disk group, server
• Size of the cluster
• MTBF ratings
Factors affecting Availability
Probability of component failure
• Scope of failure: disk, disk group, server
• Size of the cluster
• MTBF ratings
Data to Resync
• Duration of failure: permanent vs. transient
• Type of failure: disk, disk group and server
Factors affecting Availability
Probability of component failure
• Type of failure: disk, disk group, server
• Size of the cluster
• MTBF ratings
Data to Resync
• Duration of failure: permanent vs. transient
• Type of failure: disk, disk group and server
Time to Resync
• Size of cluster: larger clusters have higher resync parallelization
• Resync bandwidth allocation
Approaches to improving Availability (and Durability)
Reduce Component Failures
• Select enterprise-grade drives with higher endurance and higher MTBFs
• Degraded device handling
Approaches to improving Availability (and Durability)
Reduce Component Failures
• Select enterprise-grade drives with higher endurance and higher MTBFs
• Degraded device handling
Amount of data to Resync
• CLOM repair delay settings
• Avoid policy changes
• Point Fix
• Smart Repairs
• What-if Assessments
Approaches to improving Availability (and Durability)
Reduce Component Failures
• Select enterprise-grade drives with higher endurance and higher MTBFs
• Degraded device handling
Amount of data to Resync
• CLOM repair delay settings
• Avoid policy changes
• Point Fix
• Smart Repairs
• What-if Assessments
Resync ETAs
• Adaptive Resynchronization
• General performance improvements
Performance Deep Dive
Agenda
• Performance Fundamentals
• Adaptive Resync Architecture
• Monitoring Tools
Write Buffer Architecture
​Writes go to a first tier device in a fast sequential log
​Native device bandwidth to absorb short bursts
​Cold data is deduplicated and compressed as it moves out to
second tier
[Diagram: guest writes go to the first tier, which de-stages to the capacity tier]
Write Buffer Architecture
​Writes go to a first tier device in a fast sequential log
​Native device bandwidth to absorb short bursts
​Cold data is deduplicated and compressed as it moves out to
second tier
This de-staging process is slower than first tier writes
If we have sustained write workloads, we need to smoothly find
equilibrium
[Diagram: guest writes go to the first tier, which de-stages to the capacity tier]
[Plot: bandwidth over time, with 1st tier bandwidth and capacity tier bandwidth levels]
Congestion In Action (Pre-Adaptive Resync)
​We make this transition via a congestion signal
​Congestion is adaptive – apply a greater throttle until we reach
equilibrium
​Congestion stops rising when incoming rate equals de-staging
rate
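A toy simulation shows how a congestion signal of this kind settles at equilibrium. It assumes congestion is simply proportional to how full the write buffer is and that admitted guest writes are throttled by that fraction. The slides do not give the real control law, so the function and all numbers are illustrative.

```python
def simulate(offered: float, destage: float, capacity: float, steps: int):
    """Return a list of (admitted_rate, congestion) per tick for a single write buffer."""
    fill = 0.0
    trace = []
    for _ in range(steps):
        congestion = fill / capacity              # 0.0 = no throttle, 1.0 = full throttle
        admitted = offered * (1.0 - congestion)   # guest writes accepted this tick
        drained = min(destage, fill + admitted)   # de-staging cannot drain more than is buffered
        fill = min(capacity, fill + admitted - drained)
        trace.append((admitted, congestion))
    return trace


# Offered load 1000 units/tick against a de-stage rate of 400 units/tick.
trace = simulate(offered=1000.0, destage=400.0, capacity=20_000.0, steps=300)
final_admitted, final_congestion = trace[-1]
```

In this run the buffer keeps filling, and congestion keeps rising, until the admitted rate has dropped to the de-stage rate of 400. At that point congestion levels off at about 0.6.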
[Diagram: guest writes go to the first tier, which de-stages to the capacity tier]
[Plot: bandwidth over time, with 1st tier bandwidth, capacity tier bandwidth, congestion, and equilibrium marked]
Storage devices have some parallelism, but there is a limit
​At first, more outstanding IO means more bandwidth (same latency)
​Once we hit max parallelism, more outstanding IO means more latency (same bandwidth)
Queueing Delay
Is high latency a hardware problem or a sizing problem?
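The two regimes described here follow from Little's law (outstanding IO = throughput × latency). The sketch below assumes an idealized device with a fixed per-IO service time and a hard parallelism limit; `device_model` and its parameters are hypothetical.

```python
def device_model(outstanding_io: int, max_parallelism: int, service_time_ms: float):
    """Return (iops, latency_ms) for an idealized device with a hard parallelism limit."""
    # Only up to max_parallelism IOs are serviced concurrently; the rest wait in a queue.
    in_service = min(outstanding_io, max_parallelism)
    iops = in_service / (service_time_ms / 1000.0)
    # Little's law: latency = outstanding IO / throughput.
    latency_ms = outstanding_io / iops * 1000.0
    return iops, latency_ms


below_limit = device_model(outstanding_io=8, max_parallelism=32, service_time_ms=0.5)
at_limit = device_model(outstanding_io=32, max_parallelism=32, service_time_ms=0.5)
above_limit = device_model(outstanding_io=64, max_parallelism=32, service_time_ms=0.5)
```

Going from 8 to 32 outstanding IOs raises throughput at the same latency. Going from 32 to 64 leaves throughput flat and doubles latency, which is the queueing delay the slide refers to.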
[Plots: bandwidth vs. outstanding IO; latency vs. outstanding IO]
​Often high latency is the most visible symptom
Queueing Delay
Is high latency a hardware problem or a sizing problem?
[Plots: bandwidth vs. outstanding IO; latency vs. outstanding IO]
Did we push the system too far?
Or is there an issue with the hardware?
​Before: The more resyncs were happening, the larger the share of destage bandwidth.
• Many resyncs + low workload → drives up latency of VM IO
• Few resyncs + high workload → resync takes a long time
​Adaptive Resync: resync should get 20% of the bandwidth (if contended)
• We can use more if the guest IO is underutilizing the device
​Upgrade, policy change, rebalance should not be scary or take too long due to unfairness.
Adaptive Resync Customer Visible Before-and-After
​We are using Congestion to provide three different properties:
• Discover the bandwidth of the devices
• Fairly balance different classes of IO (80% guest IO, 20% resync IO)
• Push back on clients to slow down
​New approach: have a separate layer for
each guarantee.
What does Congestion try to do before Adaptive Resync?
Bandwidth Regulator
Fairness Scheduler
Back Pressure
Backend
Adaptive Resync Deep Dive
Per Disk-Group scheduler
Bandwidth regulator discovers the destaging rate
• Adaptive signal: write buffer fill
• Adaptive throttle: bandwidth limit
​Dispatch Scheduler fairly balances different
classes of IO
• (80% guest IO, 20% resync IO)
​Back pressure congestion pushes back on clients to
slow down
• Adaptive signal: scheduler queue fill
• Adaptive throttle: latency per op
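The back-pressure layer listed above (signal: scheduler queue fill, throttle: latency per op) could look roughly like this. The slide names only the signal and the throttle, so the linear ramp, the 50% starting point, and the 50 ms ceiling are all assumptions made for illustration.

```python
def backpressure_delay_ms(
    queue_depth: int,
    queue_capacity: int,
    start_fraction: float = 0.5,   # hypothetical: no delay until the queue is half full
    max_delay_ms: float = 50.0,    # hypothetical ceiling on the added latency per op
) -> float:
    """Map scheduler queue fill to an added per-op latency that slows clients down."""
    fill = min(1.0, queue_depth / queue_capacity)
    if fill <= start_fraction:
        return 0.0
    # Ramp linearly from 0 at start_fraction to max_delay_ms at a full queue.
    return max_delay_ms * (fill - start_fraction) / (1.0 - start_fraction)
```

A client that sees the added latency issues new IO more slowly. This is how latency can act as the throttle without the scheduler seeing what is waiting on the client side.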
[Diagram labels: Clients; Back-pressure Congestion; Dispatch Scheduler (queues generate back-pressure); Bandwidth Regulator (DOM); LSOM fullness signal (LSOM congestion); WB Fill; Adaptive (x2)]
The Technical Challenges
Manage Write Buffer Fullness
We want to fairly share the write buffer
Adaptively discover the bandwidth
• Adding latency is not fair
Fairly share between IO classes
• Resync
• VM
• Namespace
Easy because you can see what’s waiting.
Put Backpressure on the Clients
Difficult to share bandwidth across hosts
Can’t see across the wire into what’s waiting on the other side
Need to allocate and reclaim shares.
• Complex, timing based
Instead we use latency
• Don’t need to see what’s waiting
And you can monitor this all in
vSphere
We’ll show the graphs at every layer
Cluster Level View
Sequential Write Workload
Diagram
[Diagram labels: Clients; Back-pressure Congestion; Dispatch Scheduler (queues generate back-pressure); Bandwidth Regulator (DOM); LSOM fullness signal (congestion); WB Fill]
Virtual Machine View
Sequential Write Workload
Diving into the backend
Answer the following questions:
• Too many Outstanding IO?
• Is it first tier latency?
• Is it de-staging latency?
• Device or Network Issue?
The top half shows if we have
too much Outstanding IO
This is where we can see if it is
a sizing issue (too much IO
queuing up)
A very high amount of
Outstanding IO causes
backpressure congestion
Backend = Latency including
queues and below
Disk Groups are where we
see first tier latency
This is where we see the de-
stage rate
Disk Groups congestion
shows the signal from
LSOM
Disk Groups congestion
comes from WB Fill. Also:
• Many log entries (small
writes, many objects)
• Component Congestion
(small writes, one object)
• Memory usage (rare)
Diving into the backend
Answer the following questions:
• Is it first tier performance?
• Is it de-staging performance?
• Too many Outstanding IO?
• Device or Network Issue?
What about resync fairness?
• Should be in 4:1 ratio
• Ratio is measured on
normalized bandwidth (penalty
for small IOs)
• If one type is not using the
whole bandwidth, the other can
claim the leftover
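The 4:1 rule with leftover reclaim can be sketched as a simple allocation function. It works on raw demand figures and does not model the normalization penalty for small IOs mentioned above. `split_bandwidth` is a hypothetical helper, not the scheduler's actual algorithm.

```python
def split_bandwidth(total: float, guest_demand: float, resync_demand: float,
                    guest_pct: int = 80):
    """Return (guest, resync) bandwidth: an 80/20 split under contention, leftover reclaimable."""
    guest_fair = total * guest_pct / 100
    resync_fair = total - guest_fair

    # Each class first gets at most its fair share.
    guest = min(guest_demand, guest_fair)
    resync = min(resync_demand, resync_fair)

    # Whatever one class leaves unused can be claimed by the other.
    leftover = total - guest - resync
    extra_guest = min(leftover, guest_demand - guest)
    guest += extra_guest
    leftover -= extra_guest
    resync += min(leftover, resync_demand - resync)
    return guest, resync


contended = split_bandwidth(1000, guest_demand=2000, resync_demand=2000)
idle_guest = split_bandwidth(1000, guest_demand=100, resync_demand=2000)
no_resync = split_bandwidth(1000, guest_demand=2000, resync_demand=0)
```

When both classes want more than the device can deliver, the split is 800 to 200. When guest IO only needs 100, resync takes the remaining 900, and with no resync running guest IO gets all 1000.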
Resync Fairness Applies even
when we have congestion
Now you can upgrade and do
maintenance with peace of mind
Get Ahead of the Curve – vSAN Private Beta
​vSAN Data Protection
​Native enterprise-grade
protection
​vSAN File Services
​Expanding vSAN beyond
block storage
​Cloud Native Storage
​Persistent storage for
containers
Sign up at http://www.vmware.com/go/vsan-beta
Hyper-V Best Practices & Tips and Tricks
 
AWS Summit Auckland - Sponsor Presentation - Zerto
AWS Summit Auckland - Sponsor Presentation - ZertoAWS Summit Auckland - Sponsor Presentation - Zerto
AWS Summit Auckland - Sponsor Presentation - Zerto
 
ZERTO Introduction to End User Presentation
ZERTO Introduction to End User PresentationZERTO Introduction to End User Presentation
ZERTO Introduction to End User Presentation
 
Load Balancing, Failover and Scalability with ColdFusion
Load Balancing, Failover and Scalability with ColdFusionLoad Balancing, Failover and Scalability with ColdFusion
Load Balancing, Failover and Scalability with ColdFusion
 
Modern Application Configuration in Kubernetes
Modern Application Configuration in KubernetesModern Application Configuration in Kubernetes
Modern Application Configuration in Kubernetes
 

Recently uploaded

The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
buds n tech IT solutions
buds n  tech IT                solutionsbuds n  tech IT                solutions
buds n tech IT solutionsmonugehlot87
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 

Recently uploaded (20)

The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
buds n tech IT solutions
buds n  tech IT                solutionsbuds n  tech IT                solutions
buds n tech IT solutions
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 

vSAN Performance and Resiliency at Scale

  • 1. vSAN Resiliency and Performance @ Scale. Sumit Lahiri, Product Line Manager. Eric Knauft, Staff Engineer. #vmworld #HCI2427BU
  • 2. Agenda. ©2018 VMware, Inc. vSAN quick deep dive; I/O flow; Resynchronization; Availability; Performance Deep Dive: Eric
  • 3. Disk layout in a single vSAN server: five disk groups, with cache and capacity tiers. Disk groups contribute to a single vSAN datastore in the vSphere cluster. vSAN Datastore: max 64 nodes; min 2 nodes (ROBO); max 5 disk groups per host; 2 tiers per disk group.
  • 4. vSAN very quick overview. vSAN Datastore: pools local storage into a single resource pool; delivers enterprise-grade scale and performance; managed through policies; integrates compute and storage management into a single pane.
  • 5. vSAN Component Layout. VMDK (512GB) with HFT = 2, FTM = RAID-1, Stripe Width = 2: one RAID-1 node (R1) over three RAID-0 branches (R0), each with components C1 and C2 (SIZE = 256GB). Note: no blocks are allocated at this time. Witness components not shown.
  • 6. Each replica on a different fault domain (e.g. host). VMDK (512GB) with HFT = 2, FTM = RAID-1, Stripe Width = 2: one RAID-1 node (R1) over three RAID-0 branches (R0), each with components C1 and C2 (SIZE = 256GB, BLOCKS: 4MB). Witness components not shown.
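The arithmetic behind slides 5 and 6 can be sketched in a few lines. This is an illustration only: the `Policy` class and `raid1_layout` function are hypothetical names, and the sketch assumes plain RAID-1 mirroring (one extra replica per tolerated failure) with every stripe component sized equally, ignoring witness components as the slides do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical container for the two policy values shown on the slide."""
    failures_to_tolerate: int  # "HFT" on the slide
    stripe_width: int


def raid1_layout(vmdk_gb: int, policy: Policy) -> dict:
    """Count replicas and data components for a mirrored, striped object."""
    # Mirroring needs one more copy than the number of failures to tolerate.
    replicas = policy.failures_to_tolerate + 1
    # Each replica is a RAID-0 stripe made of `stripe_width` components.
    data_components = replicas * policy.stripe_width
    # Assumption: the stripe splits the VMDK evenly across its components.
    component_gb = vmdk_gb / policy.stripe_width
    return {
        "replicas": replicas,
        "data_components": data_components,
        "component_gb": component_gb,
    }


layout = raid1_layout(512, Policy(failures_to_tolerate=2, stripe_width=2))
print(layout)
```

With the slide's values (512GB, HFT = 2, stripe width 2) this gives three replicas, six data components and 256GB per component, matching the three R0 branches with C1 and C2 in the diagram.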
  • 7. CMMDS: maintains an inventory of all things vSAN. C: Cluster, M: Membership, M: Monitoring, D: Directory, S: Service. Distributed directory service; in-memory; persisted on disk; elects the object owner. Inventory: vSAN objects and placement; storage policies; RAID configurations; cluster membership.
  • 8. Master receives updates from all other nodes. Node roles: Master Node, Backup Node, Agent Node. The master receives updates from all hosts in the cluster; other nodes subscribe for object-specific updates. Inventory: vSAN objects and placement; storage policies; RAID configurations; cluster membership.
  • 9. CLOM: ensures an object has a configuration that matches the policy. CLOM: Cluster Level Object Manager. One per node; finds a placement configuration that will meet the policy; needs to be aware of the placement of all objects on the node; communicates with the CMMDS service running on the same node.
  • 10. DOM: manages I/O flow from the VM. DOM: Distributed Object Manager. One per object; implements the placement configuration prescribed by CLOM; ensures object consistency (creation, rebuild and reconfiguration); implements distributed RAID logic.
  • 11. Schematic representation of a single VMDK deployment: steady-state layout. Diagram labels: components C1, C2, C1, C2; witness (W); CMMDS master; DOM: distributed object owner.
  • 12. Schematic representation of a single VMDK deployment: each partition elects its CMMDS master. Callouts: each partition elects its master; object has quorum and availability; DOM owner created in an inaccessible state. Partitions: Partition-01, Partition-02.
  • 13. Schematic representation of a single VMDK deployment: each partition elects its CMMDS master when there is a network partition. Callouts: each partition elects its master; object has quorum and availability; the partition meets the liveness criteria for the object; VM HA to the partition that meets the liveness criteria. Partitions: Partition-01, Partition-02.
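The partition slides can be illustrated with a minimal liveness check. The rule used here is an assumption for illustration, not taken from the slides: every component (including the witness) carries one vote, and a partition can serve the object only if it holds more than half of the votes and all components of at least one replica. The names `Component` and `is_accessible` are hypothetical.

```python
from typing import NamedTuple, Optional


class Component(NamedTuple):
    name: str
    replica: Optional[str]  # None marks a witness, which holds no data
    votes: int = 1          # assumption: one vote per component


def is_accessible(reachable: set, layout: list) -> bool:
    """Return True if the reachable components have quorum and a full replica."""
    total_votes = sum(c.votes for c in layout)
    votes_here = sum(c.votes for c in layout if c.name in reachable)
    has_quorum = 2 * votes_here > total_votes  # strictly more than half

    replicas = {c.replica for c in layout if c.replica is not None}
    has_full_copy = any(
        all(c.name in reachable for c in layout if c.replica == r)
        for r in replicas
    )
    return has_quorum and has_full_copy


# Layout from the schematic: two replicas of (C1, C2) plus a witness W.
layout = [
    Component("A.C1", "A"), Component("A.C2", "A"),
    Component("B.C1", "B"), Component("B.C2", "B"),
    Component("W", None),
]
partition_01 = {"A.C1", "A.C2", "W"}
partition_02 = {"B.C1", "B.C2"}
print(is_accessible(partition_01, layout), is_accessible(partition_02, layout))
```

Under these assumptions only the partition holding a full replica plus the witness keeps the object accessible, which is the partition a VM would be restarted in.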
  • 14. Agenda: vSAN quick deep dive; I/O flow; Resync; Availability
  • 15. All-flash I/O flow: architectural layout. Diagram labels: hosts H1, H2, H3; VMDK; cache tier; capacity tier; Replica-1; Replica-2; DOM: distributed object owner.
  • 16. All-flash I/O flow: DOM and LSOM. Diagram labels: hosts H1, H2, H3; VMDK; cache tier; capacity tier; Replica-1; Replica-2; DOM: distributed object owner; LSOM: log-structured object manager.
  • 17. All-flash I/O flow: I/O issued by VM. Host H1: VM, DOM, LSOM; VMDK is a vSAN object. (1) VM issues write. DOM: one per object.
  • 18. All-flash I/O flow: DOM checks for free space. (1) VM issues write. (2) Check for conflicting I/Os on the same I/O range and serialize the request.
  • 19. All-flash I/O flow: DOM sends prepare request to LSOM. (1) VM issues write. (2) Check for conflicting I/Os. (3) Send prepare request to LSOM.
  • 20. All-flash I/O flow: LSOM commits to cache. (1) VM issues write. (2) Check for conflicting I/Os. (3) Send prepare request to LSOM. (4) LSOM commits to cache; no dedupe.
  • 21. All-flash I/O flow: CMMDS master is not on the I/O path. (1) VM issues write. (2) Check for conflicting I/Os. (3) Send prepare request to LSOM. (4) LSOM commits to cache. I/O flow doesn't go through the CMMDS master.
  • 22. All-flash I/O flow: I/O ack propagated back to VM. (1) VM issues write. (2) Check for conflicting I/Os. (3) Send prepare request to LSOM. (4) LSOM commits to cache. (5) Sends ack back to DOM. (6) Sends ack back to VM.
  • 23. All-flash I/O flow: DOM sends ack back to LSOM. (1) VM issues write. (2) Check for conflicting I/Os. (3) Send prepare request to LSOM. (4) LSOM commits to cache. (5) Sends ack back to DOM. (6) Sends ack back to VM. (7) DOM sends ack back to LSOM.
  • 24. All-flash I/O flow: elevator de-stages to capacity. (1) Block allocation: is the block allocated? Yes: overwrite the block. No: allocate a logical block at 4MB chunk. (2) Dedupe, compress, encrypt. (3) Write to media at 4KB chunk.
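Step 2 of the write path (check for conflicting I/Os on the same range and serialize the request) can be sketched as follows. This is a toy model and not DOM's actual implementation: the `RangeSerializer` class, its method names and the (offset, length) representation are all assumptions made for illustration.

```python
from collections import deque


def overlaps(a_off: int, a_len: int, b_off: int, b_len: int) -> bool:
    """True if the half-open byte ranges [off, off+len) intersect."""
    return a_off < b_off + b_len and b_off < a_off + a_len


class RangeSerializer:
    """Lets non-overlapping writes run in parallel and queues overlapping ones."""

    def __init__(self):
        self.in_flight = []     # (offset, length) of writes currently running
        self.waiting = deque()  # writes held back, in arrival order

    def submit(self, offset: int, length: int) -> bool:
        """Return True if the write may start now, False if it was queued."""
        # Also check queued writes so a later write never overtakes an
        # earlier one that touches the same range.
        pending = self.in_flight + list(self.waiting)
        if any(overlaps(offset, length, o, n) for o, n in pending):
            self.waiting.append((offset, length))
            return False
        self.in_flight.append((offset, length))
        return True

    def complete(self, offset: int, length: int) -> list:
        """Finish a write and return the queued writes that can now start."""
        self.in_flight.remove((offset, length))
        started, still_waiting = [], deque()
        for io in self.waiting:
            blockers = self.in_flight + list(still_waiting)
            if any(overlaps(*io, *b) for b in blockers):
                still_waiting.append(io)
            else:
                self.in_flight.append(io)
                started.append(io)
        self.waiting = still_waiting
        return started


s = RangeSerializer()
print(s.submit(0, 4096))      # starts immediately
print(s.submit(2048, 4096))   # overlaps the first write, so it is queued
print(s.submit(8192, 4096))   # disjoint range, starts immediately
print(s.complete(0, 4096))    # the queued write is released
```

The point of the sketch is only the ordering property: writes to disjoint ranges proceed concurrently, while writes to the same range are applied in arrival order.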
  • 25. Agenda: vSAN quick deep dive; I/O flow; Resync; Availability
  • 26. Schematic representation of how resync works: example of full resync. Diagram labels: R1; R0 branches with their components; witness component (W); degraded state.
  • 27. Schematic representation of how resync works: full resync is initiated. Diagram labels as on slide 26, plus a new R0 branch (C1, C2). Callout A: begin resync.
  • 28. Schematic representation of how resync works: full resync completes and the degraded component is marked for deletion. Diagram labels as on slide 27. Callout B: resync completes and degraded components are marked for deletion.
  • 29. Schematic representation of how resync works: by contrast, partial rebuilds have fewer blocks to resync. Diagram labels: R1; R0 branches with their components; witness component (W); degraded state; partial repair.
  • 30. Examples of partial rebuild. Partial rebuild (block-level copy; no RAID tree construction): host comes out of maintenance mode; recovery from transient failure. Full rebuild: partial or full reconstruction of the RAID tree.
  • 31. Examples of rebuilds. Partial rebuild: host comes out of maintenance mode; recovery from transient failure. Full rebuild: permanent disk or host failure; disk rebalancing. Delta writes.
  • 32. Finally, changing the storage configuration is a full rebuild. Partial rebuild: host comes out of maintenance mode; recovery from transient failure. Full rebuild: permanent disk or host failure; disk rebalancing; storage policy change. Delta writes.
  • 33. Agenda: vSAN quick deep dive; I/O flow; Resync; Availability
  • 34. First permanent failure initiates rebuild. Replica-1, Replica-2, Replica-3. (1) Event 1: the first host is down. (2) vSAN begins full rebuild.
  • 35. Intuition on planning for availability. The probability of an availability impact is the joint probability of a first failure followed by at least 2 more failures before the rebuild completes.
  • 36. Factors affecting availability. Probability of component failure: type of failure (disk, disk group, server); size of the cluster; MTBF ratings.
  • 37. Factors affecting availability. Probability of component failure: scope of failure (disk, disk group, server); size of the cluster; MTBF ratings. Data to resync: duration of failure (permanent vs. transient); type of failure (disk, disk group and server).
  • 38. Factors affecting availability. Probability of component failure: type of failure (disk, disk group, server); size of the cluster; MTBF ratings. Data to resync: duration of failure (permanent vs. transient); type of failure (disk, disk group and server). Time to resync: size of cluster (larger clusters have higher resync parallelization); resync bandwidth allocation.
  • 39. Approaches to improving availability (and durability). Reduce component failures: select enterprise-grade drives with higher endurance and higher MTBFs; degraded device handling.
  • 40. Approaches to improving availability (and durability). Reduce component failures: select enterprise-grade drives with higher endurance and higher MTBFs; degraded device handling. Amount of data to resync: CLOM repair delay settings; avoid policy changes; point fix; smart repairs; what-if assessments.
  • 41. Approaches to improving availability (and durability). Reduce component failures: select enterprise-grade drives with higher endurance and higher MTBFs; degraded device handling. Amount of data to resync: CLOM repair delay settings; avoid policy changes; point fix; smart repairs; what-if assessments. Resync ETAs: adaptive resynchronization; general performance improvements.
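The intuition on slide 35 (a first failure followed by at least 2 more before the rebuild completes) can be turned into a rough back-of-the-envelope estimate. The model below is a deliberately crude assumption and not VMware's method: hosts fail independently with exponentially distributed lifetimes given by an MTBF, the first failure has already happened, and any two further host failures during the rebuild window count as an impact (which overstates the risk for a single object). All function names are hypothetical.

```python
import math


def p_fail_within(window_hours: float, mtbf_hours: float) -> float:
    """Probability that one host fails within the window (exponential model)."""
    return 1.0 - math.exp(-window_hours / mtbf_hours)


def p_at_least_k(n: int, p: float, k: int) -> float:
    """Probability that at least k of n independent hosts fail (binomial)."""
    p_fewer = sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))
    return 1.0 - p_fewer


def p_impact_after_first_failure(remaining_hosts: int, rebuild_hours: float,
                                 mtbf_hours: float, extra_failures: int = 2) -> float:
    """Chance of `extra_failures` more failures before the rebuild finishes."""
    p = p_fail_within(rebuild_hours, mtbf_hours)
    return p_at_least_k(remaining_hosts, p, extra_failures)


# Illustrative numbers only: 15 surviving hosts, MTBF of 100,000 hours.
slow = p_impact_after_first_failure(15, rebuild_hours=10.0, mtbf_hours=1e5)
fast = p_impact_after_first_failure(15, rebuild_hours=1.0, mtbf_hours=1e5)
print(f"10h rebuild: {slow:.3e}   1h rebuild: {fast:.3e}")
```

Even in this simplified form the estimate shows why the approaches on slides 39 to 41 matter: lowering the per-device failure probability and shortening the rebuild window both shrink the result.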
  • 42. Performance Deep Dive agenda: • Performance fundamentals • Adaptive Resync architecture • Monitoring tools
  • 43. Write buffer architecture. Writes go to a first-tier device in a fast sequential log. Native device bandwidth to absorb short bursts. Cold data is deduplicated and compressed as it moves out to the second tier. Diagram: guest writes → first tier → de-staging → capacity tier.
  • 44. Write buffer architecture. Writes go to a first-tier device in a fast sequential log. Native device bandwidth to absorb short bursts. Cold data is deduplicated and compressed as it moves out to the second tier. This de-staging process is slower than first-tier writes. If we have sustained write workloads, we need to smoothly find equilibrium. Chart: bandwidth over time, first-tier bandwidth vs. capacity-tier bandwidth.
  • 45. Congestion in action (pre-Adaptive Resync). We make this transition via a congestion signal. Congestion is adaptive: apply a greater throttle until we reach equilibrium. Congestion stops rising when the incoming rate equals the de-staging rate. Chart: bandwidth over time, first-tier bandwidth vs. capacity-tier bandwidth, with congestion and equilibrium marked.
  • 46. Queueing delay: is high latency a hardware problem or a sizing problem? Storage devices have some parallelism, but there is a limit. At first, more outstanding IO means more bandwidth (same latency). Once we hit max parallelism, more outstanding IO means more latency (same bandwidth). Charts: bandwidth vs. outstanding IO; latency vs. outstanding IO.
  • 47. Queueing delay: is high latency a hardware problem or a sizing problem? Often high latency is the most visible symptom. Did we push the system too far? Or is there an issue with hardware? Charts: bandwidth vs. outstanding IO; latency vs. outstanding IO.
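The two curves on the queueing-delay slides follow from a very simple device model combined with Little's law (outstanding IO = throughput × latency). The sketch below assumes an idealized device that can service up to `max_parallelism` requests at once, each taking a fixed `service_time_ms`; both parameters and the function name are hypothetical, and real devices are less clean.

```python
def device_model(outstanding_io: int, max_parallelism: int,
                 service_time_ms: float) -> tuple:
    """Return (IOPS, average latency in ms) for a given queue depth."""
    if outstanding_io < 1:
        raise ValueError("outstanding_io must be at least 1")
    # Only `max_parallelism` requests make progress at the same time.
    busy = min(outstanding_io, max_parallelism)
    iops = busy * 1000.0 / service_time_ms
    # Little's law: latency = outstanding IO / throughput.
    latency_ms = outstanding_io * 1000.0 / iops
    return iops, latency_ms


for oio in (1, 8, 32, 64, 128):
    iops, latency = device_model(oio, max_parallelism=32, service_time_ms=0.5)
    print(f"OIO={oio:4d}  IOPS={iops:9.0f}  latency={latency:.2f} ms")
```

Below the parallelism limit, throughput grows with outstanding IO while latency stays flat; above it, throughput is flat and latency grows linearly. In this model high latency at high outstanding IO points to sizing, while high latency at low outstanding IO points to the device itself.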
  • 48. Adaptive Resync: customer-visible before and after. Before: the more resyncs were happening, the larger their share of de-stage bandwidth. • Many resyncs + low workload → drives up latency of VM IO • Few resyncs + high workload → resync takes a long time. Adaptive Resync: resync should get 20% of the bandwidth (if contended). • It can use more if the guest IO is underutilizing the device. Upgrade, policy change and rebalance should not be scary or take too long due to unfairness.
  • 49. What does congestion try to do before Adaptive Resync? We are using congestion to provide three different properties: • Discover the bandwidth of the devices • Fairly balance different classes of IO (80% guest IO, 20% resync IO) • Push back on clients to slow down. New approach: have a separate layer for each guarantee. Layers: bandwidth regulator; fairness scheduler; back pressure; backend.
  • 50. Adaptive Resync deep dive. Per-disk-group scheduler. The bandwidth regulator discovers the de-staging rate. • Adaptive signal: write buffer fill • Adaptive throttle: bandwidth limit. The dispatch scheduler fairly balances different classes of IO. • (80% guest IO, 20% resync IO). Back-pressure congestion pushes back on clients to slow down. • Adaptive signal: scheduler queue fill • Adaptive throttle: latency per op. Diagram labels: clients; back-pressure congestion; dispatch scheduler (queues generate back-pressure); bandwidth regulator; DOM; LSOM; WB fill; fullness signal (LSOM congestion).
  • 51. The technical challenges. Manage write buffer fullness: we want to fairly share the write buffer. Adaptively discover the bandwidth. • Adding latency is not fair. Fairly share between IO classes. • Resync • VM • Namespace. Easy because you can see what's waiting. Put backpressure on the clients: difficult to share bandwidth across hosts. Can't see across the wire into what's waiting on the other side. Need to allocate and reclaim shares. • Complex, timing based. Instead we use latency. • Don't need to see what's waiting.
  • 52. And you can monitor this all in vSphere. We'll show the graphs at every layer.
  • 53. Cluster-level view: sequential write workload.
  • 54. Diagram labels: clients; back-pressure congestion; dispatch scheduler (queues generate back-pressure); bandwidth regulator; DOM; LSOM; WB fill; fullness signal (congestion).
  • 55. Virtual machine view: sequential write workload.
  • 58. Diving into the backend. Answer the following questions: • Too many outstanding IO? • Is it first-tier latency? • Is it de-staging latency? • Device or network issue?
  • 59. Diagram (labels as on slide 54). The top half shows if we have too much outstanding IO.
  • 60. Diagram (labels as on slide 54). This is where we can see if it is a sizing issue (too much IO queuing up).
  • 61. Diagram (labels as on slide 54). A very high amount of outstanding IO causes back-pressure congestion.
  • 62. Diagram (labels as on slide 54). Backend = latency including queues and below.
  • 63. Diagram (labels as on slide 54). Disk groups are where we see first-tier latency.
  • 64. Diagram (labels as on slide 54). This is where we see the de-stage rate.
  • 65. Diagram (labels as on slide 54). Disk group congestion shows the signal from LSOM.
  • 66. Diagram (labels as on slide 54). Disk group congestion comes from WB fill. Also: • Many log entries (small writes, many objects) • Component congestion (small writes, one object) • Memory usage (rare)
  • 67. Diving into the backend. Answer the following questions: • Is it first-tier performance? • Is it de-staging performance? • Too many outstanding IO? • Device or network issue?
  • 69. What about resync fairness?
  • 70. • Guest IO and resync IO should be in a 4:1 ratio • The ratio is measured on normalized bandwidth (penalty for small IOs) • If one type is not using the whole bandwidth, the other can claim the leftover
  • 71. Resync fairness applies even when we have congestion.
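The fairness rule on slide 70 (a 4:1 split between guest and resync IO, with either side allowed to claim what the other leaves unused) can be sketched as a small allocation function. This illustrates the rule as stated on the slides and is not the actual dispatch scheduler; the function name and parameters are hypothetical, and the normalization of bandwidth for small IOs is not modelled.

```python
def split_bandwidth(total: float, guest_demand: float, resync_demand: float,
                    guest_weight: int = 4, resync_weight: int = 1) -> tuple:
    """Return (guest, resync) bandwidth under a weighted fair split."""
    # Fair shares under contention: 4:1 means 80% guest, 20% resync.
    guest_fair = total * guest_weight / (guest_weight + resync_weight)
    resync_fair = total - guest_fair

    # Neither class receives more than it asks for.
    guest = min(guest_demand, guest_fair)
    resync = min(resync_demand, resync_fair)

    # Whatever one class leaves unused can be claimed by the other.
    leftover = total - guest - resync
    extra_guest = min(leftover, guest_demand - guest)
    guest += extra_guest
    leftover -= extra_guest
    resync += min(leftover, resync_demand - resync)
    return guest, resync


print(split_bandwidth(1000, guest_demand=2000, resync_demand=2000))  # both contended
print(split_bandwidth(1000, guest_demand=100, resync_demand=2000))   # guest mostly idle
print(split_bandwidth(1000, guest_demand=2000, resync_demand=50))    # little resync
```

When both classes want more than the device can deliver, the split settles at 80/20; when guest IO is light, resync absorbs the unused bandwidth, and vice versa.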
  • 72. Now you can upgrade and do maintenance with peace of mind.
  • 73. Get ahead of the curve: vSAN Private Beta. vSAN Data Protection: native enterprise-grade protection. vSAN File Services: expanding vSAN beyond block storage. Cloud Native Storage: persistent storage for containers. Sign up at http://www.vmware.com/go/vsan-beta