Master VMware Performance and Capacity ManagementIwan Rahabok
12 Sep 2016 update: See this http://virtual-red-dot.info/operationalize-sddc-program-2/ for details.
-------------
Based on the book http://virtual-red-dot.info/performance-and-capacity-management/
Master performance and capacity management of VMware SDDC
Building vSphere Perf Monitoring ToolsPablo Roesch
Balaji and Ravi present on how to build vSphere monitoring tools using the vSphere APIs - this is a must view for anyone managing a large complex environment. For vSphere SDKs, API visit http://developer.vmware.com Blogs, Forums, Sample Code
XPDDS18: The Art of Virtualizing Cache Maintenance - Julien Grall, ArmThe Linux Foundation
The Arm architecture allows for a wide variety of cache configurations, levels and features. This enables building systems that will optimally fit power/area budgets set for the target application.
A consequence of this is that architecturally compliant software has to cater for a much wider range of behaviors than on other architectures. While most software uses cache instructions that don't need special treatment in a virtualized environment, some will want to directly manage a given cache using set/way instructions and will introduce challenges for the hypervisor to handle them.
This talk will give an overview of how caches behave in the Arm architecture, especially in the context of virtualization. It will then describe the problem of using set/way instructions in a virtualized environment. We will also discuss the modifications required in Xen to handle those instructions.
XPDDS18: Memory Overcommitment in XEN - Huang Zhichao, HuaweiThe Linux Foundation
This talk will introduce our practice on Memory Overcommitment in XEN, and share some findings and lessons. i.e.: The best practice of POD(Populate On Demand), including live migration of the POD pages; Introduce the mem-shr, a memory-saving de-duplication feature, to merge the samepages; Xenpaging optimizing, including some policy enhancements; Scalability investigation and enhancements on Memory Overcommitment; What fields Memory Overcommitment benefits Huawei Cloud, and some performance data
Master VMware Performance and Capacity ManagementIwan Rahabok
12 Sep 2016 update: See this http://virtual-red-dot.info/operationalize-sddc-program-2/ for details.
-------------
Based on the book http://virtual-red-dot.info/performance-and-capacity-management/
Master performance and capacity management of VMware SDDC
Building vSphere Perf Monitoring ToolsPablo Roesch
Balaji and Ravi present on how to build vSphere monitoring tools using the vSphere APIs - this is a must view for anyone managing a large complex environment. For vSphere SDKs, API visit http://developer.vmware.com Blogs, Forums, Sample Code
XPDDS18: The Art of Virtualizing Cache Maintenance - Julien Grall, ArmThe Linux Foundation
The Arm architecture allows for a wide variety of cache configurations, levels and features. This enables building systems that will optimally fit power/area budgets set for the target application.
A consequence of this is that architecturally compliant software has to cater for a much wider range of behaviors than on other architectures. While most software uses cache instructions that don't need special treatment in a virtualized environment, some will want to directly manage a given cache using set/way instructions and will introduce challenges for the hypervisor to handle them.
This talk will give an overview of how caches behave in the Arm architecture, especially in the context of virtualization. It will then describe the problem of using set/way instructions in a virtualized environment. We will also discuss the modifications required in Xen to handle those instructions.
XPDDS18: Memory Overcommitment in XEN - Huang Zhichao, HuaweiThe Linux Foundation
This talk will introduce our practice on Memory Overcommitment in XEN, and share some findings and lessons. i.e.: The best practice of POD(Populate On Demand), including live migration of the POD pages; Introduce the mem-shr, a memory-saving de-duplication feature, to merge the samepages; Xenpaging optimizing, including some policy enhancements; Scalability investigation and enhancements on Memory Overcommitment; What fields Memory Overcommitment benefits Huawei Cloud, and some performance data
Intel(R) Xeon(R) E7 v3-based X6 platforms + Lenovo Flex System Interconnect Fabric solutions deliver a highly-reliable, cost-efficient and scalable system for your data center.
WinConnections Spring, 2011 - 30 Bite-Sized Tips for Best vSphere and Hyper-V...Concentrated Technology
At the end of the day, virtualization is all about performance. If you squish together 20 VMs onto a single host and they don’t perform well, then you’ve failed at your job. Conversely, if you’ve constructed the environment correctly, you win. In this fun and exciting session, Friend-of-the-Virtual-Machine Greg Shields presents 30 of his very best tips that you can immediately implement. Who knows, you might find one or two that solve your performance problems overnight!
Авторский учебный курс от Архитектора Microsoft Алексея Кибкало.
Что такое Hyper-V
Версии Windows Server 2012 Hyper-V
Аппаратные требования к Windows Server 2012 Hyper-V
Установка Hyper-V
Сетевые возможности Windows Server 2012 Hyper-V
Что такое Live Migration
Высокодоступные кластеры Windows Server 2012 Hyper-V
Аварийное восстановление и Hyper-V Replica
Азы управления при помощи System Center
При поддержке "Звезды и С" www.stars-s.ru
IBM Storage is ideally suited for SAP HANA workloads. Learn more about our certified offerings for SAP HANA TDI, and all about our flash storage technologies and systems. Our Storage is built to power the future of IT!
This presentation provides an introduction to the current activities leading to software architectures and methodologies for new NVM technologies, including the activities of the SNIA Non-Volatile Memory (NVM) Technical Working Group. This session includes a review and discussion of the impacts of the SNIA NVM Programming Model (NPM). We will preview the current work on new technologies, including remote access, high availability, clustering, atomic transactions, error management, and current methodologies for dealing with NVM.
Intel(R) Xeon(R) E7 v3-based X6 platforms + Lenovo Flex System Interconnect Fabric solutions deliver a highly-reliable, cost-efficient and scalable system for your data center.
WinConnections Spring, 2011 - 30 Bite-Sized Tips for Best vSphere and Hyper-V...Concentrated Technology
At the end of the day, virtualization is all about performance. If you squish together 20 VMs onto a single host and they don’t perform well, then you’ve failed at your job. Conversely, if you’ve constructed the environment correctly, you win. In this fun and exciting session, Friend-of-the-Virtual-Machine Greg Shields presents 30 of his very best tips that you can immediately implement. Who knows, you might find one or two that solve your performance problems overnight!
Авторский учебный курс от Архитектора Microsoft Алексея Кибкало.
Что такое Hyper-V
Версии Windows Server 2012 Hyper-V
Аппаратные требования к Windows Server 2012 Hyper-V
Установка Hyper-V
Сетевые возможности Windows Server 2012 Hyper-V
Что такое Live Migration
Высокодоступные кластеры Windows Server 2012 Hyper-V
Аварийное восстановление и Hyper-V Replica
Азы управления при помощи System Center
При поддержке "Звезды и С" www.stars-s.ru
IBM Storage is ideally suited for SAP HANA workloads. Learn more about our certified offerings for SAP HANA TDI, and all about our flash storage technologies and systems. Our Storage is built to power the future of IT!
This presentation provides an introduction to the current activities leading to software architectures and methodologies for new NVM technologies, including the activities of the SNIA Non-Volatile Memory (NVM) Technical Working Group. This session includes a review and discussion of the impacts of the SNIA NVM Programming Model (NPM). We will preview the current work on new technologies, including remote access, high availability, clustering, atomic transactions, error management, and current methodologies for dealing with NVM.
that was my first presentation at school .. i'll be glad to have some opinions to improve my next presentations :) thank u
this is my mail for contact : aalhaj45@gmail.com
Исследование поведения на рынке ипотечного кредитования. Спрос на кредитные и ипотечные продукты. Описание потенциального заемщика и его основные проблемы. Рефинансирование. Финансовая грамотность населения.
STARTPAGE-HOME.COM is classified as a browser hijacker since it utilizes pop-up messages and advertisements designed to profit at the expense of computer users. And, it also displays alerts trying to convince computer users that their Web browser or other software is out of date, so that the users may allow the setup of Potentially Unwanted Programs and unsafe content from STARTPAGE-HOME.COM. However, STARTPAGE-HOME.COM does no good to a computer system. We recommend you remove STARTPAGE-HOME.COM immediately once it is traced in your system.
VMworld 2013
John Dodge, VMware
Andre Leibovici, VMware
Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Amazon Web Services
Amazon Elastic Compute Cloud (Amazon EC2) provides a broad selection of instance types to accommodate a diverse mix of workloads. In this technical session, we provide an overview of the Amazon EC2 instance platform, key platform features, and the concept of instance generations. We dive into the design choices of the different instance families, including the General Purpose, Compute Optimized, Storage Optimized, and Memory Optimized families. We also detail best practices and share performance tips for getting the most out of your Amazon EC2 instances.
Learning Objectives: • Understand the differences between instances • Learn best practices and tips for getting the most out of EC2 instances
Amazon EC2 provides a broad selection of instance types to deliver high performance for a diverse mix of applications. In this session, we overview the drivers of system performance and discuss in depth how Amazon EC2 instances deliver system performance while also providing elasticity and complete control over your infrastructure. We also detail best practices and share performance tips for getting the most out of your Amazon EC2 instances.
Get Your GeekOn with Ron - Session One: Designing your VDI ServersUnidesk Corporation
Join virtualization expert and industry veteran Ron Oglesby as he breaks down how to select and configure servers, including:
• Server CPU selection - they were not made equal!
• Desktop-to-core guesstimation?
• Memory - and its temperamental relationship with disk design
• Local storage options - yes, it's an option
• And, overall best practices for VDI implementation
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
2. VMware ESX Architecture
VMkernel
Guest
Physical
Hardware
Monitor (BT, HW, PV)
Guest
Memory
Allocator
NIC Drivers
Virtual Switch
I/O Drivers
File System
Monitor
Scheduler
Virtual NIC Virtual SCSI
TCP/IP
File
System
CPU is controlled by
scheduler and virtualized
by monitor
Memory is allocated by the
VMkernel and virtualized by
the monitor
Network and I/O devices
are emulated and proxied
though native device
drivers
Monitor supports:
• BT (Binary Translation)
• HW (Hardware assist)
• PV (Paravirtualization)
3. Can Your Application Be Virtualized?
Red:
Exceeds capabilities
of virtual platform
Yellow:
Runs well under right
conditions
Green:
Runs perfectly
out-of-box
4. Can Your Application Be Virtualized?
No Worries!
Plan Accordingly
Don’t Virtualize!
6. CPU Bound Workloads Usually “Green”
SPECcpu results: http://www.vmware.com/pdf/asplos235_adams.pdf
Websphere results published jointly by IBM/VMware
SPEC results used for comparison only and not submitted to SPEC
7. Maximum reported
storage: 365K IOPS
• 100K on VI3
Maximum reported
network: 16 Gb/s
• Measured on VI3
I/O Utilization Above Maximums: Usually “Red”
8. 0
2
4
6
8
1 2 4 8
Scaling Ra0o
v/p CPUs
Na0ve VM
IO In Action: Oracle/TPC-C*
58000
IOPS
ESX achieves 85% of
native performance with
an industry standard
OLTP workload on an 8-
vCPU VM
1.9x increase in
throughput with each
doubling of vCPUs
9. Eight vCPU Oracle System Characteristics
Metric 8 vcpu VM
Business transactions per minute 250,000
Disk IOPS 60,000
Disk Bandwidth 258 MB/s
Network Packets/sec 27,000
Network Throughput 77 Mb/s
* Our benchmark was a fair-use implementation of the TPC-C business model;
our results are not TPC-C compliant results, and not comparable to official TPC-C results
10. Oracle/TPC-C* Experimental Details
Host was an 8 CPU system with an Xeon 5500
OLTP Benchmark: fair-use implementation of TPC-C workload
Software stack includes: RHEL5.1, Oracle 11g R1, internal build of
ESX (ESX 4.0 RC)
Were there many Tweaks in getting this result? Not really…
– ESX development build with these features
• Async I/O, pvscsi driver, virtual Interrupt coalescing, topology-aware
scheduling
• EPT: H/W MMU enabled processor
– The only ESX “tunable” applied: static vmxnet TX coalescing
• 3% improvement in performance
12. Platform: Choose Newer Hardware
If Possible Choose Latest Hardware
Older processors with longer pipelines and smaller caches can
be particularly challenging for virtualized workloads
Newer processors have hardware virtualization support
for
Privileged instructions
Virtual machine memory management
Most applications perform better with Hardware-assisted
monitors (Intel VT, AMD RVI)
Enable hardware virtualization in BIOS.
13. Intel Architecture Virtualization Performance
0
200
400
600
800
1000
1200
1400
Prescott
Cedar Mill
Merom
Penryn
Nehalem
Intel Architecture VMEXIT Latencies
Latency (cycles)
HW virtualization support improving from CPU generation to
generation
14. Memory Virtualization in Hardware
Hardware memory
management units
(MMU) improve
efficiency
AMD RVI currently available
Dramatic gains can be seen
But some workloads
see little or no value
And a small few actually
slow down
0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
SQL Server Citrix
XenApp
Apache
Compile
AMD RVI Speedup
16. General Best Practices: VM Setup
During VM creation select right guest OS type
Determines the monitor type and related optimizations
Determines default optimal devices and their settings
Do not choose ‘other’
Install 64-bit OS if large amounts of memory are needed
Choose a OS version with fewer timer interrupts
Windows, Linux 2.4 100/sec per vCPU
Some Linux 2.6 250/sec per vCPU
Some Linux 2.6 1000/sec per vCPU
Disable unused devices that use a polling scheme
USB, CDROM
Consume CPU when idle
17. Large Pages
Increases TLB memory
coverage
Removes TLB misses, improves
efficiency
Improves performance of
applications that are
sensitive to TLB miss costs
Configure OS and application
to leverage large pages
LP will not be enabled by default
0%
2%
4%
6%
8%
10%
12%
Performance Gains
Gain (%)
18. VM Configuration: HW or SW Memory Management?
Example: number crunching
financial software
SW and HW virtualizations
perform equally well
Example: Citrix, Apache
web server
HW virtualization
performs better
Example: Java applications
With large pages, HW, with
small pages, SW
Example: databases
Depends on which cost is
higher: memory virt
overhead or TLB cost?
Benchmark!
SensitivetoTLBmisscosts?
App is memory management intensive?
No Yes
No
Yes
19. Platform Optimization: Network
Use a network adapter that supports:
Checksum offload, TCP segmentation offload (TSO),
Jumbo frames (JF)
Enable JF when hardware is available (default is off!)
Capability to handle high memory DMA (64-bit DMA addresses)
Capability to handle multiple scatter/gather elements per Tx frame
Check configuration
Ensure host NICs are running with highest supported speed
and full-duplex
NIC teaming distributes networking load across multiple NICs
Better throughput and allows passive failover
Use separate NICs to avoid traffic contention
For Console OS (host management traffic), VMKernel
(vmotion, iSCSI, NFS traffic), and VMs
20. Jumbo Frames
Before transmitting, IP layer fragments data into MTU
(Maximum Transmission Unit) sized packets
Ethernet MTU is 1500 bytes
Receive side reassembles the data
Jumbo Frames
Ethernet frame with bigger MTU
Typical MTU is 9000 bytes
Reduces number of packets transmitted
Reduces the CPU utilization on transmit and receive side
21. Jumbo Frames
Linux
ifconfig eth0 mtu 9000
Windows
Device Manager ->
Network adapters ->
VMware PCI Ethernet
Adapter -> Properties ->
Advanced -> MTU to 9000
Switches/
Routers
NIC Driver
Client
TCP/IP Stack
Guest (VM)
vNIC
Virtual Switch
TCP/IP Stack
ESX
25. VMware vSphere enables you to use all those cores…
1
10
100
1000
1990 1995 2000 2005 2010 2015
NumberofCoresused
Year
Oracle
SQL Server
Exchange
Avg. Cust,
Application
Avg. Four
Socket
Web Servers
Most applications
don’t scale beyond
4/8 way
Virtualization provides a
means to exploit
the hardware’s
increasing parallelism
VMware ESX Scaling:
Keeping up with core
counts
26. Virtualization-aware Architecture: Building Blocks
Many applications lack
scalability beyond
certain CPUs
Apache web server,
WebSphere, Exchange
Configure vCPUs to
application scalability
limits
For additional capacity
instantiate more of such
VMs
SPECweb2005 Native and Virtual Scaling
http://www.vmware.com/files/pdf/consolidating_webapps_vi3_wp.pdf
27. Scheduler Opportunities
vCPUs from one VM
stay on one socket*
With two quad-core
sockets, there are only
two positions for a 4-
way VM
1- and 2-way VMs can
be arranged many ways
on quad core socket
Newer ESX schedulers
more efficiency use
fewer options
Relaxed co-scheduling
Socket 0 Socket 1 VM Size Options
2
12
8
(*) The cell limit has been removed in vSphere
28. The Performance Cost of SMP
From: http://blogs.vmware.com/performance/2009/06/measuring-the-cost-of-smp-with-mixed-workloads.html
30. “Bonus” Memory During Consolidation: Sharing!
Content-based
Hint (hash of page
content) generated for 4K
pages
Hint is used for a match
If matched, perform bit by
bit comparison
COW (Copy-on-Write)
Shared pages are marked
read-only
Write to the page breaks
sharing
VM 1 VM 2 VM 3
Hyper
visor
VM 1 VM 2 VM 3
Hyper
visor
31. Memory footprint of four idle VMs quickly decreased to
300MB due to aggressive page sharing.
Page Sharing in XP
32. Page Sharing in Vista
Memory footprint of four idle VMs quickly decreased to
800MB. (Vista has larger memory footprint.)
33. Expand
Shrink
May page content
out to virtual disk
May bring content
from virtual disk
Borrow
Pages
Lend
Pages
ESX Server Memory Ballooning
Guest OS has better
information than VMkernel
Which pages are stale
Which pages are unused
Guest Driver installed with
VMware Tools
Artificially induces memory
pressure
VMkernel decides how much
memory to reclaim, but guest
OS gets to choose particular
pages
VM with
VMware Tools
Installed
VM with
VMware Tools
Installed
34. Ballooning Pins Pages
Memory has been reduced and pinned to induce guest to
page, if needed
If memory is short, ESX must choose which pages to swap to
disk
App
VM
Hyper
visor
OS
Balloon App
VM
Hyper
visor
OS
Balloon
Inflating
38. Managing Memory in Java Environments
Calculate OS
memory
Estimate JVM
needs
Specify heap
exactly
Reservations =
OS + JVM + heap
39. Monitor guest paging using traditional tools
Consider putting guest swap file on its own VMDK
Put all guest swap VMDKs on the same LUN
vSphere client can then monitor guest paging by watching that LUN’s traffic
Use vSphere Client to track host memory usage
There is no way to predict this before hand
Run workloads and analyze performance
Statistic VirtualCenter esxtop
Active Memory (recently
used by guest OS)
Active Memory %ACTV, %ACTVS,
%ACTVS
Swap rate (VC on VI3
reports swap magnitude)
VI3: Swap In/Out
vSphere: Swap In/Out Rate
SWW/s, SWR/s
Getting Memory Sizing Just Right
41. Platform Optimization: Storage
Over 90% of storage related
performance problems stem
from misconfigured storage
hardware
Consult SAN Configuration Guides
Ensure disks are correctly
distributed
Ensure caching is enabled
Consider tuning layout of LUNs
across RAID sets
Spread I/O requests across
available paths
FC Switch
VMware ESX
HBA1 HBA2 HBA3 HBA4
Storage array
SP2SP1
1 2 3 4
42. Platform Optimization: File System
Always use VMFS
Negligible performance cost and
superior functionality
Align VMFS on 64K boundaries
Automatic with vCenter
www.vmware.com/pdf/
esx3_partition_align.pdf
VMFS is a distributed file system
Be aware of the overhead of
excessive metadata updates
If possible schedule maintenance
for off-peak hours
0
1000
2000
3000
4000
5000
6000
7000
8000
4K IO 16K IO 64K IO
VMFS
RDM
(virtual)
RDM
(physical)
IOspersecond
VMFS Scalability
43. Server Consolidation: Storage Planning
Win2k3
SQL
Win2k3
SQL
Win2k3
SQL
ESX Server ESX Server
VI3
VMDK VMDK VMDK5 Disks 5 Disks 5 Disks
Physical setup: each instance
provided 5-spindle LUN
Virtual architecture: Each VM
provided its own VMDK
• But now do they map to disks?
44. Server Consolidation: Storage Planning
Nine spindles for VMFS volume
This is clearly less than the 15
disks in the physical deployment
15 spindles for virtual deployment
matches physical
But this configuration inferior to
multiple LUNs and access pattern
changes (see following)
ESX Server ESX Server
VI3
VMDK VMDK VMDK
9 Dsk
ESX Server ESX Server
VI3
VMDK VMDK VMDK
15 Dsk
46. Storage Analysis and vscsiStats
vCenter reports latencies
for FC and iSCSI only
• Device latency for
hardware
• Kernel latency for queuing
VI3 and vSphere have
instrumented the virtual
SCSI bus for stats on all
VMs
• vscsiStats
VMkernel
Guest
Physical
Hardware
Monitor
Guest
I/O Drivers
File System
Monitor
Virtual SCSI
File
System
Device
Latency
Kernel
Latency
47. Workload Characterization Using vscsiStats
vscsiStats characterizes IO for
each virtual disk
Allows us to separate out each
different type of workload into its
own container and observe trends
Histograms only collected if
enabled; no overhead otherwise
Technique:
For each virtual machine I/O request
in ESX, we insert some values
into histograms
E.g., size of I/O request → 4KB
Data
collected
per-
virtual
disk
48. vscsiStats Reports Results Using Histograms
Read/Write Distributions are
available for our histograms
Overall Read/Write ratio?
Are Writes smaller or larger than
Reads in this workload?
Are Reads more sequential than
Writes?
Which type of I/O is incurring
more latency?
In reality, the problem is not
knowing which question to ask
Collect data, see what you find
I/O Size
All, Reads, Writes
Seek Distance
All, Reads, Writes
Seek Distance Shortest
Among Last 16
Outstanding IOs
All, Reads, Writes
I/O Interarrival Times
All, Reads, Writes
Latency
All, Reads, Write
50. >95% of Applications Match or Exceed Native
Performance on VMware Infrastructure
ESX Version
ESX 2 ESX 3
AppsSupported
100%
ESX 3.5 ESX 4.0
Overhead
VM CPU
VM Memory
IO
• 30% - 60%
• 1 vCPU
• 3.6 GB
• 20% - 30%
• 2 vCPU
• 800 MBits
• 4 vCPU
• 64 GB
• 100,000 IOPS
• 9 GBits
• <2% - 10%
• 8 vCPU
• 255 GB
• >350,000 IOPS
• 40 GBits
• <10,000 IOPS
• 380 MBits
• 16 GB
• <10% - 20%
Source: VMware Capacity Planner analysis of > 700,000 servers in customer production environments
51. OS
APP
OS
APP
Storage
Networking
Virtual Machines
CPU
Memory
64 cores and 1 TB
physical RAM
Hardware Scale Up
Lowest CPU overhead
Hardware Assist
Purpose Built Scheduler
Maximum memory efficiency
Hardware Assist
Page Sharing
Ballooning
Wirespeed network access
VMXNET3
VMDirectPath I/O
Greater than 350k iops per second
Lower than 2 millisecond latency
Storage stack optimization
VMDirectPath I/O
Virtual hardware scale out
8-way vSMP and 255 GB of
RAM per VM
VM Scale Up
vCompute vStorage vNetwork
Current NEW
ESX
OS
APP
OS
APP
OS
APP
“Speeds and Feeds” Optimization for the Highest Consolidation Ratios
52. Exchange 2007 on vSphere: SMP Efficiency
0
50
100
150
200
250
300
350
1,000 2,000 4,000 6,000 8,000
95thPercentileSendMailLatency(ms)
# of Heavy Users
(1,000 users/2-way VM & 2,000 users/4-
way VM)
2 vCPU
VM
4 vCPU
VM
0
10
20
30
40
50
60
70
80
90
100
1,000 2,000 4,000 6,000 8,000
%ofCPUutilization
# of Heavy Users
(1,000 users/2-way VM & 2,000 users/4-
way VM)
2 vCPU
VM
4 vCPU
VM
57. Summary
Newer hardware improves virtualization performance
Traditional application, storage, networking best
practices must be followed
Consolidation provides new challenges and
opportunities that must be planned for
58. Performance Resources
The performance community
http://communities.vmware.com/community/vmtn/general/performance
Performance web page for white papers
http://www.vmware.com/overview/performance
VROOM!—VMware performance blog
http://blogs.vmware.com/performance
63. Virtual Machine Sizing—NUMA
Memory accesses from CPU 0
To Memory 0 is local
To Memory 1 is remote
Remote access latency >> local
access latency
# of vCPUs ≤ # of CPUs / node
ESX enables NUMA scheduling
If VM MemSize < Node Memory
size
No remote access penalty
Node 1
CPU 2
CPU 3
Memory 1
Node 0
CPU 0
CPU 1
Memory 0
Remote Memory
64. Host Configuration: Storage Queues
ESX queues can be
modified to increase
throughput
This can benefit benchmarks
to a single LUN
Rarely required in production
systems
Oversized ESX queues on
multiple servers can
overload array
Kernel latency is a sign
that ESX queues should
be increased
Guest
Queues
ESX
Queues
Array
Queues
65. Choose the Right Virtualization Software
Hosted products aren’t
designed for meet the
most extreme needs
ESX demonstrates better
host and VM scaling
VMware Server
Windows
Server 2003
SQL Apache
Host Operating System
Java
Linux
66. VMware ESX Compared to VMware Server
Single tile score higher than reference system
0
1
2
3
4
5
6
1 2 3 4
Score
Tiles
VMmark Score
ESX Server
67. Address Translation
Virtual addresses (VA) mapped to
machine addresses (MA) via page
tables
Page table walks are expensive
Translation look-aside buffer
(TLB) stores recent mappings and
avoids page walks
Improvements:
Larger pages means more TLB
hits
Hardware assistance to virtual
mapping means more efficient
page table and TLB maintenance
MachineMemory
TLB
VA MA
Page Tables
Page
Page
Page
Page
VA
MA
69. Hardware Configuration In Action: SAP
100 Mb/s
Ethernet
TX: 2.4Mb/s
RX: 0.3Mb/s
~100 IO/s
EMC CX3-40 SAN
Windows Server 2008
VMWare ESXi
2 X quad-core AMD “Barcelona” B3
with RVI, 32GB memory
SAP Application Server
Unicode PL146
Flat mode + mprotect (false)
MS SQLServer 2005
SAP Benchmark
Driver
70. SAP SD Performance on ESX
ESX achieves 95% of
native performance on
a 4vCPU VM
85% of native
performance on an 8
vCPU VM on 8 pCPU
host
Linear scaling from
1 vCPU -> 4 vCPU
71. SAP SD 2-Tier performance on ESX
SAP SD performance sensitive to software configuration and ESX
monitor type:
SAP configuration
Mode
Deployment
Recommended
Monitor Type
Guest tunable Effect
View Model Production RVI (Default) Large pages
H/W assist
reduces MMU
overheads
Flat model + mprotect =
true
Production RVI (Default) Large pages
H/W assist
reduces MMU
overheads
*Flat model + mprotect
= false
Mostly
benchmark
SVM (UI Option)
Larges pages up
to 12% benefit
S/W MMU
benefits up to 5%
* Configuration used in our experiments
In most cases, default H/W MMU provides best results
– Experiment with your individual workloads