Five things you need to ask your VM Admin (and you may not like the answers!)
A few key concepts detailed are:
1) Shifting how we understand cost in our virtual infrastructure
2) The predominant role that storage plays in VM reliability
3) Real world issues with multi-hypervisor environments
4) Tackling administration issues in a holistic way
1. 5 Things you need to ask your Virtualization Administrator
John Maxwell
VP Product Management, Dell Software
2. Overview
• 5 Questions and Background
• The Solution that can give you answers and solve the problems
• Primer on virtual optimization terminology
Performance Monitoring
3. Question #1
• What is our VM Density?
– # VMs / # Servers
• Why is this important?
– VM Density is a measure of how effectively you use virtualization
• Fact
– Customers of Dell Foglight for Virtualization have been able to increase VM density by 1-2 VMs per host by simply optimizing their environment
– In one case this equated to over 200 additional VMs on the same hardware
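The density formula on this slide (# VMs / # Servers) can be sketched in a few lines. The host and VM counts below are hypothetical and are not taken from the customer case mentioned above; they only show how a small per-host gain adds up across a cluster.

```python
def vm_density(vm_count: int, host_count: int) -> float:
    """Return VM density: number of VMs divided by number of servers."""
    if host_count <= 0:
        raise ValueError("host_count must be positive")
    return vm_count / host_count


def additional_vms(host_count: int, density_gain_per_host: float) -> float:
    """Extra VMs gained when every host takes on `density_gain_per_host` more VMs."""
    return host_count * density_gain_per_host


# Hypothetical environment: 1500 VMs on 100 hosts.
print(vm_density(1500, 100))
# Hypothetical gain of 2 VMs per host across those 100 hosts.
print(additional_vms(100, 2))
```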
4. Question #2
• How much wasted disk space do we have?
– Over-allocated virtual files?
– Abandoned templates?
– Powered off VMs?
• Why is this important?
– Over-allocation of files is rampant, and in some cases VMs are “lost”: not visible in vCenter, yet they still exist and consume space
– Even if you thin-provision disks within a storage array, there is waste
• Fact
– Dell Foglight for Virtualization has found thousands of terabytes of wasted disk space
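As a rough illustration of the three waste categories on this slide, the sketch below totals reclaimable space from a hand-built inventory. The `Disk` record, its field names, and the rule that the whole footprint of a powered-off VM or template counts as a reclaim candidate are assumptions made for the example; this is not how Foglight computes waste.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    provisioned_gb: float    # space set aside on the datastore
    used_in_guest_gb: float  # space the guest actually uses
    category: str            # "active", "powered_off", or "template"


def wasted_gb(disks):
    """Return candidate reclaimable GB per waste category."""
    totals = {"over_allocated": 0.0, "powered_off": 0.0, "template": 0.0}
    for d in disks:
        if d.category == "active":
            # Over-allocation: provisioned space the guest never uses.
            totals["over_allocated"] += max(d.provisioned_gb - d.used_in_guest_gb, 0.0)
        elif d.category in ("powered_off", "template"):
            # Assumption: the full footprint is a candidate for reclaim.
            totals[d.category] += d.provisioned_gb
        else:
            raise ValueError(f"unknown category: {d.category}")
    return totals


# Hypothetical inventory.
inventory = [
    Disk("web01", 100.0, 40.0, "active"),
    Disk("old-test", 50.0, 30.0, "powered_off"),
    Disk("gold-image-2011", 20.0, 20.0, "template"),
]
print(wasted_gb(inventory))
```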
5. Question #3
• How many Zombie VMs do we have?
– VM is running, but is anyone using it?
– In 24 hours?
– In 1 week, month, quarter, year?
• Why is this important?
– VM Sprawl has created millions of VMs that are powered on but never used
• Fact
– Dell Foglight for Virtualization has found thousands of Zombie VMs and removed them
6. Question #4
• How do we maintain SLAs and pinpoint virtual performance problems? How long does it take: hours, minutes, seconds?
– Do we proactively know if a host or VM is “beginning” to have problems?
– Proactive alerts?
– Single pane of glass?
– Email/text alerts?
– Automation?
• Why is this important?
– The optimum Admin to VM ratio is 1:150. Without intelligent analytics and automation, it is impossible to meet SLAs while sustaining a ratio that high
• Fact
– Dell Foglight for Virtualization is the de facto standard for mid-to-large scale virtual infrastructure, with almost 8,000 installations.
7. Question #5
• How do we know when a host is running out of resources before it happens?
– How many additional VMs can we add to our environment?
– What is the gating factor to growth? CPU? Memory? Storage?
• Why is this important?
– Would you rather know weeks or hours before you hit a CPU/Memory/Storage limitation?
– How do you plan for future server and storage acquisitions?
• Fact
– Dell Foglight for Virtualization has predictive analytics to tell you when resources are going to run out
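One simple way to estimate when a resource will run out is to fit a straight line through past usage samples and extrapolate to the capacity limit. The sketch below uses ordinary least squares for that. It is a generic technique with hypothetical sample data, not a description of the predictive analytics in Foglight.

```python
def days_until_full(samples, capacity):
    """Estimate days from the last sample until usage reaches `capacity`.

    `samples` is a list of (day, used) pairs. Returns None when the fitted
    trend is flat or shrinking, so no exhaustion date can be projected.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one day")
    # Least-squares slope and intercept of used = slope * day + intercept.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_full = (capacity - intercept) / slope
    last_day = samples[-1][0]
    return max(day_full - last_day, 0.0)


# Hypothetical datastore usage in GB, sampled once a day.
history = [(0, 100.0), (1, 110.0), (2, 120.0), (3, 130.0)]
print(days_until_full(history, capacity=200.0))
```

The same function works for CPU or memory if the samples and the capacity share a unit.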
9. Real-time Visualization
• The one solution that does it all
– Real-Time and Historical Analysis
– Single pane of glass for enterprise-wide virtual monitoring
– Go from real-time to any-point-in-time performance and resource analysis
[Screenshots: Enterprise View, Real-Time View, Historical View]
10. Proactive, Actionable Insights
• The one solution that does it all
– Expert Advice – Find Performance Problems in Seconds
– Pinpoint the problem for immediate resolution
– Proactively identify potential future problems
[Screenshots: Exception Alarms + Expert Advice, Proactive Insights]
11. Analyze and Forecast Capacity Trends
• The one solution that does it all
– Capacity Management
– View current growth trends and resource consumption
– Forecast future resource requirements
– Scenario modeling and what-if scenarios
[Screenshots: Capacity Trending, Growth Scenarios, Forecast Future Capacity]
12. Optimize and Reduce Data Center TCO
• The one solution that does it all
– Optimization: Improve VM Density and Control OPEX
– Right-size CPU and Memory; Reclaim wasted resources
[Screenshots: Optimize CPU and Memory, Optimize Storage Resources, Reclaim Wasted Resources]
13. Automation you can Trust
• The one solution that does it all
– Flexible Automation
– Automate remediation of alarms and common tasks
– Proactively optimize virtual resources
– Powerful AND easy to use automation
[Screenshots: Automation, Custom Workflows, Proactive Optimization]
15. Terminology
• Snapshots
–Definition: “A delta file that is created to be able to roll back to a point in time by intercepting all changes, thus allowing a user or product (e.g. backup software) to back up the static image of the VM.”
–If a snapshot is left open and not deleted, it will grow continuously, causing degraded performance and wasted space and risking an outage.
–Orphaned snapshots are snapshots that vCenter has lost control of. This can happen when a snapshot is deleted but fails to merge with the base disk.
16. Terminology
• Abandoned VMs
–Definition: “When a user selects the Remove from Inventory option in vCenter, the VM is removed only from vCenter, but all data files are kept on disk.”
–Accidental: the user selected the wrong option when deleting the VM
–On purpose: the user leaves it there to make sure it isn’t needed and then forgets it
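Given the definition above, abandoned VMs are the ones whose files still sit on a datastore but which no longer appear in the vCenter inventory, which amounts to a set difference. The sketch below assumes you have already collected two lists of `.vmx` paths by some other means; the function name and the sample paths are hypothetical.

```python
def find_abandoned(datastore_vmx_paths, inventory_vmx_paths):
    """Return .vmx paths found on disk that are not registered in the inventory."""
    return sorted(set(datastore_vmx_paths) - set(inventory_vmx_paths))


# Hypothetical listings.
on_disk = [
    "[ds1] web01/web01.vmx",
    "[ds1] old-app/old-app.vmx",
    "[ds2] db01/db01.vmx",
]
registered = [
    "[ds1] web01/web01.vmx",
    "[ds2] db01/db01.vmx",
]
print(find_abandoned(on_disk, registered))
```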
17. Terminology
• Zombie VMs
–Definition: “A running VM that is using very low CPU, Memory and Storage resources.”
–Probably a decommissioned VM that can be deleted
–Typically happens when the owner doesn’t notify VM Admins about the decommission
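The definition above translates into a simple filter: a VM is a zombie candidate if it is powered on and its average CPU, memory, and storage activity over the observation window all sit below some threshold. In the sketch below, the `VmUsage` record and the threshold values are hypothetical choices for illustration; suitable cut-offs depend on the environment.

```python
from dataclasses import dataclass


@dataclass
class VmUsage:
    name: str
    powered_on: bool
    avg_cpu_pct: float    # average CPU utilization over the window
    avg_mem_pct: float    # average active memory over the window
    avg_disk_iops: float  # average storage I/O over the window


def find_zombies(usages, cpu_pct=2.0, mem_pct=5.0, disk_iops=5.0):
    """Return names of running VMs whose usage is below every threshold."""
    return [
        u.name
        for u in usages
        if u.powered_on
        and u.avg_cpu_pct < cpu_pct
        and u.avg_mem_pct < mem_pct
        and u.avg_disk_iops < disk_iops
    ]


# Hypothetical averages over the last quarter.
fleet = [
    VmUsage("web01", True, 35.0, 60.0, 120.0),
    VmUsage("legacy-report", True, 0.5, 2.0, 1.0),
    VmUsage("archived", False, 0.0, 0.0, 0.0),
]
print(find_zombies(fleet))
```

A candidate found this way still needs confirmation from its owner before deletion.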
18. Terminology
• Powered Off VMs
–Definition: “A VM that has been powered off for more than X months.”
–Probably safe to delete the VM
–Remember that some systems might only be powered on at certain periods to do a special task. This is not common, but such systems exist.
19. Terminology
• Unused Templates
–Definition: “Templates that nobody has deployed new VMs from within the last X months.”
–Probably safe to delete the template
20. Terminology
• CPU over-allocation
–Based on configuration
–Over-allocation leads to increased overhead
–CPU scheduler in vSphere has to work harder to find available resources
–Can affect your VM density / datacenter ROI
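Because CPU over-allocation is based on configuration, one common way to quantify it is the ratio of configured vCPUs to physical cores on a host. The sketch below computes that ratio. The host data is hypothetical, and what counts as "too high" is left to the reader since the slide does not give a threshold.

```python
def vcpu_to_core_ratio(vm_vcpus, physical_cores):
    """Return configured vCPUs per physical core on one host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vm_vcpus) / physical_cores


# Hypothetical host: 8 physical cores running five VMs.
configured_vcpus = [4, 4, 8, 2, 2]
print(vcpu_to_core_ratio(configured_vcpus, 8))
```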
21. Terminology
• Memory over-allocation
–Based on configuration
–Over-allocation leads to increased overhead
–Wasted storage, because the VMware swapfile is the same size as the allocated memory
–Can affect your VM density / datacenter ROI
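The storage cost of memory over-allocation can be estimated from the swapfile size. The sketch below assumes the swapfile equals configured memory minus any memory reservation, so with no reservation it matches the allocated memory as stated on the slide. The function names and VM sizes are hypothetical.

```python
def swapfile_gb(configured_gb, reserved_gb=0.0):
    """Swapfile size, assumed to be configured memory minus the reservation."""
    if reserved_gb < 0 or reserved_gb > configured_gb:
        raise ValueError("reservation must be between 0 and configured memory")
    return configured_gb - reserved_gb


def swap_savings_gb(configured_gb, right_sized_gb):
    """Datastore space freed by right-sizing a VM that has no memory reservation."""
    return swapfile_gb(configured_gb) - swapfile_gb(right_sized_gb)


# Hypothetical VM: 16 GB configured, right-sized to 8 GB.
print(swapfile_gb(16.0))
print(swap_savings_gb(16.0, 8.0))
```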
22. Thank you for your participation
Join the conversation…
More conversations online
John Maxwell
@VMMaxwell
Visit us on the Web:
www.software.dell.com
Learn More on Enterprise Edition
http://software.dell.com/products/foglight-for-virtualization-enterpriseedition/
Foglight on Facebook
facebook.com/Foglight
Foglight for Virtualization on Twitter
@DellVirt
The Foglight for Virtualization Community
http://communities.quest.com/community/vfoglight
Efficiently plan, budget, design, and make infrastructure decisions with certainty, and make the best use of expensive systems to control CAPEX. Quickly scale the capacity of the data center up or down as business requirements change, at the lowest possible cost. Proactively avoid performance problems caused by virtualization load balancing and resource utilization.
A single pane of glass provides enterprise-wide visualization of the virtual environment, from real-time insight to any-point-in-time performance and resource analysis. Our enterprise view provides insight into overall performance and health at a glance. You can view all data centers, wherever they are located, with a single instance, because Foglight for Virtualization, Enterprise Edition scales to connect to multiple virtual centers anywhere in the world. Easily manage multiple virtual data centers and multiple hypervisors, and provide multi-tenant, customer-scoped views. Get a real-time, granular view into your virtual environment, with ESX performance down to the CPU. Role-based views help administrators do their job better. Our historical view allows administrators to go back to any point in time to troubleshoot spikes or issues that occurred outside working hours.
Proactively manage service level agreements through real-time performance dashboards that let you find performance bottlenecks and potential issues across your heterogeneous environment in seconds. Quickly pinpoint the problem and resolve it at the click of a button. Prevent future issues with alarms, expert advice, and alerts that help you address issues before they impact applications and end users.
Intelligent capacity management and planning allows I/O admins to avoid resource contention, eliminate bottlenecks, optimize existing capacity, and control future CAPEX requirements. Easily determine the number of additional VMs that can be safely added to hosts, clusters, and resource pools while accounting for existing workload demands, resource availability and performance, high availability settings in clusters, and VM reservations. Schedule hardware changes and deploy VMs with greater peace of mind by identifying available slots and forecasting exhaustion of resources in VMs, reserving capacity ahead of time, and coordinating scheduling for all deployments. Estimate and plan for future server purchases, with a clear understanding of the capital expenditures you’ll need, by modeling and predicting hardware requirements.
Maximize resource utilization with advanced capacity management and CAPEX scenario modeling. Effectively manage the growth of your virtual infrastructure by taking the guesswork out of capacity management, planning, VM placement and workload allocation. Intelligent analysis and automation capabilities help you predict future hardware resource requirements, reserve capacity for planned VM deployments, and auto-deploy VMs into reserved slots. Dynamic capacity planning helps you determine how to accommodate future workloads and allows “what-if” scenario modeling to accurately forecast CAPEX requirements.
When IT teams rationalize their utilization into capacity, they can see:
• Current bottleneck (processor, memory, network, disk) to determine what to purchase next
• Current available capacity by resource (to determine next bottleneck)
• Expected number of VMs that can be hosted per cluster/host
• Number of VMs that are currently hosted per cluster/host
• Remaining capacity and emergency capacity above your normal maximum utilization
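The idea of a current bottleneck and an expected number of additional VMs can be sketched as a per-resource division: the resource that fits the fewest extra VMs is the gating factor. The resource names, free amounts, and average per-VM footprint below are hypothetical, and this simplified version ignores the high availability settings and VM reservations mentioned in the notes above.

```python
def additional_vm_headroom(free, per_vm):
    """Return (extra VM count, bottleneck resource).

    `free` maps resource name -> free capacity; `per_vm` maps the same
    names -> average demand of one VM, in the same units.
    """
    if not per_vm:
        raise ValueError("per_vm must not be empty")
    counts = {}
    for resource, demand in per_vm.items():
        if demand <= 0:
            raise ValueError(f"per-VM demand for {resource} must be positive")
        counts[resource] = int(free[resource] // demand)
    # The resource that fits the fewest VMs limits growth.
    bottleneck = min(counts, key=counts.get)
    return counts[bottleneck], bottleneck


# Hypothetical cluster headroom and average VM footprint.
free_capacity = {"cpu_ghz": 40.0, "memory_gb": 256.0, "storage_gb": 2000.0}
average_vm = {"cpu_ghz": 1.0, "memory_gb": 8.0, "storage_gb": 100.0}
print(additional_vm_headroom(free_capacity, average_vm))
```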
Flexible Automation allows you to reduce costs, speed deployments, and simplify complexity. User-controlled automation puts you in control and improves your productivity and efficiency. Automate remediation of alarms and common tasks. Proactively optimize virtual resources. Powerful AND easy-to-use automation. AND IS better!
Beyond one-dimensional performance monitoring – get the insight to know what is happening in your virtualized data center, as well as the advice and automation necessary to take action now.