vCenter Operations Enterprise Standalone Technical Presentation

© 2010 VMware Inc. All rights reserved
vCenter Operations Enterprise - Standalone
Real-time Performance Management for the Entire Enterprise
Technical Presentation

12
Management Challenges
 Performance problems often occur with no real warning
– Many times end users are the first to notice problems
– Root cause determination is difficult and time-consuming
– Solving problems requires all-hands-on-deck bridge calls
 Real-time understanding of performance is lacking
– No reliable understanding of the health of IT infrastructure makes IT too reactive
– Siloed monitoring tools do not allow a common “truth”
– No correlation across IT silos
 Optimizing IT infrastructure is difficult if not impossible
– Understanding the abnormal metric behaviors that lead to degradation of Key
Performance Indicators is not possible with current tools
– Understanding the abnormal behaviors that define your worst performing devices is
not possible with current tools
– Heavy reliance on “Tribal Knowledge” of a few application experts

13
What If You Could…
 Automate
• Eliminate time-consuming problem resolution
processes
 Correlate and Accelerate
• “One Click” to root cause of emerging
performance problems to reduce MTTI/MTTR
 Get Proactive
• Avert end user and business impact of
building performance problems
 Collaborate
• Aggregate and correlate data from monitoring
landscape to create a single “truth”
 Optimize
• Tune components to deliver optimal
performance for application transactions

15
1st Generation - Event-Centric, Hard-Threshold Based
3/4/08 16:45 Host 1 processingTimeServ The Processing Time Service Level on process… n/a n/a n/a
3/4/08 16:45 Host 1 Processor_Table 0 Processor 0 is at 87.0%. A CPU Bottleneck is….. n/a 0 Windows_System
3/4/08 16:44 Host 2 System_Table The number of hardware interrupts per second… n/a 0 Windows_System
3/4/08 16:30 Host 2 Processor_Table 1 Processor 1 is at 84.0%. A CPU Bottleneck is …. n/a 0 Windows_System
3/4/08 16:25 n/a responseTimeServ… The Response Time Service Level on Toadwor.. n/a n/a n/a
3/4/08 16:20 n/a processingTimeServ.. The Processing Time Service Level on Prospec.. n/a n/a n/a
3/4/08 16:08 Host 1 Ora_Sql_Hogs_Alert Oracle: SFPRD A CPU Hog has been detected n/a OraSF Oracle
3/4/08 16:08 Host 1 Ora_Sql_Hogs_Alert Oracle: SFPRD SQL with high I/O has been de.. n/a OraSF Oracle
3/4/08 14:40 n/a responseTimeServ… The Response Time Service Level on Siebel Sa.. n/a n/a n/a
3/4/08 14:20 n/a processingTimeServ.. The Processing Time Service Level on Siebel S. n/a n/a n/a
3/4/08 14:39 Host 3 Top_CPU_Table Process ‘siebsh.exe(svc-siebel, 6780)’: is cons.. n/a 0 Windows_System
How “1st Generation” Tools Attempt to Solve These Problems
DATA FEEDS

16
2nd Generation - Rudimentary Baselining, Rules/Templates, Charting
3/4/08 14:40 n/a responseTimeServ… The Response Time Service Level on Siebel Sa.. n/a n/a n/a
3/4/08 14:20 n/a processingTimeServ.. The Processing Time Service Level on Siebel S. n/a n/a n/a
How “2nd Generation” Tools Attempt to Solve These Problems
DATA FEEDS

17
VMware’s Approach to Real-Time Performance Management
Flexible
INTEGRATION
to many data sources
Enterprise
SCALABILITY
Patented performance
ANALYTICS
I can put all my
monitoring tools to good
use and get better
performance analytics.
Powerful information
DASHBOARDS
3rd Generation – Holistic, Real Time Analytics

18
vCenter Operations 3rd Generation Approach – An Analogy
My brain is understanding the health of my body.
Should I do anything?
Your Brain Understands Context:
 If my heart rate and temperature are increasing I
should go to the hospital
 If I’m tired, rest more
 If I tire easily, start exercising!
Heart Rate, Respiration, Temperature
Musculoskeletal, Cardiovascular
Monitoring User Experience Metrics
Monitoring Business Metrics
Monitoring App Layer Metric – JVM, DB Connections, etc.
Monitoring Server O/S Metrics – CPU, RAM, Disk, I/O, etc.
vCenter Operations is understanding the health of
my enterprise by analyzing millions of
measurements. Should I do anything?
vCenter Operations Understands Context:
 Act based on urgency of emerging problems
 Act based on real-time performance dashboards
 Act based on long term correlations and trends
vCenter
Operations
Nervous

19
Data Agnostic Approach to Data Collection
 Accepts any time series data (examples)
• Server OS
• Server App layer (e.g., IIS, Oracle, WebSphere, etc.)
• Network
• Storage
• User Experience
• Transactional
• Business Data
• Change Events
 Minimal Required Fields (4)
• Object Name, Metric Name, Value, Timestamp
 Data Extraction - *not* an analytic question
• No rules or templates to write and maintain
• vCenter Operations Analytics do all of the “Work”
vCenter
Operations
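
The four minimal required fields can be pictured as a small record type. The sketch below is only an illustration of that idea: the class name `MetricSample`, the function `parse_sample`, the dictionary keys, and the choice of epoch seconds for the timestamp are all assumptions, not part of the product's API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical key names for the four required fields listed on the slide.
REQUIRED_FIELDS = ("object_name", "metric_name", "value", "timestamp")


@dataclass(frozen=True)
class MetricSample:
    """One time-series data point: object, metric, value, timestamp."""
    object_name: str
    metric_name: str
    value: float
    timestamp: datetime


def parse_sample(record: dict) -> MetricSample:
    """Build a sample from a raw record, rejecting records with missing fields."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return MetricSample(
        object_name=str(record["object_name"]),
        metric_name=str(record["metric_name"]),
        value=float(record["value"]),
        # Assumption: the source system reports epoch seconds in UTC.
        timestamp=datetime.fromtimestamp(float(record["timestamp"]), tz=timezone.utc),
    )
```

Any feed that can be mapped onto these four fields (server, network, business data, and so on) fits the same shape, which is the point of the data-agnostic approach.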

20
Learn Normal Behavior and Identify Abnormalities
 Doesn’t assume IT data has a normal bell-shaped distribution
 Sophisticated Analytics – 8 different algorithms
 Learns your dynamic ranges of “Normal” without templates
 Learns patterns of behavior and identifies Abnormalities
BLUE LINE
Metric’s
Measured Value
GRAY BAR
Learned Upper and
Lower band of Dynamic
Threshold - “Normal”
RED Zone
Breached Dynamic
Threshold – “Abnormal”
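
As a rough illustration of a learned "normal" band that does not assume a bell-shaped distribution, the sketch below derives an upper and lower bound per hour of day from percentiles of past values. This is not one of the product's 8 algorithms, which the slides do not describe; the function names and the 5th/95th percentile choice are assumptions.

```python
from collections import defaultdict
from statistics import quantiles


def learn_bands(history, lower_pct=5, upper_pct=95):
    """Learn a (low, high) band for each hour of day.

    history: iterable of (hour_of_day, value) pairs, with at least two
    values per hour. Percentiles are used instead of mean and standard
    deviation, so no normal distribution is assumed.
    """
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)

    bands = {}
    for hour, values in by_hour.items():
        # n=100 yields 99 cut points; cuts[p - 1] is the p-th percentile.
        cuts = quantiles(values, n=100, method="inclusive")
        bands[hour] = (cuts[lower_pct - 1], cuts[upper_pct - 1])
    return bands


def is_abnormal(bands, hour, value):
    """True when the value falls outside the learned band (the red zone)."""
    low, high = bands[hour]
    return value < low or value > high
```

In the chart's terms, the measured value is the blue line, the returned band is the gray bar, and `is_abnormal` marks the red zone.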

21
Proactive Alerting – Smart Alerts
User Experience (e.g., RUM, etc.)
Database Silo (e.g., Quest, etc.)
App Data (e.g., Wily, etc.)
Network Data (e.g., Ionix IPPM, etc.)
Smart Alert Generation (“When”)
Business Data (e.g., Finance)
! SMART ALERT
Business Application

22
Drill down to the Root Cause
Smart Alert Summary (“What”)

23
Early Warning
SMART ALERT
Noise Line Crossed

24
Impact to
application
health
Impact to health of
each technology tier
No major impact to
application Key Performance
Indicators (KPIs)… yet.

25
Root cause technology
tier is the DB
Metric-level
root cause
symptoms -
START HERE

26
See the effect of change
and other external events
on application health
with this “mash-up” view

27
 One Source of Truth Across the
Enterprise
 Health - Objective measure of
performance based on
underlying level of abnormal
behavior
 Analytics provide a Health
score for any resource or
grouping
• A single Server, Device, Resource
• Entire Tier or Silo
• Entire Application or Service
• Entire Datacenter
• Any Arbitrary Group of Resources
Dynamic Performance Dashboards – Health Scores
“How is our world doing?”
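
One way to picture a Health score "based on underlying level of abnormal behavior" is as the share of metrics currently behaving normally. The slides do not give the actual formula, so the sketch below is an assumption: the 0-100 scale, the function names, and the pooling of metrics for a group are all illustrative.

```python
def health_score(abnormal_flags):
    """Score from 0 to 100: the percentage of metrics that are not abnormal.

    abnormal_flags: iterable of booleans, one per metric (True = abnormal).
    An empty input is treated as fully healthy (an assumption).
    """
    flags = list(abnormal_flags)
    if not flags:
        return 100.0
    return 100.0 * (1 - sum(flags) / len(flags))


def group_health(resources):
    """Score an arbitrary grouping by pooling the metrics of its members.

    resources: dict mapping resource name -> list of abnormal flags.
    """
    pooled = [flag for flags in resources.values() for flag in flags]
    return health_score(pooled)
```

The same function serves a single server, a tier, an application, or any arbitrary group, since a group is just a larger pool of metrics.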

32
vCenter Operations - OPEX Savings
Incident Management
Lifecycle Savings
 Manage/Resolve incidents
 Proactive alerts reduce costs
30-40%
Change Lifecycle
Savings
 Manage changes to
apps/infrastructure
 “Before/after” analysis reduces
change-related incidents 30-40%
Incident Management
Savings
 Managing Service Desk issues
(Incidents)
 Manual threshold elimination
reduces erroneous tickets by
50-60%
Problem Management
Savings
 Closing problems after systems
restored, includes root cause
analysis
 Root cause analysis reduces
problem closure by 30%

33
Customer Success: IT Operations
Before
 400 critical alerts/hour
 End user complaints
alerted IT to the problem
 End users impacted (avg. 2
hours/outage)
 12 Level-2 engineers on
bridge call to address
problem
After
 20 alerts/MONTH
 3 hours' advance warning
of slowdown, with root cause
 NO end user impact
 1 Level-2 Engineer and 1
DBA to address problems
Learn Normal
Smart Alerting
Root Cause
Solve performance issues before end users are affected
and reduce total alerts

34
vCenter Operations Architecture,
Process and Sizing

35
vCenter Operations Enterprise - Standalone Architecture
 Four Main Services:
Collector, Analytics,
Web, ActiveMQ
 Architecture includes
MS SQL or Oracle DB,
plus File-based DB
(FSDB) for raw metric
storage
 Collectors can be
distributed for
scalability, or to span
DCs & firewalls

36
vCenter Operations Enterprise - Standalone Processing
1a: Collectors and adapters collect metrics, topology & change events (ongoing)
1b: Data stored in FSDB
2a: Analytics runs daily to determine hour-by-hour dynamic thresholds (DTs) for the next 24 hours
2b: Full FSDB is scanned by the 8 analytic algorithms to determine, per metric, the best match for the next 24-hour period
2c: DT data stored in SQL DB
3: Incoming data points are tested against DT bands
4a: Metric-level anomalies are tracked for Alerting and Dashboarding
4b: Anomalies are correlated, Smart Alerts are generated, and root cause (RC) is determined
5: Data provided to “Northbound” integration with products like Ionix SMARTS SAM
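
Steps 3, 4a and 4b can be sketched as a small loop: test each incoming point against its band, keep the anomalies, and raise an alert once their count crosses a noise line. The data shapes, the function names, and the simple count-based noise line are assumptions for illustration; the product's correlation and root cause logic is not described in the slides.

```python
def find_anomalies(points, bands):
    """Step 3 and 4a: return the points that fall outside their DT band.

    points: list of (metric, hour, value) tuples.
    bands:  dict mapping (metric, hour) -> (low, high), as computed daily.
    """
    anomalies = []
    for metric, hour, value in points:
        low, high = bands[(metric, hour)]
        if not low <= value <= high:
            anomalies.append((metric, hour, value))
    return anomalies


def smart_alert(anomalies, noise_line):
    """Step 4b (simplified): alert when the anomaly count exceeds the noise line."""
    return len(anomalies) > noise_line
```

A usage example with two metrics for the 14:00 hour:

```python
bands = {("cpu_pct", 14): (10.0, 70.0), ("db_conn", 14): (5.0, 40.0)}
points = [("cpu_pct", 14, 87.0), ("db_conn", 14, 55.0), ("cpu_pct", 14, 30.0)]
found = find_anomalies(points, bands)
alert = smart_alert(found, noise_line=1)
```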

37
Deployment Prerequisites and Sizing
 OS Support
• Win Server 2003 R2 (x64)
• Red Hat Linux RHEL 5 (x64)
 DB Support*
• SQL Server 2005
• Oracle 10g R2
* Customer supplied
Sizing (metrics collected every 5 min on average; processors >2.8 GHz)

                         Analytics (Main) Server                DB Server
Size        Metrics      Processors  Min. memory  Initial disk  Processors  Min. memory  Initial disk
Small       <250,000     4 cores     12 GB        500 GB        2 cores     4 GB         10 GB
Medium      <1,000,000   8 cores     24 GB        500 GB        4 cores     8 GB         25 GB
Large       <5,000,000   16 cores    64 GB        5 TB          8 cores     16 GB        100 GB
Very Large  <10,000,000  24 cores    128 GB       10 TB         8 cores     16 GB        100 GB
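
The sizing tiers lend themselves to a simple lookup by metric count. The sketch below encodes only the metric limits from the table; it reads the Small limit as 250,000 (the slide shows "<250,00"), and the function name `pick_size` is hypothetical.

```python
# (size name, exclusive upper limit on collected metrics), smallest first.
# Assumption: the Small limit is 250,000; the slide text reads "<250,00".
SIZE_LIMITS = [
    ("Small", 250_000),
    ("Medium", 1_000_000),
    ("Large", 5_000_000),
    ("Very Large", 10_000_000),
]


def pick_size(metric_count):
    """Return the smallest deployment size whose limit exceeds metric_count."""
    for name, limit in SIZE_LIMITS:
        if metric_count < limit:
            return name
    raise ValueError("metric count exceeds the largest documented size")
```

For example, an environment collecting 300,000 metrics falls into the Medium tier, since it is above the Small limit but below 1,000,000.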

38
vCenter Operations Editions

39
VMware vCenter Operations Editions
vCenter Operations Enterprise
+ Full Configuration & Compliance
Management
+ Other VMware & 3rd Party Integrations
(View, management, servers, storage)
Non-VMware (incl. physical) environments
vCenter Operations Advanced
+ Capacity
Planning
VMware Cloud / vCenter
vSphere
vCenter Operations Standard
Performance
Real-time
Capacity
Configuration
Change

40
Understanding the vCenter Operations Editions
Scope

Data Sources
vCenter Operations Standard Edition: vCenter x 1
vCenter Operations Enterprise - Standalone:
• Any 3rd party monitoring tools’ time series data
• Change events
• Multiple vCenter Servers

Objects
vCenter Operations Standard Edition: vCenter Objects (i.e.)
• Data Centers
• Clusters
• ESX Hosts
• Datastores
• VMs x 1500
vCenter Operations Enterprise - Standalone: Unlimited Scope (i.e.)
• Applications
• Network Infrastructure
• Storage
• Hosts (ESX, Win, Linux, etc.)
• VMs

Users
vCenter Operations Standard Edition: Infrastructure (e.g. VI Admins)
vCenter Operations Enterprise - Standalone: Operations, Infrastructure, Application Teams, Business Owners, CxOs

Function                  Standard Edition  Enterprise - Standalone
Dynamic Thresholds        Yes               Yes
Performance Root Cause    Yes               Yes
Proactive Alerting        No                Yes
Customizable Dashboards   No                Yes
Notifications             No                Yes
