The document discusses the challenges of monitoring cloud environments and the different requirements for monitoring various cloud types including IaaS, PaaS, SaaS, and private clouds. It emphasizes that monitoring must be driven through the provisioning process to ensure scalability and flexibility. Both cloud consumers and service providers need unified monitoring across cloud boundaries and layers for optimal visibility.
4. How did IT come to this?
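[Diagram: Public and Private Clouds built on DATACENTERS. A grid sets Private (Internal) and Public (External) clouds against IaaS, SaaS and PaaS, with two Private (Internal) cells marked X. Stack labels: HP BladeSystem Matrix, vCloud Director, NRE Coalition.]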
@mrivingt | #nfluence
5. The cloud effect on IT
Traditional systems management is based on complete control of all components and resources
The physical datacenter embodied this principle of control
The cloud dissipates the datacenter and disseminates control beyond organizational boundaries
Now the “datacenter” is a heterogeneous mix of disparate computing environments
Controlling across cloud boundaries is the challenge
6. Unified monitoring is a major goal
[Diagram: Public and Private Clouds built on DATACENTERS. A grid sets Private (Internal) and Public (External) clouds against IaaS, SaaS and PaaS, with two Private (Internal) cells marked X. Stack labels: HP BladeSystem Matrix, vCloud Director, NRE Coalition.]
7. Cloud layers
Application
Private Virtual Data Center
“Abstraction”
Virtualized Infrastructure
“Virtualization”
Physical Infrastructure and Components
8. Cloud layers – Depth of vision
1 Application / Private Virtual Data Center
“Abstraction”
2 Virtualized Infrastructure
“Virtualization”
3 Physical Infrastructure and Components
9. Depth of vision
Public Cloud
– SaaS and PaaS
– IaaS
– For the benefit of the consumer
– For the service provider themselves
Private Cloud
– In the traditional private datacenter
– Provided by service providers
DataCenter
– Full Visibility
[Diagram: the IaaS / SaaS / PaaS grid for Private (Internal) and Public (External) clouds over the DataCenter, annotated with depth-of-vision levels 1, 2 and 3.]
10. Monitoring SaaS / PaaS services
In-depth visibility into the performance, availability and status of your instances
SaaS and PaaS
– URL and web service response
– End user experience – passive and synthetic
– Transaction performance counters: # transactions, latency, service time
Analysis and predictive reporting
– Subscription status
– SLA measurement and reporting
SaaS – You don’t “own” the application
– Specific SaaS application APIs
PaaS – You do “own” the application
– Application instrumentation
– Application frameworks generally expose specific performance metrics
[Diagram labels: Backup as a Service, force.com]
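The synthetic URL checks and SLA reporting listed above can be sketched roughly as follows. This is a minimal illustration, not a product API: the names `Sample`, `probe` and `summarize`, the 5 second timeout and the latency target are all assumptions made for the example.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Sample:
    ok: bool          # True if the endpoint answered with an HTTP 2xx/3xx status
    latency_s: float  # wall-clock duration of the request, in seconds


def probe(url: str, timeout_s: float = 5.0) -> Sample:
    """Issue one synthetic GET request against a SaaS/PaaS URL and time it."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            ok = 200 <= resp.status < 400
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        ok = False
    return Sample(ok, time.monotonic() - start)


def summarize(samples: Iterable[Sample], sla_latency_s: float) -> dict:
    """Turn raw probe results into simple SLA figures (assumed definitions)."""
    samples = list(samples)
    if not samples:
        raise ValueError("no samples to summarize")
    good = [s for s in samples if s.ok]
    within = [s for s in good if s.latency_s <= sla_latency_s]
    mean_latency: Optional[float] = (
        sum(s.latency_s for s in good) / len(good) if good else None
    )
    return {
        "transactions": len(samples),
        "availability_pct": 100.0 * len(good) / len(samples),
        "within_sla_pct": 100.0 * len(within) / len(samples),
        "mean_latency_s": mean_latency,  # successful requests only
    }
```

In use, `probe` would be called on a schedule from one or more locations and the collected samples passed to `summarize` once per reporting period.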
11. Monitoring IaaS infrastructure (consumer)
Exposed by Cloud APIs:
– Virtual server instances
– Network, CPU, Storage details – read/write
Additional global IaaS offering performance
– Server start up times
– Availability of servers / instance types – by location
– Usage data
You'll need more:
Just as a datacenter:
– Detailed Server Monitoring
– Application – Exchange, SharePoint, AD, Notes, DB, etc.
– Web Server – IIS, Apache, WebSphere, WebLogic etc.
– Multi-tier web application views
– End user experience and transactions
– Plus workflow, automation, usage metering, integration with Service Desk, CMDB…
12. Monitoring must behave well in the cloud
Zero touch configuration and deployment of monitoring for new instances
Registration and graceful de-registration of agents
Monitoring policies obtained at instantiation time (no stale images)
Connect to management server and begin reporting
Connect securely back to data centers if they exist
[Diagram labels: Cloud Hub, Register, Policy, Report, De-register, Server Instance, Data Centers]
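The register / policy / report / de-register cycle described above might look something like the sketch below. `CloudHub` and `Agent` are hypothetical in-memory stand-ins, not a real monitoring API; the point is only that the policy is fetched when the instance starts rather than baked into the image.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class CloudHub:
    """Hypothetical stand-in for the management server ("Cloud Hub")."""
    policies: Dict[str, dict]                                # current policy per role
    agents: Dict[str, str] = field(default_factory=dict)     # instance_id -> role
    reports: List[Tuple[str, str, float]] = field(default_factory=list)

    def register(self, instance_id: str, role: str) -> dict:
        """Record the new agent and hand back the current policy for its role."""
        self.agents[instance_id] = role
        return self.policies[role]

    def report(self, instance_id: str, metric: str, value: float) -> None:
        if instance_id not in self.agents:
            raise KeyError(f"unregistered agent: {instance_id}")
        self.reports.append((instance_id, metric, value))

    def deregister(self, instance_id: str) -> None:
        """Graceful removal, so terminated instances do not raise false alarms."""
        self.agents.pop(instance_id, None)


class Agent:
    """Registers on entry, de-registers on exit (zero touch for the operator)."""

    def __init__(self, hub: CloudHub, instance_id: str, role: str) -> None:
        self.hub, self.instance_id, self.role = hub, instance_id, role
        self.policy: Optional[dict] = None

    def __enter__(self) -> "Agent":
        # Policy is obtained at instantiation time, never read from the image.
        self.policy = self.hub.register(self.instance_id, self.role)
        return self

    def __exit__(self, *exc) -> bool:
        self.hub.deregister(self.instance_id)
        return False

    def report(self, metric: str, value: float) -> None:
        # Only send metrics that the current policy asks for.
        if self.policy and metric in self.policy["metrics"]:
            self.hub.report(self.instance_id, metric, value)
```

Using a context manager here is one design choice among several; it simply guarantees that de-registration runs even if the agent exits on an error.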
13. A model for IaaS monitoring
[Diagram: IaaS Cloud Monitoring Architecture, 12/14/2009, Mark Rivington. Surviving labels: Self Service, Dashboard, “Surge Computing”, Reports, Automation, Launch, Terminate, Data Visualization, Monitoring Aggregation, Performance, Direct Visualization, Policy Configuration, Performance Management, Service Data Views, Incidents, Portal Integration, Dynamic State, Reporting Service, Provisioning, Service Desk, Reporting.]
14. A customer example of active management
– Brand name consumer media streaming company
– Highly asymmetric workloads and user demands
– Heavily utilized datacenters
– Capital intensive datacenter costs
– Early users of Public IaaS
– Shift to Operational Expense
– Used monitoring to determine and control overspill in to the cloud
Configuration data
– Defines the SQL Query that measures the load (single value) e.g. select average (cpu) from server where server name like 'CDN%'
– Sets the low and high thresholds for the value e.g. 60 and 80.
– Defines the actions (launch or terminate) that occur below the “low” and above the “high” thresholds.
– Sets the number of instances to be “launched” or “terminated”
– Sets the image name to be launched
– Provides any other parameters needed for launching or terminating instances.
Notes
The 'Thermostat' process implements the logic of the system and the 'Cloud Control' process (CCP) interfaces with the specific cloud services
The Thermostat requests functions from the CCP via NMS call-back functions.
There are delays configured within the system to prevent repeated requests for launch or terminate.
[Diagram labels: Configuration, Select Data, Performance data, NMS, Out of Range, Within Range, Thermostat Process, (Delay) Loop, Launch/Terminate Requests, Not Enough, Too many, Launch Instances, Terminate Instances, Cloud Control Process, Cloud Provider Specific “Plug-ins”: Amazon, Rackspace, Savvis, (Other Cloud Providers)]
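The 'Thermostat' logic in the notes above can be illustrated with a small sketch. It assumes that load above the high threshold triggers a launch and load below the low threshold triggers a terminate; the slide only says that actions are defined for both cases. The class names, the image name and the 300 second delay are hypothetical, and the load value is passed in directly rather than read through the SQL query.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ThermostatConfig:
    low: float = 60.0              # below this, capacity is considered excessive
    high: float = 80.0             # above this, capacity is considered insufficient
    step: int = 1                  # instances to launch or terminate per request
    image: str = "cdn-edge-image"  # hypothetical image name
    delay_s: float = 300.0         # minimum gap between two requests


class Thermostat:
    """Decides when to ask the cloud control process for more or fewer instances."""

    def __init__(
        self,
        config: ThermostatConfig,
        cloud_control: Callable[[str, int, str], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.config = config
        self.cloud_control = cloud_control  # call-back: (action, count, image)
        self.clock = clock
        self._last_request_at: Optional[float] = None

    def evaluate(self, load: float) -> Optional[str]:
        """Return the action requested for this load reading, or None."""
        cfg = self.config
        if load > cfg.high:
            action = "launch"
        elif load < cfg.low:
            action = "terminate"
        else:
            return None  # within range, nothing to do

        now = self.clock()
        if (self._last_request_at is not None
                and now - self._last_request_at < cfg.delay_s):
            return None  # still inside the delay window, suppress the repeat

        self._last_request_at = now
        self.cloud_control(action, cfg.step, cfg.image)
        return action
```

The delay matters because a newly launched instance takes time to absorb load; without it, every poll during start-up would request yet another instance.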
15. Monitoring IaaS infrastructure (service provider)
– Key requirement is to offer self service monitoring of cloud instances to the consumer
– Graduated levels of monitoring service with appropriate pricing
– Monitoring must be driven through provisioning
– Multi-tenancy and Scalability are vital
– Performance and availability data must be accessible through CSP Portal
– Direct data access or portal to portal integration
[Diagram labels: Customer A View, Customer B View, Customer C View; Master Cloud View; Client A Data through Client F Data; SLM DS]
16. Provisioning drives monitoring
– It is all about the APIs
– Templated (e.g. good, better, best) monitoring policies deployed at instantiation
– Modifiable through specific API calls
– Driven entirely through external automation or provisioning system
[Diagram labels: Presentation Information: Reporting – Dashboards – Portals and Widgets; SLA and Business Service Mgmt; Automation; Workflow; Correlation and Root Cause Analysis; API; Performance & Availability; Event and Alarm Management; End User Experience; Datacenter; Virtualization; Cloud and SaaS; Power and Facilities; Custom]
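As a rough sketch of templated policies applied at instantiation, the provisioning system could call a hook like the one below. The tier contents, the `ProvisioningHook` class and its method names are assumptions for illustration only; a real deployment would use whatever API the monitoring product exposes.

```python
from copy import deepcopy
from typing import Dict, Optional

# Hypothetical "good, better, best" templates; the actual contents would be
# defined by the service provider and priced accordingly.
TEMPLATES: Dict[str, dict] = {
    "good": {"interval_s": 300, "metrics": ["availability"]},
    "better": {"interval_s": 60, "metrics": ["availability", "cpu", "memory"]},
    "best": {
        "interval_s": 15,
        "metrics": ["availability", "cpu", "memory", "end_user_experience"],
    },
}


def policy_for(tier: str, overrides: Optional[dict] = None) -> dict:
    """Copy the template for a tier and apply optional per-instance overrides."""
    policy = deepcopy(TEMPLATES[tier])  # never mutate the shared template
    policy.update(overrides or {})
    return policy


class ProvisioningHook:
    """Called by the provisioning system; monitoring has no manual steps."""

    def __init__(self) -> None:
        self.active: Dict[str, dict] = {}  # instance_id -> deployed policy

    def on_instance_created(self, instance_id: str, tier: str) -> dict:
        self.active[instance_id] = policy_for(tier)
        return self.active[instance_id]

    def update_policy(self, instance_id: str, **changes) -> dict:
        """Modify a deployed policy through an explicit API call."""
        self.active[instance_id].update(changes)
        return self.active[instance_id]

    def on_instance_terminated(self, instance_id: str) -> None:
        self.active.pop(instance_id, None)
```

For example, `hook.on_instance_created("vm-1", "better")` followed by `hook.update_policy("vm-1", interval_s=30)` tightens polling for one instance without touching the shared template.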
20. Private cloud
Effectively a combination of consumer and service provider IaaS monitoring requirements
Plus classical datacenter monitoring for internal private cloud infrastructure
Need to support specific branded infrastructure stack solutions e.g. VCE Vblock
[Diagram labels: HP BladeSystem Matrix, vCloud Director, NRE Coalition]
21. Vblock specific monitoring (as an example)
Discovery and Deployment
– Auto-discovery, auto-monitoring, pre-built templates
Operational
– Under usage, over-commitment identification
– Vblock root cause analysis
Chassis
– Monitoring of all aspects of the rack
– Compute
– Cisco UCS blades and elements
– Storage
– EMC's CLARiiON™, Symmetrix™ and Celerra
– Networking and interconnects
– Cisco routers, SAN switches and Nexus™ soft switches
23. A customer example of private cloud monitoring
– Global Investment Bank
– Long term user of “other 3” systems management suite
– Moving from physical to virtual to private cloud
– Transformation from 6 weeks to 6 minutes in terms of server delivery
– Needed a more flexible monitoring solution
– Key was integration with new configuration management application
– Self Service monitoring is vital to private cloud
– Currently has over 28,000 servers under management and is still growing