Monitoring Large-scale Cloud
Infrastructures with OpenNebula
Simon Boulet
OpenNebula Consultant
Co-founder of the Cloudnorth.com Project
simon@nostalgeek.com
Goals
1. Show how to configure OpenNebula to
achieve a sub-1-minute monitoring interval
2. Demonstrate the use of OpenNebula in
large-scale cloud infrastructures
3. Suggest enhancements to OpenNebula
performance and monitoring
How Big Exactly is Large-scale?
How many hosts?
1,000? 2,000? 10,000 VMs?
Monitoring in OpenNebula
● Detects when a VM or host changes status
(Running, Stopped, etc.)
● Built-in metrics: CPU, memory and network
usage
● You can add as many metrics as you like by
customizing the driver
● Can be used to perform various tasks (auto
scaling, high-availability redeployment, etc.)
Don't Expect the Default
Configuration to Perform Optimally
● Database: Use the MySQL database backend,
not the default SQLite
● Logs: Use the Syslog log system, and disable
debug logging (debug_level=1)
● Number of threads: Adjust the number of
driver threads (see the -t option in your *MAD
config options)
Use OpenNebula >= 4.0
Prior versions did monitoring in two phases:
1. The IM Monitor action monitored Hosts
2. The VMM Poll action monitored VMs
(100 Hosts + 1,000 VMs) * 4 polls per minute (15-second interval) = 4,400
actions per minute
Since OpenNebula 4.0, the IM Monitor action is
capable of returning the information of VMs
running on the monitored host
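The action count above can be checked with a short Python sketch. The function name and the `combined` flag are illustrative, and the 4.0+ case assumes that one IM Monitor action per host per interval is the whole monitoring load.

```python
def actions_per_minute(hosts, vms, interval_seconds, combined=False):
    """Monitoring actions the core has to drive each minute.

    combined=False models the two-phase scheme (one action per host
    plus one per VM). combined=True assumes a single action per host
    that also reports the VMs running on it.
    """
    polls_per_minute = 60 / interval_seconds
    targets = hosts if combined else hosts + vms
    return int(targets * polls_per_minute)


print(actions_per_minute(100, 1000, 15))                 # 4400
print(actions_per_minute(100, 1000, 15, combined=True))  # 400
```

Under these assumptions the same 100 hosts and 1,000 VMs drop from 4,400 to 400 actions per minute.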
Monitoring History
By default OpenNebula keeps 24h of
monitoring history
15 seconds interval X 24h = 5760 records per VM
Average record size: 4KB
23MB of monitoring history per VM
100 VM = 2.3GB
10,000 VM = 230GB
HOST_MONITORING_EXPIRATION_TIME and
VM_MONITORING_EXPIRATION_TIME config options
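The sizing above can be reproduced with a small Python sketch. It assumes decimal units (4 KB = 4,000 bytes) and a constant record size; `history_bytes` is a hypothetical helper, not part of OpenNebula.

```python
def history_bytes(vms, interval_seconds, retention_seconds, record_bytes=4000):
    """Estimated size of the VM monitoring history, in bytes."""
    records_per_vm = retention_seconds // interval_seconds
    return vms * records_per_vm * record_bytes


DAY = 24 * 3600  # default retention quoted above: 24h

for vms in (1, 100, 10_000):
    size = history_bytes(vms, interval_seconds=15, retention_seconds=DAY)
    print(f"{vms:>6} VMs: {size / 1e9:.3f} GB")
```

Changing `retention_seconds` shows how strongly the expiration time drives the total.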
Monitoring History (continued)
● Reduce history to 30 minutes (1800
seconds)
● Use MySQL MEMORY storage engine for
vm_monitoring and host_monitoring tables
It's OK to lose monitoring history when MySQL
is restarted
Most recent monitoring values are stored in VM
template
Set MySQL max_heap_table_size large enough to hold all your monitoring
history
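As a rough guide for choosing max_heap_table_size, the sketch below estimates the memory needed for a 30-minute history. The 4,000-byte record size comes from the earlier average, the 1.5x headroom factor is an arbitrary assumption, and `heap_table_size_bytes` is a hypothetical helper. MEMORY tables store rows in a fixed-length format, so measure real usage before relying on the estimate.

```python
import math


def heap_table_size_bytes(vms, interval_seconds, retention_seconds,
                          record_bytes=4000, headroom=1.5):
    """Estimated bytes needed to hold the whole VM monitoring history in RAM."""
    records = vms * math.ceil(retention_seconds / interval_seconds)
    return math.ceil(records * record_bytes * headroom)


size = heap_table_size_bytes(vms=4000, interval_seconds=15, retention_seconds=1800)
print(f"max_heap_table_size >= {size} bytes (~{size / 1e9:.2f} GB)")
```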
Watch your Load Average
As of 4.2, the maximum number of
simultaneous XML-RPC API connections is
limited to 15
Overloaded OpenNebula = Slow XML-RPC API response =
API Limit / Timeout
● Reduce load at deployment time by
adjusting number of VMs simultaneously
deployed by scheduler
● Watch next release (4.4) for
XML-RPC API concurrency
enhancements
Local Caching Nameserver
OpenNebula uses DNS names for monitoring
hosts (unless you named your hosts using their
IP addresses instead of names)
● Use a local caching nameserver to speed up
DNS lookup (such as dnsmasq).
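To see whether name resolution is a noticeable cost, lookups can be timed from the OpenNebula front-end. This is a minimal sketch using only the Python standard library. The `time_lookups` helper is hypothetical, and the host list should be replaced with your own host names.

```python
import socket
import time


def time_lookups(hostnames, resolve=socket.gethostbyname):
    """Return {hostname: seconds taken to resolve, or None on failure}."""
    timings = {}
    for name in hostnames:
        start = time.perf_counter()
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            timings[name] = None
            continue
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # "localhost" is only a placeholder for real host names.
    for name, seconds in time_lookups(["localhost"]).items():
        label = "failed" if seconds is None else f"{seconds * 1000:.2f} ms"
        print(f"{name}: {label}")
```

Running it before and after enabling a caching nameserver gives a direct comparison.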
Beware of SSH Transport
Most OpenNebula drivers (KVM, Xen, etc.) use
SSH connections to perform actions
OK for deploying a new VM, but expensive when
doing VM monitoring
Meet Ganglia
<< Ganglia is a scalable distributed system monitor tool for high-performance
computing systems such as clusters and grids. >>
- Wikipedia
OpenNebula has built-in support for Ganglia
By default, Ganglia and OpenNebula must run
on the same machine
To use a Ganglia collector on another machine, set GANGLIA_HOST in
/var/lib/one/remotes/im/ganglia.d/ganglia_probe and
/var/lib/one/remotes/vmm/kvm/poll_ganglia
Meet Ganglia (continued)
Ganglia Driver Limitations
1. Currently only 1 Ganglia Collector is
supported
2. You need to run a script on each host to export
the OpenNebula-specific metric
(OPENNEBULA_VMS_INFORMATION)
3. Ganglia has a maximum length of 1392 bytes
for string metrics
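The third limitation matters because the per-host VM information travels as a single string metric. A minimal Python check, assuming the 1392-byte limit quoted above applies to the UTF-8 encoded value (the helper name is hypothetical):

```python
GANGLIA_MAX_STRING_BYTES = 1392  # limit quoted in the list above


def fits_in_ganglia_string(value, limit=GANGLIA_MAX_STRING_BYTES):
    """True if the encoded metric value stays within the string limit."""
    return len(value.encode("utf-8")) <= limit


print(fits_in_ganglia_string("x" * 1392))  # True
print(fits_in_ganglia_string("x" * 1393))  # False
```

A host running many VMs can exceed the limit, so it is worth checking the exported value on your densest host.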
Host sFlow
<< The Host sFlow agent exports physical and virtual server performance
metrics using the sFlow protocol. The agent provides scalable, multi-vendor,
multi-OS performance monitoring with minimal impact on the systems being
monitored.>>
- http://host-sflow.sourceforge.net/
Exports a standard set of hypervisor and VM
metrics
Official support for Xen, KVM and Hyper-V, but
uses Libvirt to gather metrics (and Libvirt
supports LXC, OpenVZ, VMware, etc.)
Host sFlow (continued)
Source: http://blog.sflow.com/2012/02/ganglia-33-released.html
Host sFlow (continued)
Sample Metrics
Hosts Metrics
● vnode_mem_total: Hypervisor Total Memory
● vnode_domains: Hypervisor VM Count
VMs Metrics
● <VM ID>.vcpu_state: VM State (Running, Stopped, etc.)
● <VM ID>.vmem_util: VM Memory Utilization
● <VM ID>.vdisk_free: VM Free Disk Space
Not currently supported in OpenNebula. Contact me if you're interested.
4,000 VMs at Sub-1 Minute Interval
OpenNebula 4.2 + xml-rpc patch (upcoming in 4.4)
Experimental Host sFlow Driver
1 OpenNebula Core (EC2 High-CPU XLarge instance)
1 Sunstone Web Server (EC2 Standard Medium instance)
1 Ganglia Collector (EC2 Standard Medium instance)
100 Hosts (EC2 High-CPU Medium instances)
~40 VMs per Host
~4,000 VMs (OpenVZ)
15 - 60 second monitoring interval
Looking Forward
There’s room for optimizations
● The command line tools can get very slow when
returning very large result sets (but not the API…)
● Distributed driver, for example using ZeroMQ for
distributing tasks to multiple workers
● Investigate PoolSQL locks being held for long periods
and blocking other threads (discussed in bug #1818)
● Gather metrics about OpenNebula internals: locks wait,
effective monitoring interval, memory footprints, etc.
● Investigate very large Sunstone memory usage
Thank you!
Questions?
“OpenNebula captured my interest for several technical
reasons besides the fact that it is truly open. Its architecture
is very elegant; it has C++ bones, ruby muscles and bash
tendons. It's extensible and understandable. It has no peer
as far as I can tell.”
Christopher Barry, Infrastructure Engineer, RJMetrics,
September 2012
http://opennebula.org/users:testimonials

More Related Content

What's hot

Supercomputing by API: Connecting Modern Web Apps to HPC
Supercomputing by API: Connecting Modern Web Apps to HPCSupercomputing by API: Connecting Modern Web Apps to HPC
Supercomputing by API: Connecting Modern Web Apps to HPC
OpenStack
 
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebula Project
 
Antoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloudAntoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloud
ShapeBlue
 
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...
OPNFV
 
XCP-ng - past, present and future
XCP-ng - past, present and futureXCP-ng - past, present and future
XCP-ng - past, present and future
ShapeBlue
 
OpenNebulaConf 2016 - Networking, NFVs and SDNs Hands-on Workshop by Rubén S....
OpenNebulaConf 2016 - Networking, NFVs and SDNs Hands-on Workshop by Rubén S....OpenNebulaConf 2016 - Networking, NFVs and SDNs Hands-on Workshop by Rubén S....
OpenNebulaConf 2016 - Networking, NFVs and SDNs Hands-on Workshop by Rubén S....
OpenNebula Project
 
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebula Project
 
Simplify Networking for Containers
Simplify Networking for ContainersSimplify Networking for Containers
Simplify Networking for Containers
LinuxCon ContainerCon CloudOpen China
 
John Spray - Ceph in Kubernetes
John Spray - Ceph in KubernetesJohn Spray - Ceph in Kubernetes
John Spray - Ceph in Kubernetes
ShapeBlue
 
OpenNebula Conf 2014 | Understanding the OpenNebula Model for Cloud Provision...
OpenNebula Conf 2014 | Understanding the OpenNebula Model for Cloud Provision...OpenNebula Conf 2014 | Understanding the OpenNebula Model for Cloud Provision...
OpenNebula Conf 2014 | Understanding the OpenNebula Model for Cloud Provision...
NETWAYS
 
OpenNebulaConf 2016 - Hypervisors and Containers Hands-on Workshop by Jaime M...
OpenNebulaConf 2016 - Hypervisors and Containers Hands-on Workshop by Jaime M...OpenNebulaConf 2016 - Hypervisors and Containers Hands-on Workshop by Jaime M...
OpenNebulaConf 2016 - Hypervisors and Containers Hands-on Workshop by Jaime M...
OpenNebula Project
 
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...
OpenStack Korea Community
 
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph GaluschkaOpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
NETWAYS
 
OpenNebulaConf 2016 - Building a GNU/Linux Distribution by Daniel Dehennin, M...
OpenNebulaConf 2016 - Building a GNU/Linux Distribution by Daniel Dehennin, M...OpenNebulaConf 2016 - Building a GNU/Linux Distribution by Daniel Dehennin, M...
OpenNebulaConf 2016 - Building a GNU/Linux Distribution by Daniel Dehennin, M...
OpenNebula Project
 
TechDay - Toronto 2016 - Hyperconvergence and OpenNebula
TechDay - Toronto 2016 - Hyperconvergence and OpenNebulaTechDay - Toronto 2016 - Hyperconvergence and OpenNebula
TechDay - Toronto 2016 - Hyperconvergence and OpenNebula
OpenNebula Project
 
TechDay - Cambridge 2016 - OpenNebula Corona
TechDay - Cambridge 2016 - OpenNebula CoronaTechDay - Cambridge 2016 - OpenNebula Corona
TechDay - Cambridge 2016 - OpenNebula Corona
OpenNebula Project
 
64-bit ARM Unikernels on uKVM
64-bit ARM Unikernels on uKVM64-bit ARM Unikernels on uKVM
64-bit ARM Unikernels on uKVM
LinuxCon ContainerCon CloudOpen China
 
TECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula TutorialTECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula TutorialOpenNebula Project
 
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 InstanceExtreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
ScyllaDB
 
Using CloudStack With Clustered LVM
Using CloudStack With Clustered LVMUsing CloudStack With Clustered LVM
Using CloudStack With Clustered LVM
Marcus L Sorensen
 

What's hot (20)

Supercomputing by API: Connecting Modern Web Apps to HPC
Supercomputing by API: Connecting Modern Web Apps to HPCSupercomputing by API: Connecting Modern Web Apps to HPC
Supercomputing by API: Connecting Modern Web Apps to HPC
 
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
 
Antoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloudAntoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloud
 
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...
 
XCP-ng - past, present and future
XCP-ng - past, present and futureXCP-ng - past, present and future
XCP-ng - past, present and future
 
OpenNebulaConf 2016 - Networking, NFVs and SDNs Hands-on Workshop by Rubén S....
OpenNebulaConf 2016 - Networking, NFVs and SDNs Hands-on Workshop by Rubén S....OpenNebulaConf 2016 - Networking, NFVs and SDNs Hands-on Workshop by Rubén S....
OpenNebulaConf 2016 - Networking, NFVs and SDNs Hands-on Workshop by Rubén S....
 
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
OpenNebulaConf 2016 - OpenNebula, a story about flexibility and technological...
 
Simplify Networking for Containers
Simplify Networking for ContainersSimplify Networking for Containers
Simplify Networking for Containers
 
John Spray - Ceph in Kubernetes
John Spray - Ceph in KubernetesJohn Spray - Ceph in Kubernetes
John Spray - Ceph in Kubernetes
 
OpenNebula Conf 2014 | Understanding the OpenNebula Model for Cloud Provision...
OpenNebula Conf 2014 | Understanding the OpenNebula Model for Cloud Provision...OpenNebula Conf 2014 | Understanding the OpenNebula Model for Cloud Provision...
OpenNebula Conf 2014 | Understanding the OpenNebula Model for Cloud Provision...
 
OpenNebulaConf 2016 - Hypervisors and Containers Hands-on Workshop by Jaime M...
OpenNebulaConf 2016 - Hypervisors and Containers Hands-on Workshop by Jaime M...OpenNebulaConf 2016 - Hypervisors and Containers Hands-on Workshop by Jaime M...
OpenNebulaConf 2016 - Hypervisors and Containers Hands-on Workshop by Jaime M...
 
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...
[OpenStack Days Korea 2016] Track1 - Mellanox CloudX - Acceleration for Cloud...
 
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph GaluschkaOpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
OpenNebula Conf 2014: CentOS, QA an OpenNebula - Christoph Galuschka
 
OpenNebulaConf 2016 - Building a GNU/Linux Distribution by Daniel Dehennin, M...
OpenNebulaConf 2016 - Building a GNU/Linux Distribution by Daniel Dehennin, M...OpenNebulaConf 2016 - Building a GNU/Linux Distribution by Daniel Dehennin, M...
OpenNebulaConf 2016 - Building a GNU/Linux Distribution by Daniel Dehennin, M...
 
TechDay - Toronto 2016 - Hyperconvergence and OpenNebula
TechDay - Toronto 2016 - Hyperconvergence and OpenNebulaTechDay - Toronto 2016 - Hyperconvergence and OpenNebula
TechDay - Toronto 2016 - Hyperconvergence and OpenNebula
 
TechDay - Cambridge 2016 - OpenNebula Corona
TechDay - Cambridge 2016 - OpenNebula CoronaTechDay - Cambridge 2016 - OpenNebula Corona
TechDay - Cambridge 2016 - OpenNebula Corona
 
64-bit ARM Unikernels on uKVM
64-bit ARM Unikernels on uKVM64-bit ARM Unikernels on uKVM
64-bit ARM Unikernels on uKVM
 
TECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula TutorialTECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula Tutorial
 
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 InstanceExtreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
 
Using CloudStack With Clustered LVM
Using CloudStack With Clustered LVMUsing CloudStack With Clustered LVM
Using CloudStack With Clustered LVM
 

Viewers also liked

Community Clouds from Scratch
Community Clouds from ScratchCommunity Clouds from Scratch
Community Clouds from Scratch
NETWAYS
 
rOCCI – Providing Interoperability through OCCI 1.1 Support for OpenNebula
rOCCI – Providing Interoperability through OCCI 1.1 Support for OpenNebularOCCI – Providing Interoperability through OCCI 1.1 Support for OpenNebula
rOCCI – Providing Interoperability through OCCI 1.1 Support for OpenNebula
NETWAYS
 
Welcome talk unleashing the future of open-source enterprise cloud computing
Welcome talk   unleashing the future of open-source enterprise cloud computingWelcome talk   unleashing the future of open-source enterprise cloud computing
Welcome talk unleashing the future of open-source enterprise cloud computing
NETWAYS
 
High Performance Computing Cloud at SURFsara: Experiences with OpenNebula 3.x
High Performance Computing Cloud at SURFsara: Experiences with OpenNebula 3.xHigh Performance Computing Cloud at SURFsara: Experiences with OpenNebula 3.x
High Performance Computing Cloud at SURFsara: Experiences with OpenNebula 3.x
NETWAYS
 
OpenNebula in a Multiuser Environment
OpenNebula in a Multiuser EnvironmentOpenNebula in a Multiuser Environment
OpenNebula in a Multiuser Environment
NETWAYS
 
CentOS and OpenNebula, a Perfect Match
CentOS and OpenNebula, a Perfect MatchCentOS and OpenNebula, a Perfect Match
CentOS and OpenNebula, a Perfect Match
NETWAYS
 
Making Clouds: Turning OpenNebula into a Product
Making Clouds: Turning OpenNebula into a ProductMaking Clouds: Turning OpenNebula into a Product
Making Clouds: Turning OpenNebula into a Product
NETWAYS
 
Opening the Path to Technical Excellence
Opening the Path to Technical ExcellenceOpening the Path to Technical Excellence
Opening the Path to Technical ExcellenceNETWAYS
 
Top Ten Security Considerations when Setting up your OpenNebula Cloud
Top Ten Security Considerations when Setting up your OpenNebula CloudTop Ten Security Considerations when Setting up your OpenNebula Cloud
Top Ten Security Considerations when Setting up your OpenNebula Cloud
NETWAYS
 

Viewers also liked (9)

Community Clouds from Scratch
Community Clouds from ScratchCommunity Clouds from Scratch
Community Clouds from Scratch
 
rOCCI – Providing Interoperability through OCCI 1.1 Support for OpenNebula
rOCCI – Providing Interoperability through OCCI 1.1 Support for OpenNebularOCCI – Providing Interoperability through OCCI 1.1 Support for OpenNebula
rOCCI – Providing Interoperability through OCCI 1.1 Support for OpenNebula
 
Welcome talk unleashing the future of open-source enterprise cloud computing
Welcome talk   unleashing the future of open-source enterprise cloud computingWelcome talk   unleashing the future of open-source enterprise cloud computing
Welcome talk unleashing the future of open-source enterprise cloud computing
 
High Performance Computing Cloud at SURFsara: Experiences with OpenNebula 3.x
High Performance Computing Cloud at SURFsara: Experiences with OpenNebula 3.xHigh Performance Computing Cloud at SURFsara: Experiences with OpenNebula 3.x
High Performance Computing Cloud at SURFsara: Experiences with OpenNebula 3.x
 
OpenNebula in a Multiuser Environment
OpenNebula in a Multiuser EnvironmentOpenNebula in a Multiuser Environment
OpenNebula in a Multiuser Environment
 
CentOS and OpenNebula, a Perfect Match
CentOS and OpenNebula, a Perfect MatchCentOS and OpenNebula, a Perfect Match
CentOS and OpenNebula, a Perfect Match
 
Making Clouds: Turning OpenNebula into a Product
Making Clouds: Turning OpenNebula into a ProductMaking Clouds: Turning OpenNebula into a Product
Making Clouds: Turning OpenNebula into a Product
 
Opening the Path to Technical Excellence
Opening the Path to Technical ExcellenceOpening the Path to Technical Excellence
Opening the Path to Technical Excellence
 
Top Ten Security Considerations when Setting up your OpenNebula Cloud
Top Ten Security Considerations when Setting up your OpenNebula CloudTop Ten Security Considerations when Setting up your OpenNebula Cloud
Top Ten Security Considerations when Setting up your OpenNebula Cloud
 

Similar to Monitoring Large-scale Cloud Infrastructures with OpenNebula

Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Belmiro Moreira
 
Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit Log
Joe Stein
 
Scaling Up Logging and Metrics
Scaling Up Logging and MetricsScaling Up Logging and Metrics
Scaling Up Logging and Metrics
Ricardo Lourenço
 
AWS migration: getting to Data Center heaven with AWS and Chef
AWS migration: getting to Data Center heaven with AWS and ChefAWS migration: getting to Data Center heaven with AWS and Chef
AWS migration: getting to Data Center heaven with AWS and Chef
Juan Vicente Herrera Ruiz de Alejo
 
Open stack HA - Theory to Reality
Open stack HA -  Theory to RealityOpen stack HA -  Theory to Reality
Open stack HA - Theory to Reality
Sriram Subramanian
 
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMSARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
Arun prasath
 
Openstack summit 2015
Openstack summit 2015Openstack summit 2015
Openstack summit 2015
Andrew Yongjoon Kong
 
Testing kubernetes and_open_shift_at_scale_20170209
Testing kubernetes and_open_shift_at_scale_20170209Testing kubernetes and_open_shift_at_scale_20170209
Testing kubernetes and_open_shift_at_scale_20170209
mffiedler
 
Distributed Performance testing by funkload
Distributed Performance testing by funkloadDistributed Performance testing by funkload
Distributed Performance testing by funkload
Akhil Singh
 
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle Clusterware
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle ClusterwareManaging Oracle Enterprise Manager Cloud Control 12c with Oracle Clusterware
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle ClusterwareLeighton Nelson
 
Docker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsDocker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platforms
Federico Michele Facca
 
Cloud Computing in practice with OpenNebula ~ Develer workshop 2012
Cloud Computing in practice with OpenNebula ~ Develer workshop 2012Cloud Computing in practice with OpenNebula ~ Develer workshop 2012
Cloud Computing in practice with OpenNebula ~ Develer workshop 2012Giovanni Toraldo
 
Cloud computing, in practice ~ develer workshop
Cloud computing, in practice ~ develer workshopCloud computing, in practice ~ develer workshop
Cloud computing, in practice ~ develer workshopDeveler S.r.l.
 
OpenDaylight Integration with OpenStack Neutron: A Tutorial
OpenDaylight Integration with OpenStack Neutron: A TutorialOpenDaylight Integration with OpenStack Neutron: A Tutorial
OpenDaylight Integration with OpenStack Neutron: A Tutorial
mestery
 
Ansible & Salt - Vincent Boon
Ansible & Salt - Vincent BoonAnsible & Salt - Vincent Boon
Ansible & Salt - Vincent Boon
MyNOG
 
Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017
Dave Holland
 
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Nagios
 
Openstack_administration
Openstack_administrationOpenstack_administration
Openstack_administrationAshish Sharma
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
javier ramirez
 
Gdg izmir kubernetes
Gdg izmir kubernetesGdg izmir kubernetes
Gdg izmir kubernetes
Gokhan Boranalp
 

Similar to Monitoring Large-scale Cloud Infrastructures with OpenNebula (20)

Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
 
Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit Log
 
Scaling Up Logging and Metrics
Scaling Up Logging and MetricsScaling Up Logging and Metrics
Scaling Up Logging and Metrics
 
AWS migration: getting to Data Center heaven with AWS and Chef
AWS migration: getting to Data Center heaven with AWS and ChefAWS migration: getting to Data Center heaven with AWS and Chef
AWS migration: getting to Data Center heaven with AWS and Chef
 
Open stack HA - Theory to Reality
Open stack HA -  Theory to RealityOpen stack HA -  Theory to Reality
Open stack HA - Theory to Reality
 
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMSARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
ARCHITECTING TENANT BASED QOS IN MULTI-TENANT CLOUD PLATFORMS
 
Openstack summit 2015
Openstack summit 2015Openstack summit 2015
Openstack summit 2015
 
Testing kubernetes and_open_shift_at_scale_20170209
Testing kubernetes and_open_shift_at_scale_20170209Testing kubernetes and_open_shift_at_scale_20170209
Testing kubernetes and_open_shift_at_scale_20170209
 
Distributed Performance testing by funkload
Distributed Performance testing by funkloadDistributed Performance testing by funkload
Distributed Performance testing by funkload
 
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle Clusterware
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle ClusterwareManaging Oracle Enterprise Manager Cloud Control 12c with Oracle Clusterware
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle Clusterware
 
Docker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsDocker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platforms
 
Cloud Computing in practice with OpenNebula ~ Develer workshop 2012
Cloud Computing in practice with OpenNebula ~ Develer workshop 2012Cloud Computing in practice with OpenNebula ~ Develer workshop 2012
Cloud Computing in practice with OpenNebula ~ Develer workshop 2012
 
Cloud computing, in practice ~ develer workshop
Cloud computing, in practice ~ develer workshopCloud computing, in practice ~ develer workshop
Cloud computing, in practice ~ develer workshop
 
OpenDaylight Integration with OpenStack Neutron: A Tutorial
OpenDaylight Integration with OpenStack Neutron: A TutorialOpenDaylight Integration with OpenStack Neutron: A Tutorial
OpenDaylight Integration with OpenStack Neutron: A Tutorial
 
Ansible & Salt - Vincent Boon
Ansible & Salt - Vincent BoonAnsible & Salt - Vincent Boon
Ansible & Salt - Vincent Boon
 
Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017Sanger OpenStack presentation March 2017
Sanger OpenStack presentation March 2017
 
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
Marcelo Perazolo, Lead Software Architect, IBM Corporation - Monitoring a Pow...
 
Openstack_administration
Openstack_administrationOpenstack_administration
Openstack_administration
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
 
Gdg izmir kubernetes
Gdg izmir kubernetesGdg izmir kubernetes
Gdg izmir kubernetes
 

Recently uploaded

FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Monitoring Large-scale Cloud Infrastructures with OpenNebula

  • 1. Monitoring Large-scale Cloud Infrastructures with OpenNebula Simon Boulet OpenNebula Consultant Co-founder of the Cloudnorth.com Project simon@nostalgeek.com
  • 2. Goals 1. Show how to configure OpenNebula to achieve sub-1 minute monitoring interval 2. Demonstrate the use of OpenNebula in large-scale cloud infrastructures 3. Suggest enhancements to OpenNebula performance and monitoring
  • 3. How Big Exactly is Large-scale? How many hosts? 1,000? 2,000? 10,000 VMs?
  • 4. Monitoring in OpenNebula ● Detects when a VM or host changes status (Running, Stopped, etc.) ● Built-in metrics: CPU, memory and network usage ● You can add as many metrics as you like by customizing the driver ● Can be used to perform various tasks (auto scaling, high-availability redeployment, etc.)
  • 5. Don't Expect the Default Configuration to Perform Optimally ● Database: Use the MySQL database backend, not the default SQLite ● Logs: Use the Syslog log system, and disable debug logging (debug_level=1) ● Number of threads: Adjust the number of driver threads (see the -t option in your *MAD config options)
  • 6. Use OpenNebula >= 4.0 Prior versions did monitoring in two phases: 1. The IM Monitor action monitored Hosts 2. The VMM Poll action monitored VMs (100 Hosts + 1,000 VMs) × 4 polls per minute (15-second interval) = 4,400 actions per minute Since OpenNebula 4.0, the IM Monitor action is capable of returning the information of VMs running on the monitored host
  • 7. Monitoring History By default OpenNebula keeps 24h of monitoring history 15-second interval × 24h = 5,760 records per VM Average record size: 4KB 5,760 records × 4KB = 23MB of monitoring history per VM 100 VMs = 2.3GB 10,000 VMs = 230GB Retention is controlled by the HOST_MONITORING_EXPIRATION_TIME and VM_MONITORING_EXPIRATION_TIME config options
  • 8. Monitoring History (continued) ● Reduce history to 30 minutes (1800 seconds) ● Use MySQL MEMORY storage engine for vm_monitoring and host_monitoring tables It's OK to lose monitoring history when MySQL is restarted Most recent monitoring values are stored in VM template Set MySQL max_heap_table_size large enough to hold all your monitoring history
  • 9. Watch your Load Average As of 4.2, the maximum number of simultaneous XML-RPC API connections is limited to 15 Overloaded OpenNebula = Slow XML-RPC API response = API Limit / Timeout ● Reduce load at deployment time by adjusting number of VMs simultaneously deployed by scheduler ● Watch next release (4.4) for XML-RPC API concurrency enhancements
  • 10. Local Caching Nameserver OpenNebula uses DNS names for monitoring hosts (unless you named your hosts using their IP address instead of a name) ● Use a local caching nameserver (such as dnsmasq) to speed up DNS lookups.
  • 11. Beware of SSH Transport Most OpenNebula drivers (KVM, Xen, etc.) use SSH connections to perform actions OK for deploying a new VM, but expensive when doing VM monitoring
  • 12. Meet Ganglia << Ganglia is a scalable distributed system monitor tool for high-performance computing systems such as clusters and grids. >> - Wikipedia OpenNebula has built-in support for Ganglia By default Ganglia and OpenNebula must run on the same machine Set GANGLIA_HOST in /var/lib/one/remotes/im/ganglia.d/ganglia_probe and /var/lib/one/remotes/vmm/kvm/poll_ganglia
  • 14. Ganglia Driver Limitations 1. Currently only 1 Ganglia Collector is supported 2. Need to run a script on each host to export the OpenNebula-specific metric (OPENNEBULA_VMS_INFORMATION) 3. Ganglia has a maximum length of 1392 bytes for string metrics
  • 15. Host sFlow << The Host sFlow agent exports physical and virtual server performance metrics using the sFlow protocol. The agent provides scalable, multi-vendor, multi-OS performance monitoring with minimal impact on the systems being monitored.>> - http://host-sflow.sourceforge.net/ Exports a standard set of hypervisor and VM metrics Official support for Xen, KVM and Hyper-V, but uses Libvirt to gather metrics (and Libvirt has support for LXC, OpenVZ, VMware, etc.)
  • 16. Host sFlow (continued) Source: http://blog.sflow.com/2012/02/ganglia-33-released.html
  • 17. Host sFlow (continued) Sample Metrics Hosts Metrics VMs Metrics Not currently supported in OpenNebula. Contact me if you're interested. vnode_mem_total Hypervisor Total Memory vnode_domains Hypervisor VM Count <VM ID>.vcpu_state VM State (Running, Stopped, etc.) <VM ID>.vmem_util VM Memory Utilization <VM ID>.vdisk_free VM Free Disk Space
  • 18. 4,000 VMs at Sub-1 Minute Interval OpenNebula 4.2 + xml-rpc patch (upcoming in 4.4) Experimental Host sFlow Driver 1 OpenNebula Core (EC2 High-CPU XLarge instance) 1 Sunstone Web Server (EC2 Standard Medium instance) 1 Ganglia Collector (EC2 Standard Medium instance) 100 Hosts (EC2 High-CPU Medium instances) ~40 VMs per Host ~4,000 VMs (OpenVZ) 15 - 60 second monitoring interval
  • 19. 4,000 VMs at Sub-1 Minute Interval
  • 20. 4,000 VMs at Sub-1 Minute Interval
  • 21. 4,000 VMs at Sub-1 Minute Interval
  • 22. Looking Forward There’s room for optimizations ● The command line tools can get very slow when returning very large result sets (but not the API…) ● Distributed driver, for example using ZeroMQ for distributing tasks to multiple workers ● Investigate PoolSQL locks being held for long periods and blocking other threads (discussed in bug #1818) ● Gather metrics about OpenNebula internals: lock wait times, effective monitoring interval, memory footprint, etc. ● Investigate very large Sunstone memory usage
  • 23. Thank you! Questions? “OpenNebula captured my interest for several technical reasons besides the fact that it is truly open. It's architecture is very elegant; it has C++ bones, ruby muscles and bash tendons. It's extensible and understandable. It has no peer as far as I can tell.” Christopher Barry, Infrastructure Engineer, RJMetrics, September 2012 http://opennebula.org/users:testimonials
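
The arithmetic on the "Use OpenNebula >= 4.0" and "Monitoring History" slides can be reproduced with a minimal Python sketch. The function names are hypothetical, and the 4 KB average record size is treated as 4,000 bytes (decimal units) so that the results line up with the rounded figures on the slides. This is only a back-of-the-envelope estimate, not a measurement of a real deployment.

```python
# Rough sizing helpers for OpenNebula monitoring load and history storage.
# All names are illustrative; the 4,000-byte record size is an assumption
# taken from the "Average record size: 4KB" figure on the slides.

RECORD_BYTES = 4_000


def actions_per_minute(hosts, vms, interval_s, poll_vms_separately):
    """Monitoring actions per minute at a given polling interval.

    Before OpenNebula 4.0 every host and every VM was polled separately;
    since 4.0 one host action also returns the VMs running on that host.
    """
    targets = hosts + vms if poll_vms_separately else hosts
    return targets * 60 // interval_s


def records_per_vm(interval_s, retention_s):
    """Number of monitoring records kept per VM for a retention window."""
    return retention_s // interval_s


def history_bytes(vms, interval_s, retention_s, record_bytes=RECORD_BYTES):
    """Approximate monitoring history size in bytes for all VMs."""
    return vms * records_per_vm(interval_s, retention_s) * record_bytes


if __name__ == "__main__":
    day = 24 * 3600
    print(actions_per_minute(100, 1000, 15, poll_vms_separately=True))
    print(actions_per_minute(100, 1000, 15, poll_vms_separately=False))
    print(records_per_vm(15, day))
    print(history_bytes(10_000, 15, day) / 1e9, "GB at 24h retention")
    print(history_bytes(10_000, 15, 1800) / 1e9, "GB at 30min retention")
```

Under these assumptions, cutting retention from 24 hours to 30 minutes reduces the per-VM history from 5,760 records to 120, which gives a starting point for estimating how large MySQL max_heap_table_size would need to be.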