OpenStack monitoring - Unidata S.p.A. Case Report

Published in: Technology, Design

  1. 1. OpenStack and Monitoring Unidata S.p.A. case report Davide Guerri - Unidata S.p.A. - d.guerri@unidata.it
  2. 2. Agenda • What is Unidata S.p.A.? • (Cloud) monitoring • OpenStack Monitoring • Unidata case report
  3. 3. Unidata S.p.A. • established in 1985 • pioneer of microcomputer technology in Italy • today one of the most important ISPs • PoP at NaMeX, MiX, AMS-IX • large fiber infrastructure (Rome and province of Rome) • a large number of WiFi installations (based on the OpenWISP project), also for the Italian PA • institutional partners • AIIP - the first Italian ISP association - founder and member, 1995 • NaMeX - Internet exchange and interconnection point - founder and member, 1995 • strong vocation for innovation (making significant investments in R&D)
  4. 4. Unidata S.p.A. • since 2012 - public and private cloud services • UniCloud [3] - yep, it’s OpenStack! ;-) • Folsom release • Full access to OpenStack API (SSL) • IPv6 enabled
  5. 5. Cloud Monitoring
  6. 6. Puppies vs Cattle
  7. 7. Puppies vs Cattle • (crude) analogy that describes the most appropriate use of the cloud paradigm • “The servers in today’s data center are like puppies – they’ve got names and when they get sick, everything grinds to a halt while you nurse them back to health” -- Joshua McKenty, co-founder of Piston Cloud • treat servers like cattle • a single server should be easily replaceable • it should be possible to (seamlessly) increase or decrease their number for a given application
  8. 8. Puppies vs Cattle • ...not only for VMs... • it also makes sense for bare metal • this also changes something for monitoring, doesn’t it?
  9. 9. Cloud Monitoring • for cloud monitoring we’ve got two points of view • operators • infrastructural monitoring • end users • cloud infrastructural resources (IaaS) monitoring (e.g. cloud servers monitoring) • cloud services monitoring (SaaS/PaaS)
  10. 10. Cloud Monitoring • in both cases: what to monitor? and for what purpose? • availability - to fix anomalies proactively • efficiency - for (proactive) capacity planning • what is needed? • alerting systems • instantaneous measurements • historical data
  11. 11. OpenStack Monitoring
  12. 12. OpenStack Monitoring • as of today (Grizzly release) there is no integrated and ready-to-use monitoring system [1] • what about Ceilometer? • general purpose measurement collector
  13. 13. OpenStack Monitoring • Healthnmon (uses Ceilometer) [2] • inventory management • alerts and notifications • utilization data (CPU, RAM, network, storage) for guests and hosts
  14. 14. ...meanwhile... • those who already offer cloud services based on OpenStack had to develop (semi-) ad-hoc solutions • OpenStack is massively scalable... • ...so the monitoring system should be scalable too • the good news is that we have all the ingredients • and they are free and open source ;-)
  15. 15. What to monitor? • load average/CPUs/RAM/swap/disk & network usage • alerts based on absolute (and relative) thresholds • health of storage resources • log analysis • system integrity checks
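The slide above mentions alerts based on absolute and relative thresholds. A minimal sketch of that idea follows; the `Threshold` class, the `evaluate` function and the example limits are hypothetical names chosen for illustration, not part of the Unidata setup. An absolute threshold fires when a value exceeds a fixed limit, a relative one when the value grows by more than a given fraction between two samples.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Threshold:
    """Hypothetical alert rule for a single metric."""
    metric: str
    absolute_max: Optional[float] = None  # alert when the value exceeds this limit
    relative_max: Optional[float] = None  # alert when growth exceeds this fraction


def evaluate(threshold: Threshold, previous: float, current: float) -> List[str]:
    """Return one alert message per violated rule (empty list if none)."""
    alerts = []
    if threshold.absolute_max is not None and current > threshold.absolute_max:
        alerts.append(
            f"{threshold.metric}: {current} exceeds absolute limit "
            f"{threshold.absolute_max}"
        )
    # Relative growth is undefined when the previous sample is zero, so skip it.
    if threshold.relative_max is not None and previous > 0:
        growth = (current - previous) / previous
        if growth > threshold.relative_max:
            alerts.append(
                f"{threshold.metric}: grew by {growth:.0%}, more than "
                f"{threshold.relative_max:.0%}"
            )
    return alerts


if __name__ == "__main__":
    rule = Threshold("load1", absolute_max=8.0, relative_max=0.5)
    for message in evaluate(rule, previous=6.0, current=10.0):
        print(message)
```

In practice a monitoring tool such as the ones named later in the deck would own this logic; the sketch only shows what the two kinds of threshold mean.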
  16. 16. What to monitor? • OpenStack specific • services availability and logs of the following • nova-* • glance-* • cinder-* • keystone • horizon • misc (dnsmasq, swift, rabbitmq)
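One simple way to check the availability of the API services listed above is a TCP connection test. The sketch below assumes the usual default API ports (Keystone 5000, Glance 9292, Nova 8774, Cinder 8776); a given deployment may use different ones, and the function names are hypothetical. A successful connection only shows that something is listening, not that the service is healthy, and daemons without a listening port (for example the compute and scheduler workers) need a different check.

```python
import socket
from typing import Dict

# Assumed default API ports; adjust to the actual deployment.
DEFAULT_API_PORTS = {
    "keystone": 5000,
    "glance-api": 9292,
    "nova-api": 8774,
    "cinder-api": 8776,
}


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def check_services(host: str, ports: Dict[str, int] = DEFAULT_API_PORTS) -> Dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, up in check_services("127.0.0.1").items():
        print(f"{name}: {'up' if up else 'DOWN'}")
```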
  17. 17. Unidata S.p.A. case report
  18. 18. UniCloud • UniCloud logical architecture - public cloud infrastructure
  19. 19. Monitoring - Operator p.o.v
  20. 20. UniCloud Monitoring • Zenoss Core [4], for infrastructural monitoring • open source (GPLv2) • SNMP and network protocol monitoring of applications, servers and network devices • auto-discovery / auto-modeling • crucial for automation (puppies vs cattle) • just add the SNMP agent to the configuration of new nodes (e.g. with Puppet)
  21. 21. UniCloud - Zenoss Core • Web UI with events and infrastructure summary • historical data browsing • customizable reports • real-time email or user-defined alerts • simple integration with an SMS gateway
  22. 22. UniCloud Monitoring • OpenStack/system logs • swatch - email alerts for errors/anomalies • logwatch - daily system status review • system integrity (and security) • smartmontools - health of hard drives with email notifications • rkhunter - daily system status analysis and alerting when needed • arpwatch - real-time ARP monitoring (detection of duplicate IPs)
  23. 23. Monitoring - User p.o.v
  24. 24. UniCloud Monitoring • ad hoc monitoring system based on • OpenStack API • Collectd [5] • collects, transfers and stores performance data of computers and network equipment • modular architecture • we used RRD, LibVirt, and network plugins • free and open source (GPLv2) • we wrote a patch for the LibVirt plugin - included since version 5.2 [6]
  25. 25. UniCloud Monitoring • Front-end • web UI in Ruby on Rails (written from scratch) • OpenStack ActiveResource - Ruby binding for OpenStack API by Unidata S.p.A. [7]
  26. 26. UniCloud Monitoring • hypervisors • acquire “raw” data from LibVirt (localhost) • send structured data to the collector • collector • receives data from the network • (efficiently) writes RRD files • RoR application • establishes a mapping between OpenStack cloud instances and RRD files (via API) • renders performance graphs to fulfill user requests (instances and timespans)
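The mapping between cloud instances and RRD files described above can be sketched as a directory lookup. This is not Unidata's Rails code: it is a hypothetical Python illustration that assumes the collector stores files as `<data_dir>/<host>/<plugin>/<type>-<type_instance>.rrd` (the usual collectd rrdtool layout) and that the hypervisors report each guest under its instance UUID, so that the ID returned by the OpenStack API can be used directly as the directory name.

```python
from pathlib import Path
from typing import Dict, List


def rrd_files_for_instance(data_dir: str, instance_uuid: str) -> Dict[str, List[Path]]:
    """Group the RRD files of one instance by collectd data type.

    Assumes one directory per guest, named after the instance UUID.
    """
    instance_dir = Path(data_dir) / instance_uuid
    grouped: Dict[str, List[Path]] = {}
    if not instance_dir.is_dir():
        return grouped  # unknown instance, or no data collected yet
    for rrd in sorted(instance_dir.rglob("*.rrd")):
        # File names look like "<type>-<type_instance>.rrd" or "<type>.rrd".
        data_type = rrd.stem.split("-", 1)[0]
        grouped.setdefault(data_type, []).append(rrd)
    return grouped


if __name__ == "__main__":
    # Hypothetical data directory and instance ID, for illustration only.
    files = rrd_files_for_instance("/var/lib/collectd/rrd", "example-instance-uuid")
    for data_type, paths in files.items():
        print(data_type, [p.name for p in paths])
```

A front-end would then hand the selected files and the requested timespan to an RRD graphing tool.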
  27. 27. UniCloud Monitoring • What gets monitored? • all the measurements that the collectd LibVirt plugin makes available • for each vCPU - utilization rate (%) • for each network interface - pps, bps and eps (in+out) • for each disk - bps and ops (read+write) • including “extra volumes” from nova-volume (or cinder)
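The rates listed above are derived from cumulative counters. collectd and RRD do this derivation themselves; the sketch below only illustrates the arithmetic, with hypothetical function names. It assumes vCPU time is reported as a cumulative counter in nanoseconds (as libvirt does) and that traffic and disk counters are cumulative totals.

```python
def vcpu_utilization_percent(cpu_time_ns_prev: int, cpu_time_ns_curr: int,
                             interval_seconds: float) -> float:
    """Utilization of one vCPU between two samples, in percent."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    used_seconds = (cpu_time_ns_curr - cpu_time_ns_prev) / 1e9
    return 100.0 * used_seconds / interval_seconds


def counter_rate(prev: int, curr: int, interval_seconds: float) -> float:
    """Per-second rate of a cumulative counter (packets, bytes, errors, operations)."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (curr - prev) / interval_seconds


if __name__ == "__main__":
    # A vCPU that consumed 5 s of CPU time during a 10 s interval.
    print(vcpu_utilization_percent(0, 5_000_000_000, 10))
    # 1000 bytes transferred in 10 s; multiply by 8 for bits per second.
    print(counter_rate(1000, 2000, 10) * 8)
```

A production version would also handle counter resets (for example after a guest reboot), where the current value is lower than the previous one.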
  28. 28. UniCloud Monitoring • Does it scale? • collectd is not a new product... • it has proven itself to be very reliable and scalable • it’s possible to use multiple collectors • for HA (using multicast) or LB • puppies vs cattle? • automatic discovery of new cloud instances • collectd installation and configuration should be handled by a configuration management system (e.g. Puppet)
  29. 29. UniCloud Monitoring Collectd configuration example (/etc/collectd/collectd.conf) Collector Hypervisors
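The configuration shown on this slide is not part of the transcript. As a rough sketch of what a configuration management template might generate for the two roles, the following Python function renders a minimal `collectd.conf` for a hypervisor (libvirt plus network plugin, sending to the collector) and for a collector (network plugin listening, rrdtool plugin writing files). The host name, data directory and function name are assumptions for illustration, and this is not the configuration Unidata actually used; 25826 is collectd's default network port.

```python
def render_collectd_conf(role: str,
                         collector_host: str = "collector.example.org",
                         port: int = 25826,
                         data_dir: str = "/var/lib/collectd/rrd") -> str:
    """Render a minimal collectd.conf for the given role (illustrative only)."""
    if role == "hypervisor":
        lines = [
            "LoadPlugin libvirt",
            "LoadPlugin network",
            "<Plugin libvirt>",
            '  Connection "qemu:///system"',  # assumes a local KVM/QEMU hypervisor
            "  HostnameFormat uuid",          # report each guest under its UUID
            "</Plugin>",
            "<Plugin network>",
            f'  Server "{collector_host}" "{port}"',
            "</Plugin>",
        ]
    elif role == "collector":
        lines = [
            "LoadPlugin network",
            "LoadPlugin rrdtool",
            "<Plugin network>",
            f'  Listen "0.0.0.0" "{port}"',
            "</Plugin>",
            "<Plugin rrdtool>",
            f'  DataDir "{data_dir}"',
            "</Plugin>",
        ]
    else:
        raise ValueError(f"unknown role: {role!r}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_collectd_conf("hypervisor"))
    print(render_collectd_conf("collector"))
```

Generating the file from the node's role is what lets new hypervisors join the monitoring system without manual steps, in line with the "cattle" approach.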
  30. 30. Some screenshots
  31. 31. Some screenshots
  32. 32. Some screenshots
  33. 33. Thank you for your attention • Questions?
  34. 34. References • [1] OpenStack official programs https://wiki.openstack.org/wiki/Programs • [2] Ceilometer and Healthnmon https://wiki.openstack.org/wiki/Ceilometer/CeilometerAndHealthnmon • [3] UniCloud http://unicloud.it • [4] Zenoss http://zenoss.com • [5] Collectd http://collectd.org • [6] Collectd 5.2 changelog https://collectd.org/wiki/index.php/Version_5.2 • [7] OpenStack ActiveResource https://github.com/Unidata-SpA/openstack_activeresource
