One of the great challenges of monitoring any large cluster is deciding how much data to collect and how often to collect it. Those responsible for managing the cloud infrastructure want everything collected centrally, which places limits on both volume and frequency. Developers, on the other hand, want as much detail as possible, at as high a frequency as is reasonable, without impacting overall cloud performance.
To address these seemingly conflicting requirements, we've chosen a hybrid model at HP. Like many others, we have a centralized monitoring system that records a set of key system metrics for all servers at one-minute granularity. At the same time, we do fine-grained local monitoring of hundreds of metrics every second on each server, so when a problem needs more detail than is available centrally, one can go to the servers in question and see exactly what was going on at any specific time.
The tool of choice for this fine-grained monitoring is the open source tool collectl, which also has an extensible API. Through this API we've developed a swift monitoring capability that not only captures the number of GETs, PUTs, etc. every second but, using collectl's colmux utility, can also display them in a top-like format to show exactly what all the object and/or proxy servers are doing in real time.
We've also developed a second capability that lets one see what the virtual machines on each compute node are doing in terms of CPU, disk and network traffic. This data can also be displayed in real time with colmux.
This talk will briefly introduce the audience to collectl's capabilities but more importantly show how it's used to augment any existing centralized monitoring infrastructure.
Speakers
Mark Seger
1. Fine-grained Monitoring at HP
Mark Seger
Hewlett Packard
Cloud Services
4/19/2013 Fine-grained Monitoring
2. Agenda
• What is the problem we’re trying to solve
• Introduction to collectl
• Monitoring Swift & Glance
• Monitoring VMs
3. Conflicting Problem Statements
• (Ks of nodes) X (hundreds of metrics)
– And you want to centrally monitor them how often?
– And store them in a database for future mining?
– Politely pause for laughter…
• Reality
– Choose a frequency on the order of a minute or more
– Don’t collect all data at the same time
– Don’t collect everything
• BUT when problems arise
– Granularity measured in minutes is too coarse
– If samples aren’t taken at the same time, how do you correlate?
– There’s almost never enough detailed data to provide answers
4. Isn’t the solution obvious?
• Use one central tool and another local tool
• But this has its own problems as well
– Will the data cross-correlate? Hopefully…
– What about customizations?
– Do you really want to collect the same data twice?
• At HP we’ve chosen a hybrid model
– Use a lightweight local data collection tool, accepting some redundancy
• Collect cpu, disk, net & mem every second; key processes every 5
• Extend it to add OpenStack monitoring capabilities
• Send subset/updates to centralized monitor every minute or more
– Central tool: collectd; local tool: collectl (no relation!)
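The lightweight local sampling described above can be sketched in miniature. This is not collectl itself (which is written in Perl); it is just an illustration of the /proc/stat arithmetic any one-second CPU sampler performs. The field layout (user, nice, system, idle, iowait, ...) follows the kernel's documented /proc/stat format.

```python
# Sketch: compute percent-busy from two successive /proc/stat "cpu" samples,
# the way a lightweight per-second collector would.

def parse_cpu_line(line):
    """Return the jiffy counters from a /proc/stat 'cpu' line."""
    fields = line.split()
    assert fields[0] == "cpu"
    return [int(f) for f in fields[1:]]

def percent_busy(prev, curr):
    """Percent of jiffies spent non-idle between two samples.

    Idle time is fields 4 and 5 (idle + iowait) in the /proc/stat layout.
    """
    deltas = [c - p for p, c in zip(prev, curr)]
    total = sum(deltas)
    idle = deltas[3] + deltas[4]
    return 100.0 * (total - idle) / total if total else 0.0

prev = parse_cpu_line("cpu 100 0 50 800 50 0 0 0")
curr = parse_cpu_line("cpu 150 0 75 850 75 0 0 0")
print(round(percent_busy(prev, curr), 1))  # 50.0
```

In a real sampler this pair of reads happens once per second, with the delta logged locally and only a per-minute subset forwarded centrally.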
5. Introduction to collectl
• Open source tool, in use for many years
• Roots in HPC, so knows how to be efficient
– Think of it as SAR on steroids
– Can monitor at sub-second intervals when needed
– Synchronizes samples across the cluster to microseconds for correlation
– Can generate data in plottable formats
– Can record data locally and send over a socket
• And do it at different frequencies!
• Can also write to a snapshot file which is what we’re doing
– Has an API for extensibility
– Several utilities for plotting and real-time cluster monitoring
8. Let’s talk about swift/glance monitoring
• The real question is what metrics do they expose?
– Track GETs, PUTs, etc
– Include object sizes and timings
– Also provides error codes/text
• Tail swift logs and write rolling counters every second
– Operation types
– Object size and network bandwidth histograms (though bandwidth can be misleading)
• Also generate hourly/daily summaries, retaining 1 week’s worth
• Same utility also knows how to parse glance logs
– Separates tracking of metadata operations
cat /var/log/perf/ops/opscount.txt
get: 29452211 put: 84775192 del: 12433208 post: 65666 pat: 0 head: 28489510 e4xx: 174473 e500: 4774
get 25679667 547036 58147 28824 234 8839362141 23961566 717292 124028 95240 69479 35804 12355 5796 6814 8794
put 49048442 489078 123922 41085 715 12633635902 36213403 120714 10620 3815 4293 3666 6570 4030 188 0
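The rolling-counter idea above can be sketched as follows. The two-field log lines here are invented for illustration (real swift proxy logs carry many more fields, including timings and object sizes), and the e4xx/e500 buckets mirror the opscount snapshot shown above.

```python
# Sketch: tally operations from tailed swift proxy log lines into the kind
# of per-verb / per-error counters written out to opscount.txt every second.

from collections import Counter

def count_ops(lines):
    """Classify each request by verb, and bucket 4xx/5xx error statuses."""
    counts = Counter()
    for line in lines:
        verb, status = line.split()[0], line.split()[1]
        counts[verb.lower()] += 1
        if status.startswith("4"):
            counts["e4xx"] += 1
        elif status.startswith("5"):
            counts["e500"] += 1
    return counts

sample = ["GET 200", "PUT 201", "GET 404", "HEAD 200", "PUT 500"]
print(dict(count_ops(sample)))
```

The real utility keeps these counters rolling every second and additionally bins object sizes and bandwidth into the histogram rows shown above.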
9. Collectl plugins
• Use collectl’s import API
– Read opscount file every monitoring interval logging to disk
– Can also be used interactively
– Most importantly, supported by collectl utilities
• Use collectl’s export API to write to local file every minute
– Aligned to top-of-minute to avoid RRD messing with the data
• Use collectd’s putval capability to upload when the file changes
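The handoff to collectd can be sketched with its plain-text protocol: the PUTVAL command, accepted on collectd's unixsock socket, carries a host/plugin/type identifier, an interval, and a timestamp:value pair. The plugin and type-instance names below (swiftops, gauge-gets) are illustrative, not the actual names used.

```python
# Sketch: format one PUTVAL command for collectd's plain-text protocol.
# Identifier layout is host/plugin/type[-instance], per collectd convention.

import time

def putval_line(host, plugin, type_, value, interval=60, when=None):
    """Build a PUTVAL line suitable for collectd's unixsock plugin."""
    when = int(when if when is not None else time.time())
    return f"PUTVAL {host}/{plugin}/{type_} interval={interval} {when}:{value}"

print(putval_line("proxy001", "swiftops", "gauge-gets", 29452211,
                  when=1366400000))
```

Aligning the writes to the top of the minute, as the slide notes, keeps RRD consolidation from smearing the samples.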
11. What about KVM?
• Uses several collectl plugins
– Collectl tells us the command line used to start each process
• Parses line for instance ID and mac address
• Another collectl plugin tells us the mac -> vnet name mapping
– Collectl also tracks I/O for each process
– Another to monitor our block storage service
– Can use nova-manage to look up user info by instance
• This data also sent to collectd once/minute
segerm@nv-aw2az1-compute0004:~$ sudo collectl --import vnet:bockc --export kvmsum
# PROCESS SUMMARY (counters are /sec)
# PID THRD S SysT UsrT Pct N AccumTim BckI BckO DskI DskO NetI NetO Instance UserID BockServer(s)
13273 6 S 0.01 0.05 1 4 15:39:28 0 0 0 12 0 0 000d5cc3 31689020408812
18093 14 S 0.02 0.26 7 4 41:15:00 0 0 0 0 0 0 000e33d7 29387151913164
19517 1 S 0.00 0.01 1 1 01:39:54 0 0 0 0 0 0 000e23e1 84575604886783
30287 1 S 0.00 0.00 0 1 05:30:11 0 0 0 0 0 0 000bf753 22420103441357 10.8.14.129
30739 2 S 0.00 0.00 0 2 12:18:42 0 0 0 0 0 0 00061147 30248174159870
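The command-line parsing mentioned above can be sketched like this. The qemu-kvm argument shapes assumed here (`-name instance-XXXX`, a `mac=` netdev parameter) are typical of nova-launched guests, but treat the exact patterns as an assumption rather than the plugin's actual regexes.

```python
# Sketch: extract the instance ID and MAC address from a KVM process's
# command line, as the plugin does before mapping mac -> vnet name.

import re

def parse_kvm_cmdline(cmdline):
    """Return (instance_id, mac) or None for whichever is absent."""
    instance = re.search(r"-name\s+instance-([0-9a-f]+)", cmdline)
    mac = re.search(r"mac=([0-9a-f:]{17})", cmdline)
    return (instance.group(1) if instance else None,
            mac.group(1) if mac else None)

cmd = "/usr/bin/kvm -name instance-000d5cc3 -netdev tap mac=fa:16:3e:12:34:56"
print(parse_kvm_cmdline(cmd))
```

With the MAC in hand, a second lookup against the mac-to-vnet mapping ties the per-vnet network counters back to a specific instance and, via nova-manage, to a user.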
12. Using the cloud to monitor the cloud
• Each morning slightly after midnight
– Ask each node to generate a summary of yesterday
– Do parallel copy to pull back to central node
– Also do parallel copy of plottable data
– Generate a set of 24 hour plots in batch, slow but worth it
• Investigating parallelizing some of this too
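The parallel pull-back step can be sketched with a thread pool. The `pull_summary` body here is a stub standing in for the actual copy (e.g. scp or rsync of each node's daily summary); only the fan-out logic is shown, so the example stays self-contained.

```python
# Sketch: pull yesterday's summary from every node in parallel.
# pull_summary() is a stand-in for the real scp/rsync of the summary file.

from concurrent.futures import ThreadPoolExecutor

def pull_summary(node):
    # Stand-in for the actual remote copy from this node.
    return f"{node}: ok"

def pull_all(nodes, workers=8):
    """Fan the copies out across a thread pool; I/O-bound, so threads suffice."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(pull_summary, nodes))

nodes = [f"compute{n:04d}" for n in range(1, 5)]
print(pull_all(nodes))
```

Because the copies are network-bound, a modest thread pool gets most of the available speedup without loading the central node.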
13. Currently, a very crude prototype
Daily numbers
14. Error Counts and Bandwidth Too
19. Operations and PUT histograms
Note – the sample intervals are 1 second and the plots are only 10KB each
20. Collectl Multiplexor: colmux
• Think of it as collectl top-anything
– Including any plugins
– Runs collectl in real-time against set of nodes
– Sorts by any column and can dynamically change with arrow keys
– 2 different output formats
• Can also playback historical data for diagnostic analysis
# OPS SUMMARY (/sec) Fri Mar 22 15:45:05 2013 Connected: 14 of 14
# <--------------------operations-------------------->
#Host GetKB PutKB Gets Puts Dels Post Pats Head E4xx E500
sw-aw2az1-proxy014 0 224 0 4 0 0 0 0 0 0
sw-aw2az1-proxy010 0 166 0 3 0 0 0 0 0 0
sw-aw2az1-proxy009 0 100 0 2 0 0 0 0 0 0
sw-aw2az1-proxy005 0 77 0 2 0 0 0 0 0 0
Time 001 003 004 005 007 008 | 001 003 004 005 007 008 | Get Put
16:02:45 0 0 0 1 0 0 | 0 23 0 0 0 0 | 1 23
16:02:46 0 0 0 0 0 0 | 0 18 0 0 0 0 | 0 18
16:02:47 0 0 0 0 0 0 | 0 23 3 0 3 1 | 0 30
16:02:48 0 0 0 0 0 0 | 1 19 15 2 0 0 | 0 37
16:02:49 0 0 0 0 0 0 | 1 10 19 0 2 1 | 0 33
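The sort-by-column behavior in the OPS SUMMARY above can be sketched in a few lines. The host names and values are taken from the sample output; the tuple layout is an illustration, not colmux's internal representation.

```python
# Sketch: order per-host rows by a chosen column, highest first, the way
# colmux re-sorts its top-like display when you select a column.

rows = [
    ("sw-aw2az1-proxy009", 0, 100),
    ("sw-aw2az1-proxy014", 0, 224),
    ("sw-aw2az1-proxy010", 0, 166),
]

def sort_by_column(rows, col):
    """Return rows ordered by the selected column, descending."""
    return sorted(rows, key=lambda r: r[col], reverse=True)

for host, getkb, putkb in sort_by_column(rows, col=2):
    print(f"{host:20s} {getkb:6d} {putkb:6d}")
```

In colmux the sort column can be changed on the fly with the arrow keys, re-ordering the live display each interval.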
21. Monitoring 192 nodes
Idle Nodes CPU Burst Very Busy Erratic
You don’t even have to be able to read the output to see what’s happening