
Metrics 2.0 & Graph-Explorer


  1. 1. Metrics 2.0 & Graph-Explorer
  2. 2.     Credit: user niteroi @ panoramio.com
  3. 3.     vimeo.com/43800150
  6. 6. “Dieter” ?    
  7. 7. Peter → Deter    
  8. 8. what.is.a.metric    
  9. 9. stats.timers.dfs5.proxy-server.object.GET.200.timing.upper_90
  10. 10. O(X*Y*Z), where X = # apps, Y = # people, Z = # aggregators
  11. 11. stats.timers.dfs5.proxy-server.object.GET.200.timing.upper_90
      {
          "server": "dfvimeodfsproxy5",
          "http_method": "GET",
          "http_code": "200",
          "unit": "ms",
          "target_type": "gauge",
          "stat": "upper_90",
          "swift_type": "object",
          "plugin": "swift_proxy_server"
      }
      https://github.com/vimeo/graph-explorer/wiki
  12. 12. ● b: bit
      ● B: byte
      ● Err: errors
      ● Warn: warnings
      ● Conn: connections
      ● Event: events (TCP events etc.)
      ● Ino: inodes
      ● Jiff: jiffies (i.e. for CPU usage)
      ● Job: job (as in job queue)
      ● File: file (not 'F', that's farad)
      ● Load: CPU load
      ● Metric: a metric line like in the statsd or graphite protocol
      ● Msg: message (like in message queues)
      ● Page: page (as in memory segment)
      ● Pckt: network packet
      ● Process
      ● Req: HTTP requests, database queries, etc.
      ● Sock: sockets
      ● Thread
      ● Ticket: upload tickets, Kerberos tickets, ..
  14. 14. Carbon-tagger: ... service=foo.instance=host.target_type=gauge.type=calculation.unit=B 123 1234567890 …
      Statsdaemon: ..unit=B..unit=B... → unit=B/s
                   ..unit=ms..unit=ms.. → unit=ms stat=mean
  17. 17. Graph-Explorer queries 101
      Proxy-server swift server:regex upper_90 unit=ms from <datetime> to <datetime> avg over <timespec>
  22. 22. Stack .. http_method:(PUT|GET)  swift_type=object avg by http_code,server    
  24. 24. transcode unit=jobs/s avg over <time> from  <datetime> to <datetime>    
  25. 25.     Note: data is obfuscated
  26. 26. !queue sum by zone:ap-southeast|eu-west|us-east|us-west|sa-east|vimeo-df|vimeo-lv group by state
  27. 27.     Note: data is obfuscated
  28. 28. Group by zone    
  29. 29.     Note: data is obfuscated
  30. 30. {
          server=dfvimeodfs1
          plugin=diskspace
          mountpoint=_srv_node_dfs5
          unit=B
          type=used
          target_type=gauge
      }
  31. 31. server:dfvimeodfs unit=GB type=free srv node    
  33. 33. unit=GB/d group by mountpoint    
  39. 39. unit=Mb/s network dfvimeorpc sum by server    
  41. 41. unit=MB    
  44. 44. Dashboard definition
      queries = [
          'cpu usage sum by core',
          'mem unit=B !total group by type:swap',
          'stack network unit=b/s',
          'unit=B (free|used) group by =mountpoint'
      ]
  45. 45. Conclusion
      ● Changing information needs (esp. for troubleshooting)
      ● Complicated information needs
      → changing & complicated graphs & alerts → PAIN
      ● Self-describing metrics
      ● Standardized metrics
      ● Native metrics 2.0
      ● Structuring metrics → BREEZE
  46. 46. Conclusion
      ● Metrics can be a lot more useful
      ● Feedback
      ● Graph-Explorer, carbon-tagger, statsdaemon, ...
      ● Standardisation & native metrics 2.0 ?
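
The tagged metric line on slide 14 and the unit changes shown there for statsdaemon can be sketched roughly as follows. This is a minimal illustration, not the actual carbon-tagger or statsdaemon code: the function names `parse_tagged_metric`, `derive_rate_tags` and `derive_timer_tags` are hypothetical, and the parser assumes that tag values contain neither dots nor spaces.

```python
def parse_tagged_metric(line):
    """Split '<key=value.key=value...> <value> <timestamp>' into its parts.

    Assumption: tags are separated by '.', and no tag value contains a dot.
    """
    metric_id, value, timestamp = line.split()
    tags = {}
    for node in metric_id.split("."):
        key, _, val = node.partition("=")
        tags[key] = val
    return tags, float(value), int(timestamp)


def derive_rate_tags(tags):
    """Per-second aggregation of a count: unit=B becomes unit=B/s."""
    derived = dict(tags)
    derived["unit"] = tags["unit"] + "/s"
    return derived


def derive_timer_tags(tags, stat):
    """Timer aggregation: the unit stays the same, a 'stat' tag is added."""
    derived = dict(tags)
    derived["stat"] = stat
    return derived


line = ("service=foo.instance=host.target_type=gauge"
        ".type=calculation.unit=B 123 1234567890")
tags, value, timestamp = parse_tagged_metric(line)
print(tags["unit"], value, timestamp)            # B 123.0 1234567890
print(derive_rate_tags(tags)["unit"])            # B/s
print(derive_timer_tags({"unit": "ms"}, "mean"))  # {'unit': 'ms', 'stat': 'mean'}
```

Because the unit travels with the metric as a tag, an aggregator can update it mechanically instead of relying on a naming convention in the metric path.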

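
The filter part of the queries on the slides (`key=value`, `key:regex`, `!term`, and bare patterns) can be approximated with a small matcher over tag dictionaries. This is a simplified sketch under assumed semantics, not Graph-Explorer's implementation: the slides do not spell out the exact matching rules, `term_matches` and `select` are hypothetical names, and clauses such as `group by`, `sum by`, `avg over` and time ranges are left out entirely.

```python
import re

# A term is 'key=value' (exact match), 'key:regex' (regex on that tag's
# value), or a bare pattern; a leading '!' negates it. Assumed semantics.
_TERM = re.compile(r"(\w+)([=:])(.*)")


def term_matches(term, tags):
    negate = term.startswith("!")
    if negate:
        term = term[1:]
    parsed = _TERM.fullmatch(term)
    if parsed is None:
        # Bare pattern: search it in every 'key=value' pair.
        hit = any(re.search(term, f"{k}={v}") for k, v in tags.items())
    else:
        key, sep, wanted = parsed.groups()
        if sep == "=":
            hit = tags.get(key) == wanted
        else:
            hit = key in tags and re.search(wanted, tags[key]) is not None
    return hit != negate


def select(metrics, query):
    """Return the tag sets that satisfy every whitespace-separated term."""
    terms = query.split()
    return [tags for tags in metrics
            if all(term_matches(term, tags) for term in terms)]


# Tag sets taken from slides 11 and 30.
example_metrics = [
    {"server": "dfvimeodfsproxy5", "http_method": "GET", "http_code": "200",
     "unit": "ms", "target_type": "gauge", "stat": "upper_90",
     "swift_type": "object", "plugin": "swift_proxy_server"},
    {"server": "dfvimeodfs1", "plugin": "diskspace",
     "mountpoint": "_srv_node_dfs5", "unit": "B", "type": "used",
     "target_type": "gauge"},
]

for tags in select(example_metrics, "http_method:(PUT|GET) swift_type=object"):
    print(tags["server"])  # dfvimeodfsproxy5
```

Since every metric carries its own tags, one query selects across servers and plugins without knowing how the metric paths are laid out.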