OpenStack in Action 4! Nick Barcet & Julien Danjou - From ceilometer to telemetry not so alarming!

Paris, 5th December 2013: OpenStack in Action 4!, organized by eNovance, brings together members of the OpenStack community.

Published in: Business, Technology

  1. 1. From Ceilometer to Telemetry: Not so alarming! A Julien Danjou & Nick Barcet presentation for OpenStack in Action 4! on 5th December 2013
  2. 2. Speakers •  Nick Barcet, VP Products @ eNovance: co-founded the Ceilometer project at the Folsom summit and led the project through incubation •  Julien Danjou, Ceilometer Lead Dev @ eNovance: has been a core Ceilometer contributor from the outset, taking over the PTL reins for Havana
  3. 3. State of the project •  Officially named OpenStack Telemetry •  Havana is the first integrated release •  Community growth o  Grizzly: 30 contributors, 267 commits o  Havana: 57 contributors, 434 commits
  4. 4. What was done during the Havana cycle?
  5. 5. UDP transport •  Faster, stateless •  Lighter (msgpack encoding) but… •  No delivery guarantee •  Not signed ▶ Use case: gathering metrics for alarms
  6. 6. Improved API •  Group samples by fields when requesting statistics (?groupby[]=user_id) •  Limit the number of items returned (?limit=42) •  Provides links to other resources in the API
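The query parameters on the slide above can be sketched as a small URL builder. This is a minimal illustration, not the client the project ships: the host, port and example paths are assumptions, and the `groupby[]` spelling is copied from the slide as-is.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host and port are placeholders, not from the slides.
BASE_URL = "http://ceilometer.example.com:8777"


def build_url(path, groupby=None, limit=None):
    """Append the query parameters shown on the slide to an API path."""
    params = []
    for field in groupby or []:
        # One "groupby[]" entry per field, as written on the slide.
        params.append(("groupby[]", field))
    if limit is not None:
        params.append(("limit", str(limit)))
    # safe="[]" keeps the brackets readable instead of percent-encoding them.
    query = urlencode(params, safe="[]")
    url = BASE_URL + path
    return f"{url}?{query}" if query else url


# Example paths are assumptions chosen for illustration.
stats_url = build_url("/v2/meters/cpu_util/statistics", groupby=["user_id"])
limited_url = build_url("/v2/resources", limit=42)
```

Keeping the parameters as a list of pairs, rather than a dict, allows several `groupby[]` entries in one request.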
  7. 7. Send your own samples Users or operators can send samples ➔ Leverage the statistics ➔ Usable for alarming POST /v2/meters/mymeter [{ "counter_type": "gauge", "counter_unit": "megabyte", "counter_volume": 142.0, "user_id": "efd87807-12d2-4b38-9c70-5f5c2ac427ff", "project_id": "35b17138-b364-4e6a-a131-8f3099c5be68", "resource_id": "bd9431c1-8d69-4ad3-803a-8d4a6b89fd36", "resource_metadata": { "name1": "value1", "name2": "value2" }, "source": "mypaasplatform", "timestamp": "2013-09-10T20:34:13.711330" }]
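The request on the slide above could be prepared from Python roughly as follows. This is a sketch under assumptions: the base URL and token value are placeholders, authentication through an `X-Auth-Token` header is assumed rather than stated on the slide, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_sample_request(base_url, meter, sample, token):
    """Build (but do not send) the POST request that submits one sample."""
    body = json.dumps([sample]).encode("utf-8")  # the API expects a JSON list
    return urllib.request.Request(
        url=f"{base_url}/v2/meters/{meter}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # assumption: token-based authentication
        },
    )


# Field values copied from the slide.
sample = {
    "counter_type": "gauge",
    "counter_unit": "megabyte",
    "counter_volume": 142.0,
    "user_id": "efd87807-12d2-4b38-9c70-5f5c2ac427ff",
    "project_id": "35b17138-b364-4e6a-a131-8f3099c5be68",
    "resource_id": "bd9431c1-8d69-4ad3-803a-8d4a6b89fd36",
    "resource_metadata": {"name1": "value1", "name2": "value2"},
    "source": "mypaasplatform",
    "timestamp": "2013-09-10T20:34:13.711330",
}

# Hypothetical endpoint and token.
request = build_sample_request(
    "http://ceilometer.example.com:8777", "mymeter", sample, "my-token"
)
# urllib.request.urlopen(request) would send it to a reachable endpoint.
```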
  8. 8. New storage backends
  9. 9. Database TTL Previously: no way to purge data, even though Ceilometer produces a lot of it (gigabytes per day). Now: ceilometer-expirer drops data older than the configured time-to-live.
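The time-to-live rule described above can be illustrated on an in-memory list. This is only a sketch of the idea: the real ceilometer-expirer works against the storage backend, and the function name and sample layout here are hypothetical.

```python
from datetime import datetime, timedelta


def expire_samples(samples, ttl_seconds, now):
    """Return only the samples that are not older than the time-to-live."""
    cutoff = now - timedelta(seconds=ttl_seconds)
    return [s for s in samples if s["timestamp"] >= cutoff]


now = datetime(2013, 12, 5, 12, 0, 0)
samples = [
    {"meter": "cpu_util", "timestamp": now - timedelta(days=40)},  # too old
    {"meter": "cpu_util", "timestamp": now - timedelta(days=2)},   # kept
]
# Keep 30 days of data (an arbitrary value chosen for the example).
kept = expire_samples(samples, ttl_seconds=30 * 24 * 3600, now=now)
```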
  10. 10. Hyper-V ➔  Disk, network and CPU usage
  11. 11. New meters •  API endpoints o  Meter the requests made to API servers (Neutron, Glance, Nova, Swift, etc.) •  Neutron bandwidth o  Meter the bandwidth consumed by each project o  Traffic labeled as configured by the operator (based on source/destination)
  12. 12. Neutron Traffic Labels [Diagram: Internet, VMs and Swift nodes, with traffic labels "Ext", "Compute" and "Object"]
  13. 13. Alarms Regularly watch meter statistics and trigger actions based on threshold crossings.
  14. 14. Alarms architecture [Diagram; components: Ceilometer API (HTTP), Ceilometer alarm evaluator, RPC bus, three Ceilometer alarm notifiers; arrow labels: Trigger; outputs: Webhook, SMS, email…]
  15. 15. Alarm types •  Threshold alarms: triggered once a value crosses a threshold. “Call a Webhook as soon as CPU usage goes above 80%” •  Combination alarms: triggered once all the alarms they combine are triggered. “Call a Webhook as soon as alarm “foo” and alarm “bar” are triggered”
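The two alarm types above can be sketched as pure functions. This is a simplified illustration, not the evaluator's actual code: the function names are hypothetical, and the rule that every period must cross the threshold is an assumption made for the example.

```python
import operator

# Comparison operators keyed by short names such as "gt".
COMPARATORS = {
    "gt": operator.gt, "ge": operator.ge,
    "lt": operator.lt, "le": operator.le,
    "eq": operator.eq, "ne": operator.ne,
}


def evaluate_threshold(values, comparison_operator, threshold):
    """values holds one statistic per evaluation period, oldest first."""
    if not values:
        return "insufficient data"
    compare = COMPARATORS[comparison_operator]
    # Assumption: the alarm fires only if every period crosses the threshold.
    if all(compare(value, threshold) for value in values):
        return "alarm"
    return "ok"


def evaluate_combination(states):
    """A combination alarm fires once all the alarms it combines have fired."""
    if not states or "insufficient data" in states:
        return "insufficient data"
    return "alarm" if all(state == "alarm" for state in states) else "ok"


# "Call a Webhook as soon as CPU usage goes above 80%"
cpu_state = evaluate_threshold([85.0, 90.0], "gt", 80.0)
# "... as soon as alarm foo and alarm bar are triggered"
combined_state = evaluate_combination([cpu_state, "alarm"])
```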
  16. 16. Alarms API POST /v2/alarms { "alarm_actions": [ "http://site:8000/alarm"], "insufficient_data_actions": ["http://site:8000/nodata"], "ok_actions": ["http://site:8000/ok"], "comparison_operator": "gt", "description": "An alarm", "evaluation_periods": 2, "matching_metadata": {"key_name": "key_value"}, "meter_name": "storage.objects", "name": "SwiftObjectAlarm", "period": 240, "statistic": "avg", "threshold": 200.0 } GET /v2/alarms/foobar PUT /v2/alarms/foobar DELETE /v2/alarms/foobar
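In the alarm definition above, "period": 240 and "evaluation_periods": 2 together describe how far back the statistics are examined: 2 × 240 = 480 seconds. The sketch below computes those windows; the helper name is hypothetical, and a real evaluator may add a look-back margin that is not modelled here.

```python
from datetime import datetime, timedelta


def evaluation_windows(now, period, evaluation_periods):
    """Return (start, end) pairs, oldest first, covering the evaluated span."""
    start = now - timedelta(seconds=period * evaluation_periods)
    step = timedelta(seconds=period)
    return [
        (start + i * step, start + (i + 1) * step)
        for i in range(evaluation_periods)
    ]


now = datetime(2013, 12, 5, 12, 0, 0)
# Values taken from the alarm definition on the slide.
windows = evaluation_windows(now, period=240, evaluation_periods=2)
```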
  17. 17. Heat & auto-scaling [Diagram; components: API service, Heat Engine, my_stack (1 instance), Alarm evaluator, Compute Agent, Ceilometer; arrow labels: injects user metadata, creates alarms, monitors instances, triggers alarm]
  18. 18. Heat & auto-scaling [Diagram; components: Heat Engine, my_stack (3 instances), API, Alarms, Compute, Ceilometer alarming; arrow labels: injects user metadata, scales out stack]
  19. 19. Heat & auto-scaling [Diagram; components: Heat Engine, my_stack (5 instances), API, Alarms, Compute, Ceilometer alarming; arrow labels: injects user metadata, scales out stack]
  20. 20. Events storage (Almost) all OpenStack components send notifications on events: let’s store them. ➔  Useful to re-generate samples ➔  Useful to generate new samples we did not think of ➔  Allows double-entry accounting ➔  Auditability Not yet complete, to be continued in Icehouse
  21. 21. Exciting ideas for Icehouse we’re going to hack on.
  22. 22. General improvements •  Split the collector in two logical pieces •  Rely on notification for samples rather than RPC •  Bring SQLAlchemy and MongoDB driver almost on parity •  Support for hardware polling •  Support Ironic
  23. 23. API improvements •  Complex filtering and query DSL (x OR y AND z) •  /v2/samples (a.k.a. /v2/meter without the meter) •  Return rate rather than absolute value •  More statistics functions (rate of change, moving-window averages…) •  Bulk requests
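Two of the statistics functions mentioned above, rate of change and moving-window averages, can be sketched in a few lines. These are generic textbook definitions with hypothetical function names, not the functions planned for the API.

```python
def rates(samples):
    """Turn (timestamp_seconds, cumulative_value) pairs into per-second rates.

    samples must be sorted by timestamp; each rate is attached to the
    timestamp of the later sample of the pair it was computed from.
    """
    result = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed > 0:  # skip duplicate timestamps to avoid dividing by zero
            result.append((t1, (v1 - v0) / elapsed))
    return result


def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# A cumulative counter read every 10 seconds (illustrative numbers).
byte_rates = rates([(0, 0.0), (10, 50.0), (20, 150.0)])
smoothed = moving_average([1.0, 2.0, 3.0, 4.0], window=2)
```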
  24. 24. Alarming •  Exclude low sample counts •  Allow time-constrained alarms
  25. 25. Distributed polling Leveraging Tooz and Taskflow to distribute tasks among workers (agents). ★ Ability to distribute the polling ★ Replace alarm evaluator custom distributor
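One simple way to picture how polling could be split among agents is hash-based partitioning, sketched below. This is an illustrative stand-in only: it does not use the Tooz or Taskflow APIs, it ignores agents joining or leaving, and the function and agent names are hypothetical.

```python
import hashlib


def assigned_agent(resource_id, agents):
    """Pick the agent responsible for a resource from a hash of its id."""
    digest = hashlib.sha256(resource_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(agents)
    # Sorting makes every agent compute the same assignment independently.
    return sorted(agents)[index]


def my_resources(agent, agents, resource_ids):
    """Resources that `agent` should poll, given the full set of agents."""
    return [r for r in resource_ids if assigned_agent(r, agents) == agent]


agents = ["agent-a", "agent-b", "agent-c"]
resources = [f"instance-{i}" for i in range(20)]
shares = {agent: my_resources(agent, agents, resources) for agent in agents}
```

Each resource lands in exactly one share, so no resource is polled twice and none is skipped as long as all agents agree on the membership list.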
  26. 26. OpenStack Telemetry Ceilometer #openstack-ceilometer @ Freenode The end.
  27. 27. Backup slides
  28. 28. Heat & auto-scaling [Diagram; components: my_stack Instance, API service, Meter store, Compute Agent, Alarm evaluator, Ceilometer, Heat Engine; arrow labels: queries stats, reports samples, provides alarm rules]
