The Gnocchi Experiment

A high-level, technical overview of how Gnocchi stores and handles time series data.

  1. The Gnocchi Experiment: playing with timeseries
  2. History
     ● Ceilometer started in 2012
       ○ Original mission: provide an infrastructure to collect any information needed regarding OpenStack projects
     ● Added alarming in 2013
       ○ Create rules based on threshold conditions that trigger an action when broken
     ● Added events in 2014
       ○ The state of an object in an OpenStack service at a point in time
     ● New mission
       ○ To reliably collect data on the utilization of the physical and virtual resources comprising deployed clouds, persist these data for subsequent retrieval and analysis, and trigger actions when defined criteria are met
  3. Ceilometer Architecture
     [diagram: OpenStack Services, Notification Bus, Notification Agents (Agent1 … AgentN) with Pipeline, Polling Agents (Agent1 … AgentN) with Pipeline, Collectors (Collector1 … CollectorN), Databases (Alarms, Events, Meters), AlarmEvaluator, AlarmNotifier, API, External Systems]
  4. This didn’t work.
  5. Growing pains
     ● Too large a scope - we did everything
     ● Too complex - must deploy everything
     ● Too much data - all data in one place
     ● Too few resources - handful of developers
     ● Too generic a solution - storage designed to handle any scenario
     ● Good at nothing, average/bad at everything
  6. Ceilometer Architecture
     [diagram, with components now grouped under Ceilometer, Gnocchi, Aodh, and Panko: OpenStack Services, Notification Bus, Notification Agents (Agent1 … AgentN) with Pipeline, Polling Agents (Agent1 … AgentN), Collectors (Collector1 … CollectorN), Metrics, MetricsAPI, Alarms, AlarmEvaluator, AlarmNotifier, Events, EventsAPI, External Systems]
  7. Componentisation
     ● Split functionality into own projects
       ○ Faster rate of change
       ○ Less expertise
     ● Important functionality lives
     ● Ceilometer - data gathering and transformation service
     ● Gnocchi - time series storage service
     ● Aodh - alarming service
     ● Panko - event focused storage service
     ● They all work together and separately
  8. Gnocchi
  9. Gnocchi use cases
     ● Storage brick for a billing system
     ● Alarm-triggering or monitoring system
     ● Statistical usage of data
  10. Ceilometer to Gnocchi
      ● Ceilometer legacy storage captures full-resolution data
        ○ Each datapoint has: timestamp, measurement, IDs, resource metadata, metric metadata, etc…
      ● Gnocchi stores pre-aggregated data in a time series
        ○ Each datapoint has: timestamp, measurement… that’s it… and then it’s compressed
        ○ Resource metadata is an explicit subset AND not tied to the measurement
        ○ Defined archival rules
          ■ Capture data at 1 min granularity for 1 day AND 3 hr granularity for 1 month AND ...
  11. Archive Policies
      ● 5 minute granularity for a day
      ● 1 day granularity for a year
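The number of points such a policy retains follows from dividing each timespan by its granularity. A minimal sketch, assuming a policy can be modelled as a list of (granularity, timespan) pairs and that "a year" means 365 days; the helper name `points_retained` is hypothetical and not part of Gnocchi's API:

```python
from datetime import timedelta


def points_retained(granularity: timedelta, timespan: timedelta) -> int:
    """Points kept by one archive rule, per metric and per aggregation method."""
    return timespan // granularity


# The two rules from the slide (assumption: a year is 365 days).
policy = [
    (timedelta(minutes=5), timedelta(days=1)),
    (timedelta(days=1), timedelta(days=365)),
]

counts = [points_retained(g, t) for g, t in policy]
print(counts)  # [288, 365]
```

However long the metric has been collected, each rule holds a bounded number of points, which is what keeps storage predictable.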
  12. How it all works...
  13. Ceilometer raw sample
      {
        "user_id": "0d9d089b8f8340999fbe01354ef84643",
        "resource_id": "a7c7cf84-5bf7-4838-a116-645ea376f4e0",
        "timestamp": "2016-05-11T18:23:46.166000",
        "meter": "disk.write.bytes",
        "volume": 56114794496,
        "source": "openstack",
        "recorded_at": "2016-05-11T18:23:47.177000",
        "project_id": "dec2b73655154e31be903fc93e575146",
        "type": "cumulative",
        "id": "7fbf56ca-17a5-11e6-a210-e8bdd1f62a56",
        "unit": "B",
        "metadata": {
          "instance_host": "cloud03.wz",
          "ephemeral_gb": "0",
          "flavor.vcpus": "8",
          "OS-EXT-AZ.availability_zone": "nova",
          "memory_mb": "16384",
          "display_name": "gord_dev",
          "state": "active",
          "flavor.id": "5",
          "status": "active",
          "ramdisk_id": "None",
          "flavor.name": "m1.xlarge",
          "disk_gb": "160",
          "kernel_id": "None",
          "image.id": "dba2c73c-3f11-45a1-998a-6a4ca2cf243e",
          "flavor.ram": "16384",
          "host": "64fe410a8b602f69fe43a180c62b02d6c00e41c03caba40a092e2fb6",
          "device": "['vda']",
          "flavor.ephemeral": "0",
          "image.name": "fedora-23-x86_64"
        }
      }
  14. Separation of value
      Resource
      ● Id
      ● User_id
      ● Project_id
      ● Start_timestamp: timestamp
      ● End_timestamp: timestamp
      ● Metadata: {attribute: value}
      ● Metric: list
      Metric
      ● Name
      ● archive_policy
      Measurements
      ● [ (timestamp, value), ... ]
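The separation described on this slide can be sketched as three small types, with one raw Ceilometer sample split across them. This is an illustration only: the class layout, the `split_sample` helper, the `KEPT_METADATA` subset, the policy name, and the use of the sample timestamp as the resource start time are all assumptions, not Gnocchi's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Measure = Tuple[str, float]  # (timestamp, value) and nothing else


@dataclass
class Metric:
    name: str
    archive_policy: str


@dataclass
class Resource:
    id: str
    user_id: str
    project_id: str
    start_timestamp: str
    end_timestamp: Optional[str] = None
    metadata: Dict[str, str] = field(default_factory=dict)
    metrics: List[Metric] = field(default_factory=list)


# Hypothetical choice of which metadata keys form the "explicit subset".
KEPT_METADATA = ("display_name", "flavor.id", "host", "image.id")


def split_sample(sample: dict, archive_policy: str) -> Tuple[Resource, Metric, Measure]:
    """Split one raw sample into resource, metric, and measurement parts."""
    metric = Metric(name=sample["meter"], archive_policy=archive_policy)
    metadata = {k: v for k, v in sample.get("metadata", {}).items()
                if k in KEPT_METADATA}
    resource = Resource(
        id=sample["resource_id"],
        user_id=sample["user_id"],
        project_id=sample["project_id"],
        start_timestamp=sample["timestamp"],  # assumption: first sighting
        metadata=metadata,
        metrics=[metric],
    )
    measure = (sample["timestamp"], float(sample["volume"]))
    return resource, metric, measure


# Abbreviated version of the raw sample from slide 13.
sample = {
    "user_id": "0d9d089b8f8340999fbe01354ef84643",
    "resource_id": "a7c7cf84-5bf7-4838-a116-645ea376f4e0",
    "project_id": "dec2b73655154e31be903fc93e575146",
    "timestamp": "2016-05-11T18:23:46.166000",
    "meter": "disk.write.bytes",
    "volume": 56114794496,
    "metadata": {"display_name": "gord_dev", "flavor.id": "5", "state": "active"},
}

resource, metric, measure = split_sample(sample, archive_policy="example-policy")
print(measure)            # ('2016-05-11T18:23:46.166000', 56114794496.0)
print(resource.metadata)  # {'display_name': 'gord_dev', 'flavor.id': '5'}
```

Only the measurement needs to be written on every sample; the resource and metric records change rarely.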
  15. Gnocchi Architecture
      [diagram: API, Resource Indexer, Metric Storage, MetricD computation workers, data]
  16. MetricD Aggregation
      [diagram: Metric Storage (raw metric dump, computed aggregates, backlog) and MetricD computation workers, with flows numbered 1 to 3]
      1. Get unprocessed datapoints
      2. Compute new aggregations
         a. Update sum, avg, min, max, etc… values based on the defined policy
      3. Add datapoints to backlog for next computation
         a. Delete datapoints not required for future aggregations
         b. By default, only keep backlog for a single period.
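The three MetricD steps above can be sketched as a single function that merges the backlog with newly arrived points, recomputes the affected buckets, and keeps only the latest period as the new backlog. This is a simplified model under stated assumptions (one granularity, timestamps as integer epoch seconds, a hypothetical `process` function), not MetricD's actual code.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (epoch seconds, value)


def process(backlog: List[Point], new_points: List[Point],
            granularity: int) -> Tuple[Dict[int, Dict[str, float]], List[Point]]:
    """Return (aggregates keyed by bucket start, backlog for the next run)."""
    points = sorted(backlog + new_points)
    if not points:
        return {}, []

    # Step 2: group points into fixed-size buckets and aggregate each one.
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % granularity].append(value)
    aggregates = {
        start: {
            "sum": sum(vals),
            "mean": sum(vals) / len(vals),
            "min": min(vals),
            "max": max(vals),
        }
        for start, vals in buckets.items()
    }

    # Step 3: keep only the points of the latest (possibly unfinished) period.
    latest = max(buckets)
    new_backlog = [(ts, v) for ts, v in points if ts - ts % granularity == latest]
    return aggregates, new_backlog


stored = {}  # stands in for the "computed aggregates" store

# First run: three unprocessed points at 60 s granularity.
aggs, backlog = process([], [(0, 1.0), (30, 3.0), (60, 5.0)], 60)
stored.update(aggs)

# Second run: a late point lands in the period still held in the backlog.
aggs, backlog = process(backlog, [(90, 7.0)], 60)
stored.update(aggs)

print(stored[0]["mean"], stored[60]["mean"])  # 2.0 6.0
```

Because the backlog still holds the points of the open period, the bucket starting at 60 is recomputed correctly when the point at 90 arrives, without re-reading older data.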
  17. Storage format
      [diagram: Metric Storage holding raw metric dump, backlog, and computed aggregates]
      Raw metric dump
      ● [ (timestamp, value), (timestamp, value) ]
      ● One object per write
      Backlog
      ● { values: { timestamp: value, timestamp: value }, block_size: max number of points, back_window: number of blocks to retain }
      ● Binary serialised using msgpack
      ● One object per metric
      Computed aggregates
      ● { first_timestamp: first timestamp of block, aggregation_method: sum, min, max, etc…, max_size: max number of points, sampling: granularity (60s, 300s, etc…), timestamps: [ time1, time2, … ], values: [ value1, value2, … ] }
      ● Binary serialised using msgpack
      ● Compressed with LZ4
      ● Split into chunks to minimise transfer when updating large series
      ● (potentially) multiple objects per aggregate per granularity per metric
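To illustrate why splitting an aggregated series into chunks reduces transfer, the sketch below stores each block of points as its own compressed object, so an update only rewrites the chunk it touches. Note the substitutions: the standard library's `json` and `zlib` stand in for msgpack and LZ4, and the chunking rule (a fixed number of points per chunk, keyed by first timestamp) and function names are assumptions rather than Gnocchi's actual layout.

```python
import json
import zlib
from typing import Dict, List


def split_into_chunks(timestamps: List[int], values: List[float],
                      points_per_chunk: int) -> Dict[int, bytes]:
    """Serialise a series as independent compressed chunks keyed by first timestamp."""
    chunks = {}
    for i in range(0, len(timestamps), points_per_chunk):
        block = {
            "first_timestamp": timestamps[i],
            "timestamps": timestamps[i:i + points_per_chunk],
            "values": values[i:i + points_per_chunk],
        }
        # json + zlib are stand-ins for msgpack + LZ4.
        chunks[timestamps[i]] = zlib.compress(json.dumps(block).encode("utf-8"))
    return chunks


def read_chunk(blob: bytes) -> dict:
    """Decompress and deserialise one chunk."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


chunks = split_into_chunks([0, 60, 120, 180, 240], [1.0, 2.0, 3.0, 4.0, 5.0], 2)
print(sorted(chunks))                   # [0, 120, 240]
print(read_chunk(chunks[120])["values"])  # [3.0, 4.0]
```

Appending a point at timestamp 300 would only require fetching and rewriting the chunk keyed by 240, not the whole series.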
  18. Query path
      What’s the cpu utilisation for VM1?
      [diagram: API, Resource Indexer, Metric Storage; labels: resource_id, metric_id, Measures (all granularities)]
      +---------------------------+-------------+----------------+
      | timestamp                 | granularity | value          |
      +---------------------------+-------------+----------------+
      | 2016-04-07T00:00:00+00:00 | 86400.0     | 0.30323927544  |
      | 2016-04-07T17:00:00+00:00 | 3600.0      | 1.2855184725   |
      | 2016-04-07T18:00:00+00:00 | 3600.0      | 0.188613527791 |
      | 2016-04-07T19:00:00+00:00 | 3600.0      | 0.188871232024 |
      | 2016-04-07T20:00:00+00:00 | 3600.0      | 0.188876901916 |
      | 2016-04-07T21:00:00+00:00 | 3600.0      | 0.189646641908 |
      | 2016-04-07T21:10:00+00:00 | 300.0       | 0.190019839676 |
      | 2016-04-07T21:15:00+00:00 | 300.0       | 0.186565358466 |
      | 2016-04-07T21:20:00+00:00 | 300.0       | 0.183166934543 |
      | 2016-04-07T21:25:00+00:00 | 300.0       | 0.179994544916 |
      | 2016-04-07T21:30:00+00:00 | 300.0       | 0.186649908928 |
      | 2016-04-07T21:35:00+00:00 | 300.0       | 0.193315212093 |
      | 2016-04-07T21:40:00+00:00 | 300.0       | 0.193272093903 |
      | 2016-04-07T21:45:00+00:00 | 300.0       | 0.196677374077 |
      | 2016-04-07T21:50:00+00:00 | 300.0       | 0.193300189049 |
      +---------------------------+-------------+----------------+
  19. Query path
      What’s the metadata for VM1?
      [diagram: API, Resource Indexer, Metric Storage; labels: resource_id, resource]
      +-----------------------+----------------------------------------------------------+
      | Field                 | Value                                                    |
      +-----------------------+----------------------------------------------------------+
      | created_by_project_id | f7481a38d7c543528d5121fab9eb2b99                         |
      | created_by_user_id    | 9246f424dcb341478067967f495dc133                         |
      | display_name          | test3                                                    |
      | ended_at              | None                                                     |
      | flavor_id             | 1                                                        |
      | host                  | 7f218c8350a86a71dbe6d14d57e8f74fa60ac360fee825192a6cf624 |
      | id                    | e90974a6-31bf-4e47-8824-ca074cd9b47d                     |
      | image_ref             | 671375cc-177b-497a-8551-4351af3f856d                     |
      | metrics               | cpu.delta: 20cd1d71-de2f-43d5-90a8-b23ad31a7d04          |
      |                       | cpu_util: 22cd22e7-e48e-4f21-887a-b1c6612b4c98           |
      |                       | disk.iops: 9611a114-d37e-42e7-9b0c-0fb5e61d96c8          |
      |                       | disk.latency: 6205c66f-2a5d-49c8-85e6-aa7572cfb34a       |
      |                       | disk.root.size: c9f9ca31-7e54-4dd7-81ad-129d86951dbc     |
      |                       | disk.usage: 4f29ca2e-d58f-40a9-94a7-15084233c1bb         |
      | original_resource_id  | e90974a6-31bf-4e47-8824-ca074cd9b47d                     |
      | project_id            | 71bf402adea343609f2192ce998fa38e                         |
      | revision_end          | None                                                     |
      | revision_start        | 2016-04-07T17:32:33.245924+00:00                         |
      | server_group          | None                                                     |
      | started_at            | 2016-04-07T17:32:25.740862+00:00                         |
      | type                  | instance                                                 |
      | user_id               | fd3eb127863b4177bf1abb38dda1f557                         |
      +-----------------------+----------------------------------------------------------+
  20. Zero computation at query. Only lookup.
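The query path on the preceding slides amounts to two lookups: the indexer resolves a resource and metric name to a metric id, and storage returns the aggregates already computed for that id. A minimal sketch using in-memory dicts; the ids `vm1` and `metric-cpu-util` are hypothetical, and pairing them with a few values from the slide 18 table is an assumption made for illustration.

```python
from typing import Dict, List, Tuple

# Resource indexer: resource id -> {metric name -> metric id} (hypothetical ids).
indexer: Dict[str, Dict[str, str]] = {
    "vm1": {"cpu_util": "metric-cpu-util"},
}

# Metric storage: metric id -> {granularity -> precomputed (timestamp, value) pairs}.
storage: Dict[str, Dict[float, List[Tuple[str, float]]]] = {
    "metric-cpu-util": {
        86400.0: [("2016-04-07T00:00:00+00:00", 0.30323927544)],
        3600.0: [("2016-04-07T17:00:00+00:00", 1.2855184725),
                 ("2016-04-07T18:00:00+00:00", 0.188613527791)],
        300.0: [("2016-04-07T21:10:00+00:00", 0.190019839676)],
    },
}


def get_measures(resource_id: str, metric_name: str) -> List[Tuple[str, float, float]]:
    """Return (timestamp, granularity, value) rows; no aggregation happens here."""
    metric_id = indexer[resource_id][metric_name]
    rows = []
    # Coarsest granularity first, matching the ordering of the table on slide 18.
    for granularity in sorted(storage[metric_id], reverse=True):
        for timestamp, value in storage[metric_id][granularity]:
            rows.append((timestamp, granularity, value))
    return rows


rows = get_measures("vm1", "cpu_util")
for row in rows:
    print(row)
```

All the arithmetic was done at write time by the aggregation workers, so the read path only concatenates stored rows.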
  21. Results (benchmark data, Gnocchi 1.3.x)
  22. Ceilometer to Gnocchi
      Ceilometer legacy storage
      ● A single datapoint averages ~1.5KB/point (MongoDB) or ~150B/point (SQL)
      ● For 1000 VMs, capturing 10 metrics/VM, every minute: ~15MB/minute, ~900MB/hour, ~21GB/day, etc…
      Gnocchi
      ● A single datapoint is AT MOST 9B/point
      ● For 1000 VMs, capturing 10 metrics/VM, every minute: ~90KB/minute, ~5.4MB/hour, ~130MB/day, etc…
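The figures on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes decimal units (1KB = 1000 bytes) and uses the MongoDB figure of ~1.5KB/point for Ceilometer; the function name is hypothetical.

```python
def bytes_per_minute(vms: int, metrics_per_vm: int, bytes_per_point: int) -> int:
    """Storage written per minute when every metric is sampled once a minute."""
    return vms * metrics_per_vm * bytes_per_point


MINUTES_PER_HOUR = 60
MINUTES_PER_DAY = 24 * 60

ceilometer = bytes_per_minute(1000, 10, 1500)  # ~1.5KB/point (MongoDB)
gnocchi = bytes_per_minute(1000, 10, 9)        # at most 9B/point

print(ceilometer, ceilometer * MINUTES_PER_HOUR, ceilometer * MINUTES_PER_DAY)
# 15000000 900000000 21600000000  -> ~15MB/minute, ~900MB/hour, ~21GB/day
print(gnocchi, gnocchi * MINUTES_PER_HOUR, gnocchi * MINUTES_PER_DAY)
# 90000 5400000 129600000  -> ~90KB/minute, ~5.4MB/hour, ~130MB/day
```

The ratio between the two is 1500 / 9, roughly 167 times less data for the same workload.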
