Introduction to Blueflood at Berlin Buzzwords 2013

Blueflood is a distributed metrics processing service created by Rackspace. Source code will be released in Summer 2013.

  1. Blueflood: Simple Metrics Processing. Gary Dusbabek, Rackspace, Berlin Buzzwords 2013. Monday, June 3, 2013
  2. Outline: Description, Motivation, Concepts, Guts
  6. First things first... We are making this open source. Soon.
  7. Open Sores
  8. Open Soars
  9. What is Blueflood?
  10. Three things:
  14. Ingest metrics. Condense metrics. Query metrics.
  15. Ingest signals. Condense signals. Query signals.
  16. Written in Java
  17. Cassandra for data
  18. Motivation: Fast Graphs; Accept Multiple Tenants; Cheap(ish); Maintainable
  19. Fast Graphs
  21. Primary use? Dashboards & Graphs
  22. Get data quickly!!!
  25. Return as few data points as possible. Precompute when possible.
  29. Sweet Spot: 300-400 data points. Can’t fit much more into a graph.
  30. Support Multiple Tenants
  31. Mainly custom retention policies: different TTLs across tenants
  32. Cheapish
  35. Avoid new hardware. Cassandra nodes use lots of disk, but no CPU. Let’s use that CPU!
  36. Maintenance
  38. Graphs are not our primary product. This cannot be a burden.
  40. Require little tuning. Scales horizontally.
  41. Three jobs, one code base
  42. Ingest, Roll Up, Query
  43. Concepts
  44. Metric
  45. Metric: Signal
  46. Metric: Has data
  47. Metric: Can have a type
  48. Metric: Can also have units
  49. Metric: Is uniquely identifiable
  50. Locator
  51. Locator: Uniquely identifies a metric
  52. Locator: Treated opaquely by the system
  53. Locator: You should embed data in it
  54. Example: 123:web-0:disk0:bytes-free
  60. Locator: A metric exists as a row. a:b:c -> t1 data, t2 data, t3 data, t4 data, t5
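
A minimal Python sketch of the locator and row idea from the slides above. The slides only say that a locator is an opaque, unique string and that a metric lives in one row; the helper names and the in-memory dictionary standing in for a Cassandra row are assumptions made for illustration.

```python
from collections import defaultdict


def make_locator(*parts):
    """Join identifying parts into one opaque locator string."""
    return ":".join(str(p) for p in parts)


# Hypothetical stand-in for a column family: row key -> {timestamp: value}.
rows = defaultdict(dict)


def write_point(locator, timestamp, value):
    """Store one data point as a column in the metric's row."""
    rows[locator][timestamp] = value


def read_row(locator):
    """Return the row's columns ordered by timestamp."""
    return sorted(rows[locator].items())


locator = make_locator(123, "web-0", "disk0", "bytes-free")
write_point(locator, 2, 900)
write_point(locator, 1, 1000)
```

The system never parses the locator; only the client that built it knows that the first part is, say, a tenant id.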
  61. Shard
  62. Partitions the metric space to share rollup responsibilities
  63. Shard: When set to N, every metric hashes to 0..N-1
  64. Shard: We use 128 shards
  65. Shard: Each node owns one or more shards
  66. Shard: A shard is owned by one or more nodes
  67. Shard: Has nothing to do with query
  68. Shard: Very little to do with data ingestion
  69. A single node actively rolls up a shard
  71. Zookeeper manages this (we’d like this to go away)
  72. OK for Zookeeper to fail because rollup operations are idempotent
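
A rough sketch of how a locator could be mapped to one of 128 shards. The slides do not name the hash function, so CRC32 is an assumption here, and the round-robin assignment of shards to nodes is a hypothetical placeholder for whatever Zookeeper coordinates in the real system.

```python
import zlib

NUM_SHARDS = 128  # the slides state that 128 shards are used


def shard_of(locator, num_shards=NUM_SHARDS):
    """Map a locator to a shard in 0..num_shards-1 with a stable hash.

    CRC32 is an assumption for this sketch, not Blueflood's documented hash.
    """
    return zlib.crc32(locator.encode("utf-8")) % num_shards


def shards_for_node(node_index, node_count, num_shards=NUM_SHARDS):
    """Hypothetical static assignment: deal shards out round-robin."""
    return [s for s in range(num_shards) if s % node_count == node_index]
```

A stable hash matters more than the particular function: every node must agree on which shard a metric belongs to.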
  73. Granularity (or Resolution)
  74. Granularity: A way of dividing up time
  75. [Diagram: time axis, row "Full"] Data can arrive evenly spaced...
  76. [Diagram: time axis, row "Full"] ... or not.
  93. [Diagram: time axis, rows "Full", "5min", "20min", "1h"] 4h, 24h, etc. Can’t demonstrate 4h and 24h and maintain scale.
  94. Slots: Imagine a two-week period divided into chunks, each 5m long
  95. Slots: 288 of them in a day
  96. Slots: 4032 in one two-week period
  97. Slots: Number them 0..4031
  98. Slots: They repeat every two weeks
  103. 4029, 4030, 4031, 0, 1
  104. Slots: Same concept for other granularities
  105. Slots: Just fewer slots as granularities become coarser
  110. 5m = 4032 slots over two weeks
       20m = 1008 slots over two weeks
       1h = 336 slots over two weeks
       4h = 84 slots over two weeks
       24h = 14 slots over two weeks (duh!)
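
The slot counts above follow from dividing two weeks by the slot width, and the wrap-around numbering is a modulo. A small sketch, assuming slots are aligned to the Unix epoch in milliseconds (the slides do not state the alignment); the function names are made up for this example.

```python
TWO_WEEKS_MS = 14 * 24 * 60 * 60 * 1000

# Granularity name -> width of one slot in milliseconds.
GRANULARITY_MS = {
    "5m": 5 * 60 * 1000,
    "20m": 20 * 60 * 1000,
    "1h": 60 * 60 * 1000,
    "4h": 4 * 60 * 60 * 1000,
    "24h": 24 * 60 * 60 * 1000,
}


def slot_count(granularity):
    """Number of slots of this granularity in one two-week cycle."""
    return TWO_WEEKS_MS // GRANULARITY_MS[granularity]


def slot_of(timestamp_ms, granularity):
    """Slot index for a timestamp; the numbering wraps every two weeks."""
    return (timestamp_ms % TWO_WEEKS_MS) // GRANULARITY_MS[granularity]
```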
  111. Slots: Every timestamp hashes to one slot
  112. Slots: Can be used to update a map indicating if a metric has been seen over a time period
  113. Slots: Just keep track of the last slot during which you saw it.
  114. Slots: Tracked in a column family with a 48h TTL
  115. Slots: Metric is “forgotten” after 48h
  116. Slots: Doesn’t get rolled up any more
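
One way to picture the last-seen tracking is a map from locator to the last slot in which it appeared, with entries expiring after 48 hours. This sketch keeps the map in memory and expires entries by hand; the class and method names are hypothetical, and in the system described the expiry would come from the column family's TTL.

```python
FORGET_AFTER_MS = 48 * 60 * 60 * 1000  # 48h TTL from the slides


class ActiveMetrics:
    """In-memory stand-in for the TTL'd column family (an assumption)."""

    def __init__(self):
        # locator -> (last slot seen, time of that sighting in ms)
        self._last_seen = {}

    def mark_seen(self, locator, slot, now_ms):
        """Remember only the most recent slot for this locator."""
        self._last_seen[locator] = (slot, now_ms)

    def last_slot(self, locator):
        entry = self._last_seen.get(locator)
        return None if entry is None else entry[0]

    def active(self, now_ms):
        """Drop entries older than 48h, then list what is still tracked."""
        stale = [k for k, (_, seen) in self._last_seen.items()
                 if now_ms - seen >= FORGET_AFTER_MS]
        for k in stale:
            del self._last_seen[k]
        return sorted(self._last_seen)
```

A metric that stops reporting simply drops out of `active` after 48 hours, so no rollups are scheduled for it.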
  117. Ingestion
  125. Metric has attributes: locator (id), value, collection time, time to live (TTL), type, unit
  132. Metrics arrive somehow.
       Pass through transforms (you can augment these).
       Written to the full-resolution database.
       Written to the discovery database (so we know what metrics are active for a given shard).
       Shard+slot state is updated (marked dirty, so we know what time periods need to be rolled up).
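
The ingestion steps above can be sketched end to end with in-memory stand-ins for the databases. Everything here is illustrative: the `Metric` field names, the `Ingestor` class, the CRC32 shard hash, and the epoch-aligned 5-minute slot are assumptions, not Blueflood's actual classes or storage layout.

```python
import zlib
from dataclasses import dataclass, replace
from typing import Optional

NUM_SHARDS = 128
SLOT_MS = 5 * 60 * 1000
SLOTS_PER_CYCLE = 4032


@dataclass(frozen=True)
class Metric:
    locator: str
    value: float
    collection_time_ms: int
    ttl_seconds: int
    metric_type: Optional[str] = None
    unit: Optional[str] = None


class Ingestor:
    def __init__(self, transforms=()):
        self.transforms = list(transforms)  # callables Metric -> Metric
        self.full_resolution = {}           # locator -> {timestamp: value}
        self.discovery = {}                 # shard -> set of locators
        self.dirty = set()                  # (shard, slot) pairs to roll up

    def ingest(self, metric):
        for transform in self.transforms:
            metric = transform(metric)
        shard = zlib.crc32(metric.locator.encode("utf-8")) % NUM_SHARDS
        slot = (metric.collection_time_ms // SLOT_MS) % SLOTS_PER_CYCLE
        row = self.full_resolution.setdefault(metric.locator, {})
        row[metric.collection_time_ms] = metric.value
        self.discovery.setdefault(shard, set()).add(metric.locator)
        self.dirty.add((shard, slot))
        return shard, slot


def bytes_to_kib(metric):
    """Example transform: rescale a byte count to KiB."""
    return replace(metric, value=metric.value / 1024, unit="KiB")
```

A transform is just a function from metric to metric, which is what makes that stage easy to augment.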
  137. Ingestion is designed to be pluggable. If you’re a coder you can swap in other things: Transports, Transforms. Or just live with the defaults.
  140. Late data is OK: tolerated unless later than 24h
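
The 24-hour tolerance can be read as a simple age check at arrival time. This is only a sketch: the slides do not say what happens to points that arrive later than that, nor whether the boundary is inclusive, so both are assumptions here.

```python
MAX_LATENESS_MS = 24 * 60 * 60 * 1000  # 24h tolerance from the slides


def is_acceptable(collection_time_ms, now_ms, max_lateness_ms=MAX_LATENESS_MS):
    """True when a point is at most 24h older than its arrival time."""
    return now_ms - collection_time_ms <= max_lateness_ms
```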
  141. Roll Ups
  148. Roll up is scheduled when a slot has not received data for 5 minutes (usually because time has moved beyond it).
       Select out all locators updated during that slot.
       For each locator get all datapoints during the range of that slot.
       Do maths.
       Save in coarser granularity.
       Repeat per granularity.
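
A compact sketch of the rollup steps above for one slot. The slides only say "do maths", so the choice of count, minimum, maximum, and average is an assumption, as are the function names and the dictionary used for full-resolution data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rollup:
    count: int
    minimum: float
    maximum: float
    average: float


def roll_up(points, slot_start_ms, slot_width_ms):
    """Summarise (timestamp_ms, value) pairs that fall inside one slot."""
    end_ms = slot_start_ms + slot_width_ms
    values = [v for t, v in points if slot_start_ms <= t < end_ms]
    if not values:
        return None
    return Rollup(len(values), min(values), max(values),
                  sum(values) / len(values))


def roll_up_slot(full_resolution, locators, slot_start_ms, slot_width_ms):
    """Roll up every locator seen in the slot; returns locator -> Rollup."""
    result = {}
    for locator in locators:
        points = full_resolution.get(locator, {}).items()
        summary = roll_up(points, slot_start_ms, slot_width_ms)
        if summary is not None:
            result[locator] = summary
    return result
```

Because the output depends only on the stored points, running the same rollup twice yields the same result, which is the idempotence the shard slides rely on.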
  151. Smart scheduling: Don’t roll up a 1h range if its 20m ranges have not been computed.
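
The scheduling rule can be expressed as a check on the finer slots that a coarse slot covers. The slides give only the 1h/20m case; this sketch assumes every granularity is built from the next finer one and that a set of dirty (granularity, slot) pairs is available. All names are hypothetical.

```python
# Coarse granularity -> (finer granularity it is built from, slots covered).
SOURCE_OF = {
    "20m": ("5m", 4),
    "1h": ("20m", 3),
    "4h": ("1h", 4),
    "24h": ("4h", 6),
}


def child_slots(granularity, slot):
    """Finer granularity and the finer slot numbers inside a coarse slot."""
    finer, fan_in = SOURCE_OF[granularity]
    return finer, [slot * fan_in + i for i in range(fan_in)]


def can_roll_up(granularity, slot, dirty):
    """True when no finer slot under this one is still waiting for a rollup.

    dirty is a set of (granularity, slot) pairs that still need work.
    """
    if granularity not in SOURCE_OF:  # 5m is built from full-resolution data
        return True
    finer, children = child_slots(granularity, slot)
    return all((finer, child) not in dirty for child in children)
```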
  154. Don’t want to get behind. If you can’t process all rollups within 5 minutes, you need more processing.
  155. Query API
  158. GetByPoints(locator, from, to, numPoints): “Give me N data points”. Automatically chooses resolution for best fit.
  161. GetByResolution(locator, start, stop, resolution): Most control. Possibility of returning more data than you need.
  162. Query: Straight Cassandra reads. Helps with SLA across tenants.
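
How GetByPoints picks a resolution is not spelled out on the slides. One plausible reading of "best fit" is to pick the granularity whose expected point count over the requested range is closest to the requested number. The sketch below implements that heuristic only; it is an assumption, not Blueflood's algorithm, and it leaves out full resolution because its spacing is not fixed.

```python
# Rollup granularities and their slot widths in milliseconds.
GRANULARITIES_MS = [
    ("5m", 300000),
    ("20m", 1200000),
    ("1h", 3600000),
    ("4h", 14400000),
    ("24h", 86400000),
]


def expected_points(from_ms, to_ms, width_ms):
    """Roughly how many rolled-up points a range yields at this width."""
    return max(1, (to_ms - from_ms) // width_ms)


def choose_resolution(from_ms, to_ms, num_points):
    """Granularity whose expected point count is closest to num_points."""
    best = min(
        GRANULARITIES_MS,
        key=lambda g: abs(expected_points(from_ms, to_ms, g[1]) - num_points),
    )
    return best[0]
```

With the 300-400 point sweet spot in mind, a one-day graph lands on 5m data and a two-week graph on 1h data under this heuristic.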
  163. @gdusbabek
  164. Image Credits. All images came from the Flickr Commons Collection: http://flickr.com/commons
       chalk drawing: http://www.flickr.com/photos/stevendepolo/4705141484
       outline: http://www.flickr.com/photos/adactio/3563013647
       pencil: http://www.flickr.com/photos/isox4/4841242881
       sore: http://www.flickr.com/photos/yortw/5436427109
       soar: http://www.flickr.com/photos/eyeno/6183027047
       amber: http://www.flickr.com/photos/mikaelmiettinen/4219852860
       wall: http://www.flickr.com/photos/vinothchandar/8093281752
       truck: http://www.flickr.com/photos/amalakar/8111811112
       packages: http://www.flickr.com/photos/cushinglibrary/3729414657
       gummi bears: http://www.flickr.com/photos/28misguidedsouls/5649609098
       skyscraper: http://www.flickr.com/photos/nathanmac87/5341060061
       traffic light: http://www.flickr.com/photos/emrank/2435273839
       money: http://www.flickr.com/photos/heyrocker/117059817
       lightpost: http://www.flickr.com/photos/jacreative/134129950
       harpsichord: http://www.flickr.com/photos/dalbera/2739071156
       carboat: http://www.flickr.com/photos/mbtrama/3826879277
       pens: http://www.flickr.com/photos/freddy-click-boy/3098136909/
       river: http://www.flickr.com/photos/makelessnoise/240072417/
       map: http://www.flickr.com/photos/bulle_de/4672972586
       shard: http://www.flickr.com/photos/pauljill/4964306570/
       butterflies: http://www.flickr.com/photos/webtreatsetc/5265396307/
       camel: http://www.flickr.com/photos/seattlemunicipalarchives/3797940791/
       sand dunes: http://www.flickr.com/photos/mikebaird/8517511072/
       sprockets: http://www.flickr.com/photos/mwichary/2665559632/
       snake: http://www.flickr.com/photos/mattpandor4/8254089051/
       train: http://www.flickr.com/photos/scjn/2554491487/sizes/o/
       steps: http://www.flickr.com/photos/borkurdotnet/363738205/
       cinnamon roll: http://www.flickr.com/photos/sbogdanich/8090569452/
       question mark: http://www.flickr.com/photos/bilal-kamoon/6835060992
       thank you: http://www.flickr.com/photos/nateone/3768979925/
