Generating and Analyzing Events
Zach Tellman, Runa, @ztellman

Presentation at EuroClojure 2012

Generating and Analyzing Events

  1. Generating and Analyzing Events
     Zach Tellman, Runa, @ztellman
  2. linear threads of execution
  3. linear threads of execution
  4. linear threads of execution
  5. logging
  6. logging
     “but we have only the silent evidence of scattered cups and dice, ... brooches and sandals. How can we make these objects live?” - Peter Ackroyd, London: A Biography
  7. logging
  8. logging
     • is a lossy stream of data
     • is a verbose stream of data
     • is a necessary stream of data
  9. lamina
     • is for transforming streams of data
     • is for aggregating streams of data
     • is for analyzing streams of data
     • is for reacting to streams of data
  10. lamina
      • is for creating narratives from events
  11. what we talk about when we talk about events
  12. a single event
      • is called a future
      • is called a promise
      • is called an async-result
  13. a stream of events
      • is called a channel
  14. channel: node, queue
  15. (enqueue ch 1)
  16. (map* inc ch)
  17. (enqueue ch 2)
  18. (map* dec ch)
  19. (enqueue ch 3)
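
The channel slides above (a channel as a node plus a queue, with derived channels created by map*) can be sketched in plain Clojure. This is only an illustrative model, not lamina's implementation: `make-channel`, `enqueue!`, `receive-all!`, and `map-channel` are hypothetical names, the sketch is single-threaded, and it assumes Clojure 1.9 or later for `swap-vals!`.

```clojure
;; A toy push-based channel: messages are queued until a consumer is attached,
;; then pushed straight through to every subscriber.
(defn make-channel []
  (atom {:subscribers [] :queue []}))

(defn enqueue!
  "Delivers msg to every subscriber, or queues it if there are none yet."
  [ch msg]
  (let [subs (:subscribers @ch)]
    (if (empty? subs)
      (swap! ch update :queue conj msg)
      (doseq [f subs] (f msg)))
    nil))

(defn receive-all!
  "Registers f as a subscriber and hands it any messages already queued."
  [ch f]
  (let [[old _] (swap-vals! ch #(-> %
                                    (update :subscribers conj f)
                                    (assoc :queue [])))]
    (doseq [msg (:queue old)] (f msg))
    nil))

(defn map-channel
  "Returns a new channel that receives (f msg) for every msg passing through ch."
  [f ch]
  (let [out (make-channel)]
    (receive-all! ch #(enqueue! out (f %)))
    out))

(def ch (make-channel))
(enqueue! ch 1)                          ; no consumer yet, so 1 waits in ch's queue
(def incremented (map-channel inc ch))   ; 1 drains into incremented and is queued there as 2
(def seen (atom []))
(receive-all! incremented #(swap! seen conj %))
(enqueue! ch 2)                          ; pushed straight through to the subscriber
@seen
;; => [2 3]
```

The point of the model is that nothing is pulled: once consumers exist, each enqueue propagates through the graph immediately.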
  20. lamina.viz
      (view-graph ch)
      (view-propagation ch 1)
  21. (join a c)
      (join b c)
  22. (receive-all c prn)
  23. (close a)
  24. (close a)
  25. (close a)
  26. (close a)
  27. (close a)
  28. other familiar functions
      • mapcat*
      • reduce*
      • reductions*
      • take*
      • take-while*
      • partition*
      • partition-all*
  29. so why not just use seqs?
  30. data + time = event
  31. less familiar functions
      • sample-every
      • partition-every
      • rate
      • mean
      • periodically
      • combine-latest
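
To illustrate why time matters for functions such as partition-every and rate, here is a minimal offline sketch in plain Clojure. It is not lamina's API: the real functions operate on live channels, whereas this version assumes a recorded, time-ordered collection of maps with hypothetical `:timestamp` (milliseconds) and `:value` keys.

```clojure
(defn partition-by-window
  "Groups time-ordered events into consecutive windows of window-ms milliseconds.
  Windows that contain no events are skipped."
  [window-ms events]
  (->> events
       (partition-by #(quot (:timestamp %) window-ms))
       (map #(mapv :value %))))

(defn rate-per-window
  "Counts how many events fall into each non-empty window."
  [window-ms events]
  (map count (partition-by-window window-ms events)))

(def recorded-events
  [{:timestamp 0    :value :a}
   {:timestamp 400  :value :b}
   {:timestamp 1200 :value :c}
   {:timestamp 1900 :value :d}
   {:timestamp 3100 :value :e}])

(partition-by-window 1000 recorded-events)
;; => ([:a :b] [:c :d] [:e])

(rate-per-window 1000 recorded-events)
;; => (2 2 1)
```

A plain seq carries no timestamps, so this grouping is only possible because each element was recorded together with the time it occurred.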
  32. 1 lazy-seq + 1 thread = 1 eager-seq
  33. one thread per edge
  34. back-propagation
  35. back-propagation
  36. lamina
      • is for temporal streams of data
      • is for granular streams of data
      • is for reasoning about streams of data
  37. probe-channels
      • are grounded
      • are permanent
      • are named
  38. generating probe data
      (trace :foo:bar
        {:value 123
         :description "a number!"})
  39. consuming probe data
      (probe-channel :foo:bar)
      (probe-channel ["foo" :bar])
  40. consuming probe data
      (select-probes "fo*" #"ba[r|z]")
  41. consuming probe data
  42. instrumenting functions
      (instrument + {:name "plus"})
  43. (probe-channel :plus:enter)
      {:name "plus"
       :timestamp 1234567890
       :args [1 2 3]}
  44. (probe-channel :plus:return)
      {:name "plus"
       :timestamp 1234567890
       :offset 0
       :duration 10000
       :sub-tasks nil
       :args [1 2 3]
       :result 6}
  45. (probe-channel :plus:error)
      {:name "plus"
       :timestamp 1234567890
       :offset 0
       :duration 10000
       :sub-tasks nil
       :args [nil 1]
       :error ...}
  46. (defn-instrumented bar
        [x y]
        (+ x y))
      (defn-instrumented foo
        {:name "foo"}
        [x y]
        (bar x y))
  47. {:name "foo"
       :timestamp 1234567890
       :offset 0
       :duration 10000
       :sub-tasks ({:name "user:bar"
                    :timestamp 1234567890
                    :offset 1000
                    :duration 8000
                    :sub-tasks nil
                    :args [1 2]
                    :result 3})
       :args [1 2]
       :result 3}
  48. (time* (foo 1 2))
      time - 13.0us
        foo - 10.0us
          user:bar - 8.0us
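
The nested :sub-tasks maps above are what make an indented timing report possible. The following plain-Clojure sketch walks such a map and produces one line per task. `timing-lines` is a hypothetical helper, not part of lamina, and it assumes :duration is in nanoseconds, which is inferred from the slides showing a duration of 10000 as 10.0us.

```clojure
(require '[clojure.string :as str])

(defn timing-lines
  "Returns one indented line per task, depth-first, for a nested timing map."
  ([task] (timing-lines task 0))
  ([{task-name :name :keys [duration sub-tasks]} depth]
   (cons (str (str/join (repeat depth "  "))
              task-name " - " (/ duration 1000.0) "us")
         ;; :sub-tasks may be nil, in which case mapcat yields nothing
         (mapcat #(timing-lines % (inc depth)) sub-tasks))))

(def foo-result
  {:name "foo"
   :duration 10000
   :sub-tasks [{:name "user:bar"
                :duration 8000
                :sub-tasks nil}]})

(timing-lines foo-result)
;; => ("foo - 10.0us" "  user:bar - 8.0us")
```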
  49. the cost of instrumentation
      (+ 1 1) vs (apply + [1 1])
  50. executors
      • are instrumented thread-pools
  51. executors
      (executor
        {:name :some-threads
         :min-thread-count 2
         :max-thread-count 8})
  52. (defexecutor ex
        {:name "ex"
         :max-thread-count 4})
      (defn-instrumented foo
        {:name "foo"
         :executor ex}
        [x y]
        (+ x y))
  53. (probe-channel :foo:return)
      (probe-channel :ex)
      {:name "foo"
       :offset 0
       :timestamp 1234567890
       :duration 10000
       :enqueued-duration 500000
       :sub-tasks nil
       :args [1 2]
       :result 3}
  54. lamina.stats
      • rate
      • sum
      • mean
      • quantiles
      • variance
      • outliers
  55. mean
      an exponentially weighted moving average with a configurable window
  56. mean
      (->> (probe-channel :ex)
           (map* :enqueued-duration)
           mean)
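
For readers unfamiliar with exponentially weighted moving averages, here is a minimal plain-Clojure sketch of the general technique. It is not lamina.stats' implementation: `ewma` and `window->alpha` are hypothetical names, and the mapping from window size to smoothing factor (alpha = 2 / (n + 1)) is one common convention that the slides do not confirm.

```clojure
(defn window->alpha
  "Smoothing factor for a window of n samples, using the common 2/(n+1) convention."
  [n]
  (/ 2.0 (inc n)))

(defn ewma
  "Returns the running exponentially weighted moving average of xs.
  The first sample seeds the average; an empty input yields nil."
  [alpha xs]
  (when (seq xs)
    (reductions (fn [avg x]
                  (+ (* alpha x) (* (- 1 alpha) avg)))
                xs)))

(window->alpha 3)
;; => 0.5

(ewma 0.5 [10.0 20.0 30.0])
;; => (10.0 15.0 22.5)
```

A larger alpha (a shorter window) follows recent samples more closely; a smaller one smooths more aggressively.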
  57. quantiles
      a statistical distribution over the last five minutes using reservoir sampling
  58. quantiles
      (->> (probe-channel :foo:enter)
           rate
           quantiles)
  59. quantiles
      {50   5
       75   7.5
       95   9.5
       99   9.9
       99.9 9.99}
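
Reservoir sampling keeps a fixed-size, uniformly random sample of a stream of unknown length, which is what makes quantile estimates cheap. The sketch below shows the textbook technique (Algorithm R) in plain Clojure. It is not lamina's implementation, `reservoir-sample` and `quantile` are hypothetical names, it ignores the five-minute time window mentioned above, and `quantile` assumes a non-empty input.

```clojure
(defn reservoir-sample
  "Returns a vector of at most k elements drawn uniformly at random from xs."
  [k xs]
  (reduce (fn [reservoir [i x]]
            (if (< i k)
              ;; fill the reservoir with the first k elements
              (conj reservoir x)
              ;; afterwards, element i replaces a random slot with probability k/(i+1)
              (let [j (rand-int (inc i))]
                (if (< j k)
                  (assoc reservoir j x)
                  reservoir))))
          []
          (map-indexed vector xs)))

(defn quantile
  "Returns the element at quantile q (between 0 and 1) of a non-empty collection."
  [q xs]
  (let [sorted (vec (sort xs))
        idx    (min (dec (count sorted))
                    (int (* q (count sorted))))]
    (nth sorted idx)))

;; Estimate the median of 100,000 values from a sample of 1,000.
(quantile 0.5 (reservoir-sample 1000 (range 100000)))
```

The estimate varies from run to run because the sample is random, so no fixed output is shown for the last expression.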
  60. outliers
      a filtered view of the data stream that only emits unusual events
  61. outliers
      (->> (probe-channel :foo:return)
           (outliers :duration))
  62. it’s easy to merge streams of data, but hard to split them apart
  63. distributor
      creates sub-streams grouped by facet, and applies identical operations to each stream
  64. distributor
      (distributor :uri
        (fn [facet ch]
          (->> ch
               (close-on-idle 10000)
               rate
               (map* #(vector facet %)))))
  65. distributor
  66. aggregate
      merges multiple periodic streams
  67. aggregate
      (aggregate first ch)
  68. aggregate
      ["/" 1]
      ["/abc" 1]
      ["/def" 1]
  69. aggregate
      ["/" 1]
      ["/abc" 1]
      ["/def" 1]
      ["/" 1]
  70. aggregate
      {"/"    ["/" 1]
       "/abc" ["/abc" 1]
       "/def" ["/def" 1]}
      ["/" 1]
  71. distribute-aggregate
      (distribute-aggregate
        :uri
        (fn [facet ch] (rate ch))
        ch)
  72. distribute-aggregate
      {"/"    1
       "/abc" 1
       "/def" 1}
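
The split-then-merge idea behind distribute-aggregate can be sketched over an ordinary collection. This is a batch analogy in plain Clojure, not lamina's streaming implementation: `distribute-aggregate-batch` is a hypothetical name, and it processes a finished collection of events rather than emitting one map per period from a live channel.

```clojure
(defn distribute-aggregate-batch
  "Groups events by facet, applies f to each group, and merges the results
  into a single map keyed by facet value."
  [facet f events]
  (into {}
        (map (fn [[facet-value group]] [facet-value (f group)]))
        (group-by facet events)))

(def requests
  [{:uri "/"    :bytes 42}
   {:uri "/abc" :bytes 10}
   {:uri "/def" :bytes 7}])

;; One request per URI, matching the shape of the map shown above.
(distribute-aggregate-batch :uri count requests)
;; => {"/" 1, "/abc" 1, "/def" 1}

;; The same split with a different per-group operation: total bytes per URI.
(distribute-aggregate-batch :uri #(reduce + (map :bytes %)) requests)
;; => {"/" 42, "/abc" 10, "/def" 7}
```

Only the per-group function changes between the two calls, which is the "identical operations to each stream" property of the distributor.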
  73. (distribute-aggregate
        :name
        (fn [_ ch]
          (->> ch
               (map* :duration)
               sum))
        (probe-channel :ex))
  74. aleph
      “the only place on earth where all places are -- seen from every angle, each standing clear, without any confusion or blending.” - Jorge Luis Borges, The Aleph
  75. traffic
      {:name "http-server"
       :remote-address "127.0.0.1"
       :bytes 42}
  76. (distribute-aggregate
        :name
        (fn [_ ch]
          (->> ch
               (map* :bytes)
               sum))
        (select-probes "*traffic:out"))
  77. (distribute-aggregate
        :remote-address
        (fn [_ ch]
          (->> ch
               (close-on-idle 10000)
               (map* :bytes)
               sum))
        (select-probes "*traffic:out"))
  78. what’s missing
      • further examples
      • inter-process aggregation
      • data endpoints/tooling
  79. questions?