Generating and Analyzing Events
Zach Tellman, Runa, @ztellman
Presentation at EuroClojure 2012

  • video available at https://vimeo.com/45132054
  • Generating and Analyzing Events

    1. Generating and Analyzing Events: Zach Tellman, Runa, @ztellman
    2. linear threads of execution
    3. linear threads of execution
    4. linear threads of execution
    5. logging
    6. logging: “but we have only the silent evidence of scattered cups and dice, ... brooches and sandals. How can we make these objects live?” - Peter Ackroyd, London: A Biography
    7. logging
    8. logging • is a lossy stream of data • is a verbose stream of data • is a necessary stream of data
    9. lamina • is for transforming streams of data • is for aggregating streams of data • is for analyzing streams of data • is for reacting to streams of data
    10. lamina • is for creating narratives from events
    11. what we talk about when we talk about events
    12. a single event • is called a future • is called a promise • is called an async-result
    13. a stream of events • is called a channel
    14. channel: node, queue
    15. (enqueue ch 1)
    16. (map* inc ch)
    17. (enqueue ch 2)
    18. (map* dec ch)
    19. (enqueue ch 3)
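Slides 14 to 19 picture a channel as a node with a queue, with messages flowing on to whatever is attached downstream. As a rough mental model only, here is a single-threaded toy in plain Clojure. It is not lamina's implementation, and every name in it (`toy-channel`, `toy-enqueue`, `toy-receive-all`, `toy-map`) is made up for illustration. It assumes Clojure 1.9 or later for `swap-vals!`.

```clojure
;; Toy model, NOT lamina: a channel is a queue plus a list of callbacks.
(defn toy-channel []
  (atom {:queue [] :callbacks []}))

(defn toy-enqueue
  "Hands msg to every registered callback, or queues it if there are none.
   Single-threaded sketch: no attempt is made to be race-free."
  [ch msg]
  (let [callbacks (:callbacks @ch)]
    (if (seq callbacks)
      (doseq [cb callbacks] (cb msg))
      (swap! ch update :queue conj msg))))

(defn toy-receive-all
  "Registers f as a consumer and replays anything already queued."
  [ch f]
  (let [[old _] (swap-vals! ch #(-> %
                                    (assoc :queue [])
                                    (update :callbacks conj f)))]
    (doseq [msg (:queue old)]
      (f msg))))

(defn toy-map
  "Returns a new toy channel that receives (f msg) for each msg in ch."
  [f ch]
  (let [out (toy-channel)]
    (toy-receive-all ch #(toy-enqueue out (f %)))
    out))

;; Same order of operations as slides 15 to 17.
(def ch (toy-channel))
(toy-enqueue ch 1)
(def incremented (toy-map inc ch))
(toy-enqueue ch 2)

(def seen (atom []))
(toy-receive-all incremented #(swap! seen conj %))
@seen
;; => [2 3]
```

In this toy, a message enqueued before any consumer exists waits in the queue and is replayed when the first consumer attaches, which is why both 1 and 2 reach the mapped channel.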
    20. lamina.viz: (view-graph ch) (view-propagation ch 1)
    21. (join a c) (join b c)
    22. (receive-all c prn)
    23. (close a)
    24. (close a)
    25. (close a)
    26. (close a)
    27. (close a)
    28. other familiar functions • mapcat* • reduce* • reductions* • take* • take-while* • partition* • partition-all*
    29. so why not just use seqs?
    30. data + time = event
    31. less familiar functions • sample-every • partition-every • rate • mean • periodically • combine-latest
    32. 1 lazy-seq + 1 thread = 1 eager-seq
    33. one thread per edge
    34. back-propagation
    35. back-propagation
    36. lamina • is for temporal streams of data • is for granular streams of data • is for reasoning about streams of data
    37. probe-channels • are grounded • are permanent • are named
    38. generating probe data: (trace :foo:bar {:value 123 :description "a number!"})
    39. consuming probe data: (probe-channel :foo:bar) (probe-channel ["foo" :bar])
    40. consuming probe data: (select-probes "fo*" #"ba[r|z]")
    41. consuming probe data
    42. instrumenting functions: (instrument + {:name "plus"})
    43. (probe-channel :plus:enter) {:name "plus" :timestamp 1234567890 :args [1 2 3]}
    44. (probe-channel :plus:return) {:name "plus" :timestamp 1234567890 :offset 0 :duration 10000 :sub-tasks nil :args [1 2 3] :result 6}
    45. (probe-channel :plus:error) {:name "plus" :timestamp 1234567890 :offset 0 :duration 10000 :sub-tasks nil :args [nil 1] :error ...}
    46. (defn-instrumented bar [x y] (+ x y)) (defn-instrumented foo {:name "foo"} [x y] (bar x y))
    47. {:name "foo" :timestamp 1234567890 :offset 0 :duration 10000 :sub-tasks ({:name "user:bar" :timestamp 1234567890 :offset 1000 :duration 8000 :sub-tasks nil :args [1 2] :result 3}) :args [1 2] :result 3}
    48. (time* (foo 1 2)): time - 13.0us, foo - 10.0us, user:bar - 8.0us
    49. the cost of instrumentation: (+ 1 1) vs (apply + [1 1])
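Slides 42 to 45 show an instrumented function reporting enter, return, and error events that carry the arguments, result, and duration. A minimal sketch of that general idea in plain Clojure might look like the following. It is not how lamina's `instrument` is implemented: `instrument-sketch` and the `emit!` callback are hypothetical names, the `:offset` and `:sub-tasks` fields are left out, and the units (milliseconds for `:timestamp`, nanoseconds for `:duration`) are assumptions.

```clojure
;; Hypothetical helper, NOT lamina's `instrument`.
(defn instrument-sketch
  "Wraps f so each call reports :enter, then :return or :error, via emit!."
  [f fn-name emit!]
  (fn [& args]
    (let [args  (vec args)
          start (System/nanoTime)
          base  {:name      fn-name
                 :timestamp (System/currentTimeMillis) ; assumed unit: ms
                 :args      args}]
      (emit! :enter base)
      (try
        (let [result (apply f args)]
          (emit! :return (assoc base
                                :duration (- (System/nanoTime) start) ; ns
                                :result result))
          result)
        (catch Throwable e
          (emit! :error (assoc base
                               :duration (- (System/nanoTime) start)
                               :error e))
          ;; Rethrow so callers see the same failure as the bare function.
          (throw e))))))

;; Collect events in an atom instead of a probe-channel.
(def events (atom []))

(def plus
  (instrument-sketch + "plus"
                     (fn [probe event] (swap! events conj [probe event]))))

(plus 1 2 3)
;; => 6
```

Note that a variadic wrapper like this one always calls the wrapped function through `apply`.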
    50. executors • are instrumented thread-pools
    51. executors: (executor {:name :some-threads :min-thread-count 2 :max-thread-count 8})
    52. (defexecutor ex {:name "ex" :max-thread-count 4}) (defn-instrumented foo {:name "foo" :executor ex} [x y] (+ x y))
    53. (probe-channel :foo:return) (probe-channel :ex) {:name "foo" :offset 0 :timestamp 1234567890 :duration 10000 :enqueued-duration 500000 :sub-tasks nil :args [1 2] :result 3}
    54. lamina.stats • rate • sum • mean • quantiles • variance • outliers
    55. mean: an exponentially weighted moving average with a configurable window
    56. mean: (->> (probe-channel :ex) (map* :enqueued-duration) mean)
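Slide 55 describes `mean` as an exponentially weighted moving average. For readers unfamiliar with the technique, here is a sketch over a plain sequence using only `clojure.core`. The function name `ewma` and the per-sample smoothing factor `alpha` are assumptions for illustration. The slides talk about a configurable time window, and how lamina maps that window onto weights is not shown in the deck.

```clojure
(defn ewma
  "Running exponentially weighted moving average of xs.
   alpha lies in (0, 1]; larger values weight recent samples more heavily.
   The first sample seeds the average."
  [alpha xs]
  (reductions (fn [avg x] (+ avg (* alpha (- x avg)))) xs))

(ewma 0.5 [10.0 20.0 30.0])
;; => (10.0 15.0 22.5)
```

Each new sample moves the average by a fixed fraction of its distance from the current value, so old samples fade out without the stream ever being stored.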
    57. quantiles: a statistical distribution over the last five minutes using reservoir sampling
    58. quantiles: (->> (probe-channel :foo:enter) rate quantiles)
    59. quantiles: {50 5, 75 7.5, 95 9.5, 99 9.9, 99.9 9.99}
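Slide 57 mentions reservoir sampling as the basis for `quantiles`. The sketch below shows the textbook single-pass version (often called Algorithm R) plus a simple rank-based quantile lookup, in plain Clojure. The names `reservoir-sample` and `quantile` are hypothetical, the five-minute window from the slide is not modeled, and nothing here is taken from lamina's source.

```clojure
(defn reservoir-sample
  "Algorithm R: a uniform random sample of at most k items from xs,
   built in one pass with memory proportional to k."
  [^java.util.Random rng k xs]
  (first
    (reduce
      (fn [[sample seen] x]
        (let [seen (inc seen)]
          (if (<= seen k)
            ;; Fill the reservoir with the first k items.
            [(conj sample x) seen]
            ;; Afterwards, item number `seen` replaces a random slot
            ;; with probability k/seen.
            (let [j (.nextInt rng (int seen))]
              [(if (< j k) (assoc sample j x) sample) seen]))))
      [[] 0]
      xs)))

(defn quantile
  "Rank-based estimate of the q-th quantile (0 <= q <= 1) of a
   non-empty sample."
  [sample q]
  (let [sorted (vec (sort sample))
        idx    (min (dec (count sorted))
                    (int (* q (count sorted))))]
    (nth sorted idx)))

;; 100 values sampled from 0..999, then summarized.
(def sample (reservoir-sample (java.util.Random. 42) 100 (range 1000)))
(into {} (map (fn [q] [q (quantile sample q)])) [0.5 0.75 0.95 0.99])
```

The appeal for event streams is that the sample stays a fixed size no matter how many events pass through, at the cost of the quantiles being estimates.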
    60. outliers: a filtered view of the data stream that only emits unusual events
    61. outliers: (->> (probe-channel :foo:return) (outliers :duration))
    62. it’s easy to merge streams of data, but hard to split them apart
    63. distributor: creates sub-streams grouped by facet, and applies identical operations to each stream
    64. distributor: (distributor :uri (fn [facet ch] (->> ch (close-on-idle 10000) rate (map* #(vector facet %)))))
    65. distributor
    66. aggregate: merges multiple periodic streams
    67. aggregate: (aggregate first ch)
    68. aggregate: ["/" 1] ["/abc" 1] ["/def" 1]
    69. aggregate: ["/" 1] ["/abc" 1] ["/def" 1] ["/" 1]
    70. aggregate: {"/" ["/" 1] "/abc" ["/abc" 1] "/def" ["/def" 1]} ["/" 1]
    71. distribute-aggregate: (distribute-aggregate :uri (fn [facet ch] (rate ch)) ch)
    72. distribute-aggregate: {"/" 1 "/abc" 1 "/def" 1}
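Slides 63 to 72 split a stream by facet, apply the same operation to each sub-stream, and merge the results into one map keyed by facet. A batch analogue over a finite collection can be sketched with `group-by` in plain Clojure. This is only an analogy: lamina's versions work on live channels and emit periodically, while `group-and-summarize` and the sample `requests` data below are hypothetical. It assumes Clojure 1.7 or later for the transducer arity of `into`.

```clojure
(defn group-and-summarize
  "Groups events by (facet event), then calls (f facet-value group) on
   each group and returns a map of facet-value -> result."
  [facet f events]
  (into {}
        (map (fn [[k group]] [k (f k group)]))
        (group-by facet events)))

;; Made-up request events for illustration.
(def requests
  [{:uri "/"    :bytes 10}
   {:uri "/abc" :bytes 5}
   {:uri "/"    :bytes 7}])

;; Count of events per URI.
(group-and-summarize :uri (fn [_ group] (count group)) requests)
;; => {"/" 2, "/abc" 1}

;; Total bytes per URI.
(group-and-summarize :uri (fn [_ group] (reduce + (map :bytes group))) requests)
;; => {"/" 17, "/abc" 5}
```

As on slide 71, the per-group function receives the facet value as its first argument, so it can be ignored or folded into the result as needed.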
    73. (distribute-aggregate :name (fn [_ ch] (->> ch (map* :duration) sum)) (probe-channel :ex))
    74. aleph: “the only place on earth where all places are -- seen from every angle, each standing clear, without any confusion or blending.” - Jorge Luis Borges, The Aleph
    75. traffic: {:name "http-server" :remote-address "127.0.0.1" :bytes 42}
    76. (distribute-aggregate :name (fn [_ ch] (->> ch (map* :bytes) sum)) (select-probes "*traffic:out"))
    77. (distribute-aggregate :remote-address (fn [_ ch] (->> ch (close-on-idle 10000) (map* :bytes) sum)) (select-probes "*traffic:out"))
    78. what’s missing • further examples • inter-process aggregation • data endpoints/tooling
    79. questions?