Messaging,
interoperability and log
  aggregation - a new
      framework

   Tomas Doran (t0m) <bobtfish@bobtfish.net>
Sponsored by
•   state51
    •   PB of MogileFS storage, 100+ boxes
    •   > 4 million tracks on-demand via API
    •   > 400 reqs/s per server, > 1Gb peak from backhaul
•   Suretec VOIP Systems
    •   UK voice-over-IP provider
    •   Extensive API, including WebHooks for notifications
•   TIM Group
    •   “Alpha capture” applications
    •   Java / Scala / Clojure / Ruby / Puppet / Python / Perl
What?
• This talk is about my new perl library:
  Message::Passing
Why?
• I’d better stop, and explain a specific
  problem.
• The solution that grew out of this is more
  generic.
• But it illustrates my concerns and design
  choices well.
• And everyone likes a story, right?
Once upon a time...


• I was bored of tailing log files across dozens
  of servers
• splunk was amazing, but unaffordable
Logstash
Centralised logging
• Syslog isn’t good enough
 • UDP is lossy, TCP not much better
 • Limited fields
 • No structure to actual message
 • RFC3164 - “This document describes the
    observed behaviour of the syslog protocol”
Centralised logging
• Syslog isn’t good enough
• Structured app logging
 • We want to log data, rather than text
    from our application
 • E.g. HTTP request - vhost, path, time to
    generate, N db queries etc..
Centralised logging
• Syslog isn’t good enough
• Structured app logging
• Post-process log files to re-structure
 • Cases we do not control (e.g. apache)
 • SO MANY DATE FORMATS. ARGHH!!
So many date formats:

• Apache: [27/Jun/2012:23:57:03 +0000]
• ElasticSearch: [2012-06-26 02:08:26,879]
• RabbitMQ: 26-Jun-2012::16:18:30
• MongoDB: Thu Jun 28 01:02:29
• Syslog: Jun 28 00:17:26
• .Net ‘tick’: 634763158360000000
  (100 ns intervals since 1st Jan 1AD - except those that are from 3rd Jan)
• MySQL: 120404 12:31:04
Aaaaaaannnnyyyway...

• Please use ISO8601
• or epochseconds
• or epochmicroseconds
• In UTC!
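The three recommended forms are easy to emit from any language. A quick sketch (in Python, for illustration), using the syslog timestamp from the examples above as a fixed instant:

```python
from datetime import datetime, timezone

# A fixed example instant, timezone-aware in UTC.
now = datetime(2012, 6, 28, 0, 17, 26, tzinfo=timezone.utc)

iso8601  = now.strftime("%Y-%m-%dT%H:%M:%SZ")  # ISO8601, in UTC
epoch_s  = int(now.timestamp())                 # epochseconds
epoch_us = int(now.timestamp() * 1_000_000)     # epochmicroseconds

print(iso8601)   # → 2012-06-28T00:17:26Z
print(epoch_s)   # → 1340842646
```

All three sort correctly and parse unambiguously, which none of the formats above manage.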
Centralised logging
• Syslog isn’t good enough
• Structured app logging
• Post-process log files to re-structure
• Publish logs as JSON to a message queue
 • JSON is fast, and widely supported
 • Great for arbitrary structured data!
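For instance, the HTTP-request case above becomes one JSON document per request. A sketch (field names here are hypothetical, not a fixed schema):

```python
import json

# A structured HTTP-request event: log data, not text.
event = {
    "@timestamp": "2012-06-28T00:17:26Z",
    "vhost": "www.example.com",
    "path": "/tracks/search",
    "status": 200,
    "elapsed_ms": 83,
    "db_queries": 4,
}
line = json.dumps(event, sort_keys=True)
print(line)
```

Any consumer in any language can pick this apart without regexes.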
Message queue
• Flattens load spikes!
• Only have to keep up with average message
  volume, not peak volume.
• Logs are bursty! (Peak rate 1000x average.)
• Easy to scale - just add more consumers
• Allows smart routing
• Great as a common integration point.
elasticsearch
• Just tip JSON documents into it
• Figures out type for each field, indexes
  appropriately.
• Free sharding and replication
• Histograms!
Logstash
In JRuby, by Jordan Sissel

Simple: Input → Filter → Output

Flexible
Extensible
Plays well with others
Nice web interface
Logstash IS MASSIVE
440MB IDLE!
Logstash on each host
   is totally out...
• Running it on elasticsearch servers which
  are already dedicated to this is fine.
• I’d still like to reuse all of its parsing
• How about I just log to AMQP from my
  app?
• Doooom!
ZeroMQ has the
    correct semantics
• Pub/Sub sockets
• Never, ever blocking
• Lossy! (If needed)
• Buffer sizes / locations configurable
• Arbitrary message size
• IO done in a background thread
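The "never blocking, lossy if needed" behaviour is ZeroMQ's high-water mark: once the send buffer fills, PUB sockets drop rather than stall the application. A conceptual sketch of that semantic using only a bounded stdlib queue (not ZeroMQ itself):

```python
import queue

# "Buffer size configurable" -- here, a high-water mark of 3 messages.
buf = queue.Queue(maxsize=3)

def lossy_send(msg):
    """Never blocks: enqueue if there is room, otherwise drop."""
    try:
        buf.put_nowait(msg)
        return True
    except queue.Full:
        return False  # lossy, if needed

sent = [lossy_send(f"msg-{i}") for i in range(5)]
print(sent)  # → [True, True, True, False, False]
```

The application thread stays fast no matter how slow (or absent) the consumer is, which is exactly what you want from in-process logging.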
On host log collector
• ZeroMQ SUB socket
 • App logs - pre structured
• Syslog listener
 • Forward rsyslogd
• Log file tailer
• Ship to AMQP
This talk
• Is about my new library: Message::Passing
• The clue is in the name...
• Hopefully really simple
• Maybe even useful!
• Definitely small - you can replace / rewrite
  it easily.
Let’s make it generic!
• So, I wanted a log shipper
• I ended up with a framework for messaging
  interoperability
• Whoops!
• Got sick of writing scripts...
Does this actually
         work?
• YES - In production at four sites for me.
• Some of the adaptors are partially
  complete
• Dumber than logstash - no multiple
  threads/cores
• ZeroMQ is insanely fast
Other people are using
   it in production!



Two people I know of have
already written adaptors!
Events - my model for
   message passing
• a hash {}
• Output consumes events:
 • method consume ($event) { ...
• Input produces events:
 • has output_to => (..
• Filter does both
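The whole model fits in a few lines. A sketch in Python (not the actual Perl API - class and field names here are invented for illustration):

```python
class Stash:
    """An Output: consumes events."""
    def __init__(self):
        self.events = []
    def consume(self, event):
        self.events.append(event)

class AddHostname:
    """A Filter: consumes events and produces modified ones."""
    def __init__(self, output_to):
        self.output_to = output_to
    def consume(self, event):
        event = dict(event, hostname="web01")  # hypothetical enrichment
        self.output_to.consume(event)

class Generator:
    """An Input: produces events into output_to."""
    def __init__(self, output_to):
        self.output_to = output_to
    def emit(self, event):
        self.output_to.consume(event)

out = Stash()
chain = Generator(AddHostname(out))
chain.emit({"message": "hello"})
print(out.events)  # → [{'message': 'hello', 'hostname': 'web01'}]
```

Everything composes by passing an output to whatever sits upstream of it.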
Simplifying assumption

$self->output_to->consume($message)
That’s it.
• No, really - that’s all the complexity you
  have to care about!
• Except for the complexity introduced by
  the inputs and outputs you use.
• Unified attribute names / reconnection
  model, etc. This helps, somewhat.
Inputs and outputs
•   ZeroMQ In / Out
•   AMQP (RabbitMQ) In / Out
•   STOMP (ActiveMQ) In / Out
•   elasticsearch Out
•   Redis PubSub In/Out
•   Syslog In
•   MongoDB Out
•   Collectd In/Out
•   HTTP POST (“WebHooks”) Out
•   UDP packets In/Out (e.g. statsd)
DSL
•   Makes building more complex
    chains easy!
• Multiple inputs
• Multiple outputs
• Multiple independent chains
CLI

• 1 Input
• 1 Output
• 1 Filter (default Null)

• For simple use, or testing.
CLI



• Encode / Decode step is just a Filter
• JSON by default
• Supply command line, or config file
• Daemon features
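Treating decode as just another Filter keeps the pipeline uniform. A sketch of the idea (class names are illustrative, not the real Message::Passing filters):

```python
import json

class JSONDecode:
    """A Filter: consumes raw JSON strings, produces decoded hashes."""
    def __init__(self, output_to):
        self.output_to = output_to
    def consume(self, raw):
        self.output_to.consume(json.loads(raw))

class Collect:
    """A trivial Output for demonstration."""
    def __init__(self):
        self.got = []
    def consume(self, event):
        self.got.append(event)

sink = Collect()
JSONDecode(sink).consume('{"message": "hi", "level": "info"}')
print(sink.got[0]["level"])  # → info
```

Swapping serialization formats then means swapping one filter, nothing else in the chain changes.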
The dist:
      Message::Passing
• Core dist supplies CLI, DSL, roles for reuse.
• Adaptors for most protocols in other
  modules.
• Moo based - small footprint, can be
  fatpacked (no XS dependencies).
• Moose compatible.
Example?

message-pass --input STDIN --output STDOUT
{}
{}
Less trivial example
message-pass --input ZeroMQ --input_options
'{"socket_bind":"tcp://*:5222"}'
--output STDOUT

message-pass --output ZeroMQ --output_options
'{"connect":"tcp://127.0.0.1:5222"}'
--input STDIN
Across the network:
Multiple subscribers
Jenga?
Jenga:
message-pass --input STDIN --output STOMP --output_options
'{"destination":"/queue/foo","hostname":"localhost", "port":"6163", "username":"guest",
"password":"guest"}'

message-pass --input STOMP --output Redis --input_options
'{"destination":"/queue/foo","hostname":"localhost","port":"6163","username":"guest","password":"guest"}'
--output_options '{"topic":"foo","hostname":"127.0.0.1","port":"6379"}'

message-pass --input Redis --output AMQP --input_options '{"topics":
["foo"],"hostname":"127.0.0.1","port":"6379"}' --output_options
'{"hostname":"127.0.0.1","username":"guest","password":"guest",
"exchange_name":"foo"}'

message-pass --input AMQP --output STDOUT --input_options
'{"hostname":"127.0.0.1", "username":"guest", "password":"guest",
"exchange_name":"foo","queue_name":"foo"}'
Jenga!
Example 4?
• The last example wasn’t silly enough!
• How could I top that?
• Plan - Re-invent mongrel2
• Badly
PSGI
• PSGI $env is basically just a hash.
• (With a little fiddling), you can serialize it as
  JSON
• PSGI response is just an array.
• Ignore streaming responses!
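The "little fiddling" is mostly that a PSGI env contains a few non-scalar values (filehandles like psgi.input) that JSON can't carry. A sketch of one way to strip them before encoding (the env keys are real PSGI keys; the filtering rule is an assumption for illustration):

```python
import json

env = {
    "REQUEST_METHOD": "GET",
    "PATH_INFO": "/",
    "SERVER_NAME": "localhost",
    "psgi.input": object(),    # stand-in for a filehandle
    "psgi.errors": object(),   # likewise
}

# Keep only plain scalars; drop filehandles and other objects.
serializable = {k: v for k, v in env.items()
                if isinstance(v, (str, int, float, bool, type(None)))}
wire = json.dumps(serializable, sort_keys=True)
print(wire)
```

The handler process rebuilds whatever it needs of the dropped keys on its side.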
Demo?
plackup -E production -s Twiggy -MPlack::App::Message::Passing
-e'Plack::App::Message::Passing->new(return_address =>
"tcp://127.0.0.1:5555", send_address =>
"tcp://127.0.0.1:5556")->to_app'

plackup -E production -s Message::Passing testapp.psgi --host
127.0.0.1 --port 5556
PUSH socket does fan-out
between multiple handlers.

Reply-to address
embedded in request.

Run multiple ‘handler’
processes. Hot restarts,
hot add / remove workers.
Other applications

• Anywhere an asynchronous event stream is
  useful!
• Monitoring
• Metrics transport
• Queued jobs - worker pool
Other applications
              (Web stuff)

• User activity (ajax ‘what are your users
  doing’)
• WebSockets / MXHR
• HTTP Push notifications - “WebHooks”
WebHooks


• HTTP PUSH notification
• E.g. Paypal IPN
• Shopify API
What about logstash?

• Use my lightweight code on end nodes.
• Use logstash for parsing/filtering on the
  dedicated hardware (elasticsearch boxes)
• Filter to change my hashes to
  logstash-compatible hashes
  • For use with MooseX::Storage and/or
    Log::Message::Structured
Interoperating - a real
       example
• Log JSON events out of apps (in multiple
  languages) to ZMQ
• Collect and munge with Message::Passing
  script ‘logcollector’
• Send to central logstash
• Send onto statsd to aggregate
• Graphs in graphite
Standard log message
Standard event message
TimedWebRequest
• A standard event
• Page generation time, URI, HTTP status
statsd
• Rolls up counters and timers into metrics
• One bucket per stat, emits values every 10
  seconds
• Counters: Request rate, HTTP status rate
• Timers: Total page time, mean page time,
  min/max page times
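The rollup itself is simple arithmetic. A sketch of a statsd-style flush (stat names and the exact derived metrics are illustrative, not statsd's wire format):

```python
# One bucket per stat, accumulated over the flush interval.
counters = {"requests": 4, "status.200": 3, "status.404": 1}
timers = {"page_time_ms": [12, 80, 45, 33]}
FLUSH_INTERVAL = 10  # seconds

metrics = {}
# Counters become rates per second.
for name, count in counters.items():
    metrics[name + ".rate"] = count / FLUSH_INTERVAL
# Timers become summary statistics.
for name, values in timers.items():
    metrics[name + ".mean"] = sum(values) / len(values)
    metrics[name + ".min"] = min(values)
    metrics[name + ".max"] = max(values)

print(metrics["page_time_ms.mean"])  # → 42.5
```

Each flush emits one small set of numbers regardless of request volume, which is what makes the graphite graphs cheap.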
Code

• https://metacpan.org/module/
  Message::Passing
• https://github.com/suretec/Message-Passing
• #message-passing on irc.perl.org
• Examples: git://github.com/2941747.git

Message::Passing - LPW 2012

  • 1.
    Messaging, interoperability and log aggregation - a new framework Tomas Doran (t0m) <bobtfish@bobtfish.net>
  • 2.
    Sponsored by • state51 • Pb of mogilefs, 100+ boxes. • > 4 million tracks on-demand via API • > 400 reqs/s per server, >1Gb peak from backhaul • Suretec VOIP Systems • UK voice over IP provider • Extensive API, including WebHooks for notifications • TIM Group • “Alpha capture” applications • Java / Scala / Clojure / ruby / puppet / python / perl
  • 3.
    What? • This talkis about my new perl library: Message::Passing
  • 4.
  • 5.
    Why? • I’d betterstop, and explain a specific problem.
  • 6.
    Why? • I’d betterstop, and explain a specific problem. • The solution that grew out of this is more generic.
  • 7.
    Why? • I’d betterstop, and explain a specific problem. • The solution that grew out of this is more generic. • But it illustrates my concerns and design choices well.
  • 8.
    Why? • I’d betterstop, and explain a specific problem. • The solution that grew out of this is more generic. • But it illustrates my concerns and design choices well. • And everyone likes a story, right?
  • 9.
    Once upon atime... • I was bored of tailing log files across dozens of servers
  • 10.
    Once upon atime... • I was bored of tailing log files across dozens of servers • splunk was amazing, but unaffordable
  • 11.
  • 12.
  • 13.
    Centralised logging • Syslogisn’t good enough
  • 14.
    Centralised logging • Syslogisn’t good enough • UDP is lossy, TCP not much better
  • 15.
    Centralised logging • Syslogisn’t good enough • UDP is lossy, TCP not much better • Limited fields
  • 16.
    Centralised logging • Syslogisn’t good enough • UDP is lossy, TCP not much better • Limited fields • No structure to actual message
  • 17.
    Centralised logging • Syslogisn’t good enough • UDP is lossy, TCP not much better • Limited fields • No structure to actual message • RFC3164 - “This document describes the observed behaviour of the syslog protocol”
  • 18.
    Centralised logging • Syslogisn’t good enough • Structured app logging
  • 19.
    Centralised logging • Syslogisn’t good enough • Structured app logging • We want to log data, rather than text from our application
  • 20.
    Centralised logging • Syslogisn’t good enough • Structured app logging • We want to log data, rather than text from our application • E.g. HTTP request - vhost, path, time to generate, N db queries etc..
  • 21.
    Centralised logging • Syslogisn’t good enough • Structured app logging
  • 22.
    Centralised logging • Syslogisn’t good enough • Structured app logging • Post-process log files to re-structure
  • 23.
    Centralised logging • Syslogisn’t good enough • Structured app logging • Post-process log files to re-structure
  • 24.
    Centralised logging • Syslogisn’t good enough • Structured app logging • Post-process log files to re-structure • Cases we do not control (e.g. apache)
  • 25.
    Centralised logging • Syslogisn’t good enough • Structured app logging • Post-process log files to re-structure • Cases we do not control (e.g. apache) • SO MANY DATE FORMATS. ARGHH!!
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
    .Net ‘tick’ 634763158360000000 100 ns from 1st Jan 1AD (Except those that are from 3rd Jan)
  • 37.
  • 38.
  • 39.
  • 41.
    Aaaaaaannnnyyyway... • Please useISO8601 • or epochseconds • or epochmicroseconds • In UTC!
  • 42.
    Centralised logging • Syslogisn’t good enough • Structured app logging • Post-process log files to re-structure • Publish logs as JSON to a message queue
  • 43.
    Centralised logging • Syslogisn’t good enough • Structured app logging • Post-process log files to re-structure • Publish logs as JSON to a message queue • JSON is fast, and widely supported
  • 44.
    Centralised logging • Syslogisn’t good enough • Structured app logging • Post-process log files to re-structure • Publish logs as JSON to a message queue • JSON is fast, and widely supported • Great for arbitrary structured data!
  • 45.
  • 46.
  • 47.
    Message queue • Flattensload spikes! • Only have to keep up with average message volume, not peak volume.
  • 48.
    Message queue • Flattensload spikes! • Only have to keep up with average message volume, not peak volume. • Logs are bursty! (Peak rate 1000x average.)
  • 49.
    Message queue • Flattensload spikes! • Only have to keep up with average message volume, not peak volume. • Logs are bursty! (Peak rate 1000x average.) • Easy to scale - just add more consumers
  • 50.
    Message queue • Flattensload spikes! • Only have to keep up with average message volume, not peak volume. • Logs are bursty! (Peak rate 1000x average.) • Easy to scale - just add more consumers • Allows smart routing
  • 51.
    Message queue • Flattensload spikes! • Only have to keep up with average message volume, not peak volume. • Logs are bursty! (Peak rate 1000x average.) • Easy to scale - just add more consumers • Allows smart routing • Great as a common integration point.
  • 52.
  • 53.
    elasticsearch • Just tipJSON documents into it
  • 54.
    elasticsearch • Just tipJSON documents into it • Figures out type for each field, indexes appropriately.
  • 55.
    elasticsearch • Just tipJSON documents into it • Figures out type for each field, indexes appropriately. • Free sharding and replication
  • 56.
    elasticsearch • Just tipJSON documents into it • Figures out type for each field, indexes appropriately. • Free sharding and replication • Histograms!
  • 57.
    Logstash InJRuby, by Jordan Sissel Input Simple: Filter Output Flexible Extensible Plays well with others Nice web interface
  • 59.
  • 60.
    Logstash IS MASSIVE
  • 61.
  • 62.
    Logstash on eachhost is totally out...
  • 63.
    Logstash on eachhost is totally out... • Running it on elasticsearch servers which are already dedicated to this is fine..
  • 64.
    Logstash on eachhost is totally out... • Running it on elasticsearch servers which are already dedicated to this is fine.. • I’d still like to reuse all of it’s parsing
  • 65.
    Logstash on eachhost is totally out... • Running it on elasticsearch servers which are already dedicated to this is fine.. • I’d still like to reuse all of it’s parsing • How about I just log to AMQP from my app?
  • 66.
    Logstash on eachhost is totally out... • Running it on elasticsearch servers which are already dedicated to this is fine.. • I’d still like to reuse all of it’s parsing • How about I just log to AMQP from my app? • Doooom!
  • 67.
  • 68.
    ZeroMQ has the correct semantics • Pub/Sub sockets
  • 69.
    ZeroMQ has the correct semantics • Pub/Sub sockets • Never, ever blocking
  • 70.
    ZeroMQ has the correct semantics • Pub/Sub sockets • Never, ever blocking • Lossy! (If needed)
  • 71.
    ZeroMQ has the correct semantics • Pub/Sub sockets • Never, ever blocking • Lossy! (If needed) • Buffer sizes / locations configureable
  • 72.
    ZeroMQ has the correct semantics • Pub/Sub sockets • Never, ever blocking • Lossy! (If needed) • Buffer sizes / locations configureable • Arbitrary message size
  • 73.
    ZeroMQ has the correct semantics • Pub/Sub sockets • Never, ever blocking • Lossy! (If needed) • Buffer sizes / locations configureable • Arbitrary message size • IO done in a background thread
  • 74.
    On host logcollector • ZeroMQ SUB socket • App logs - pre structured • Syslog listener • Forward rsyslogd • Log file tailer • Ship to AMQP
  • 75.
    On host logcollector
  • 76.
    This talk • Isabout my new library: Message::Passing • The clue is in the name...
  • 77.
    This talk • Isabout my new library: Message::Passing • The clue is in the name... • Hopefully really simple
  • 78.
    This talk • Isabout my new library: Message::Passing • The clue is in the name... • Hopefully really simple • Maybe even useful!
  • 79.
    This talk • Isabout my new library: Message::Passing • The clue is in the name... • Hopefully really simple • Maybe even useful! • Definitely small - you can replace / rewrite it easily.
  • 80.
    Lets make itgeneric! • So, I wanted a log shipper
  • 81.
    Lets make itgeneric! • So, I wanted a log shipper • I ended up with a framework for messaging interoperability
  • 82.
    Lets make itgeneric! • So, I wanted a log shipper • I ended up with a framework for messaging interoperability • Whoops!
  • 83.
    Lets make itgeneric! • So, I wanted a log shipper • I ended up with a framework for messaging interoperability • Whoops! • Got sick of writing scripts..
  • 84.
    Does this actually work? • YES - In production at four sites for me.
  • 85.
    Does this actually work? • YES - In production at four sites for me. • Some of the adaptors are partially complete
  • 86.
    Does this actually work? • YES - In production at four sites for me. • Some of the adaptors are partially complete • Dumber than logstash - no multiple threads/cores
  • 87.
    Does this actually work? • YES - In production at four sites for me. • Some of the adaptors are partially complete • Dumber than logstash - no multiple threads/cores • ZeroMQ is insanely fast
  • 88.
    Other people areusing it in production! Two people I know of already writing have already written adaptors!
  • 89.
    Events - mymodel for message passing
  • 90.
    Events - mymodel for message passing • a hash {}
  • 91.
    Events - mymodel for message passing • a hash {} • Output consumes events: • method consume ($event) { ...
  • 92.
    Events - mymodel for message passing • a hash {} • Output consumes events: • method consume ($event) { ... • Input produces events: • has output_to => (..
  • 93.
    Events - mymodel for message passing • a hash {} • Output consumes events: • method consume ($event) { ... • Input produces events: • has output_to => (.. • Filter does both
  • 94.
  • 95.
  • 96.
  • 97.
    That’s it. • No,really - that’s all the complexity you have to care about!
  • 98.
    That’s it. • No,really - that’s all the complexity you have to care about! • Except for the complexity introduced by the inputs and outputs you use.
  • 99.
    That’s it. • No,really - that’s all the complexity you have to care about! • Except for the complexity introduced by the inputs and outputs you use. • Unified attribute names / reconnection model, etc.. This helps, somewhat..
  • 100.
    Inputs and outputs • ZeroMQ In / Out • AMQP (RabbitMQ) In / Out • STOMP (ActiveMQ) In / Out • elasticsearch Out • Redis PubSub In/Out • Syslog In • MongoDB Out • Collectd In/Out • HTTP POST (“WebHooks”) Out • UDP packets In/Out (e.g. statsd)
  • 101.
    DSL • Building more complex chains easy! • Multiple inputs • Multiple outputs • Multiple independent chains
  • 102.
    CLI • 1 Input •1 Output • 1 Filter (default Null) • For simple use, or testing.
  • 103.
    CLI • Encode /Decode step is just a Filter • JSON by default • Supply command line, or config file • Daemon features
  • 104.
    The dist: Message::Passing • Core dist supplies CLI, DSL, roles for reuse.
  • 105.
    The dist: Message::Passing • Core dist supplies CLI, DSL, roles for reuse. • Adaptors for most protocols in other modules.
  • 106.
    The dist: Message::Passing • Core dist supplies CLI, DSL, roles for reuse. • Adaptors for most protocols in other modules. • Moo based - small footprint, can be fatpacked (no XS dependencies).
  • 107.
    The dist: Message::Passing • Core dist supplies CLI, DSL, roles for reuse. • Adaptors for most protocols in other modules. • Moo based - small footprint, can be fatpacked (no XS dependencies). • Moose compatible.
  • 108.
  • 110.
    Less trivial example message-pass--input ZeroMQ --input_options ‘{“socket_bind”:”tcp://*:5222”}’ --output STDOUT message-pass --output ZeroMQ --output_options ‘{“connect”:”tcp://127.0.0.1:5222”}’ --input STDIN
  • 111.
  • 112.
  • 113.
  • 114.
    Jenga: message-pass --input STDIN--output STOMP --output_options '{"destination":"/queue/foo","hostname":"localhost", "port":"6163", "username":"guest", "password":"guest"}' message-pass --input STOMP --output Redis --input_options '{"destination":"/queue/ foo", "hostname":"localhost","port":"6163","username":"guest","password":"guest"}' --output_options '{"topic":"foo","hostname":"127.0.0.1","port":"6379"}' message-pass --input Redis --output AMQP --input_options '{"topics": ["foo"],"hostname":"127.0.0.1","port":"6379"}' --output_options '{"hostname":"127.0.0.1","username":"guest","password":"guest", "exchange_name":"foo"}' message-pass --input AMQP --output STDOUT --input_options '{"hostname":"127.0.0.1", "username":"guest", "password":"guest", "exchange_name":"foo","queue_name":"foo"}'
  • 115.
  • 116.
    Example 4? • Thelast example wasn’t silly enough!
  • 117.
    Example 4? • Thelast example wasn’t silly enough! • How could I top that?
  • 118.
    Example 4? • Thelast example wasn’t silly enough! • How could I top that? • Plan - Re-invent mongrel2
  • 119.
    Example 4? • Thelast example wasn’t silly enough! • How could I top that? • Plan - Re-invent mongrel2 • Badly
  • 120.
    PSGI • PSGI $envis basically just a hash.
  • 121.
    PSGI • PSGI $envis basically just a hash. • (With a little fiddling), you can serialize it as JSON
  • 122.
  • 123.
    PSGI • PSGI $envis basically just a hash.
  • 124.
    PSGI • PSGI $envis basically just a hash. • (With a little fiddling), you can serialize it as JSON
  • 125.
    PSGI • PSGI $envis basically just a hash. • (With a little fiddling), you can serialize it as JSON • PSGI response is just an array.
  • 126.
    PSGI • PSGI $envis basically just a hash. • (With a little fiddling), you can serialize it as JSON • PSGI response is just an array. • Ignore streaming responses!
  • 127.
    Demo? plackup -E production-s Twiggy -MPlack::App::Message::Passing -e'Plack::App::Message::Passing->new(return_address => "tcp://127.0.0.1:5555", send_address => "tcp://127.0.0.1:5556")->to_app' plackup -E production -s Message::Passing testapp.psgi --host 127.0.0.1 --port 5556
  • 128.
    PUSH socket doesfan out between multiple handlers. Reply to address embedded in request Run multiple ‘handler’ processes. Hot restarts, hot add / remove workers
  • 129.
    Other applications • Anywherean asynchronous event stream is useful! • Monitoring • Metrics transport • Queued jobs - worker pool
  • 130.
    Other applications (Web stuff) • User activity (ajax ‘what are your users doing’) • WebSockets / MXHR • HTTP Push notifications - “WebHooks”
  • 131.
    WebHooks • HTTP PUSHnotification • E.g. Paypal IPN • Shopify API
  • 132.
    What about logstash? •Use my lightweight code on end nodes. • Use logstash for parsing/filtering on the dedicated hardware (elasticsearch boxes) • Filter to change my hashes to logstash compatible hashes • For use with MooseX::Storage and/or Log::Message::Structured
  • 133.
    Interoperating - a real example
    Interoperating - a real example • Log JSON events out of apps (in multiple languages) to ZMQ
    Interoperating - a real example • Log JSON events out of apps (in multiple languages) to ZMQ • Collect and munge with Message::Passing script ‘logcollector’
    Interoperating - a real example • Log JSON events out of apps (in multiple languages) to ZMQ • Collect and munge with Message::Passing script ‘logcollector’ • Send to central logstash
    Interoperating - a real example • Log JSON events out of apps (in multiple languages) to ZMQ • Collect and munge with Message::Passing script ‘logcollector’ • Send to central logstash • Send onto statsd to aggregate
    Interoperating - a real example • Log JSON events out of apps (in multiple languages) to ZMQ • Collect and munge with Message::Passing script ‘logcollector’ • Send to central logstash • Send onto statsd to aggregate • Graphs in graphite
    TimedWebRequest • A standard event • Page generation time, URI, HTTP status
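A hedged sketch of what such an event might look like on the wire - the field names used here (class, uri, status, time) are illustrative, not the actual TimedWebRequest schema:

```perl
use strict;
use warnings;
use JSON::PP;
use Time::HiRes qw(time);

# Hypothetical TimedWebRequest-style event: page generation time,
# URI and HTTP status, emitted as one JSON hash so downstream
# consumers (logstash, statsd) can pick the fields apart.
my $start = time;
# ... generate the page here ...
my $event = {
    class  => 'TimedWebRequest',                   # lets filters route by type
    uri    => '/some/page',
    status => 200,
    time   => 0 + sprintf('%.3f', time - $start),  # seconds elapsed
};
print JSON::PP->new->canonical->encode($event), "\n";
```

Because it is just a JSON hash, the same event can be produced from any of the deck's other languages and still flow through the same pipeline.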
    statsd • Rolls up counters and timers into metrics
    statsd • Rolls up counters and timers into metrics • One bucket per stat, emits values every 10 seconds
    statsd • Rolls up counters and timers into metrics • One bucket per stat, emits values every 10 seconds • Counters: Request rate, HTTP status rate
    statsd • Rolls up counters and timers into metrics • One bucket per stat, emits values every 10 seconds • Counters: Request rate, HTTP status rate • Timers: Total page time, mean page time, min/max page times
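The statsd wire format behind those counters and timers is plain text: "name:value|c" for counters, "name:value|ms" for timers, one metric per UDP datagram. A sketch of how a TimedWebRequest-style event could be flattened into statsd metrics - the metric names here are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: flatten one request event into statsd lines.
# "|c" marks a counter (request rate, HTTP status rate), "|ms" a
# timer (page generation time) - statsd rolls these up per bucket.
sub statsd_lines {
    my ($event) = @_;   # e.g. a decoded TimedWebRequest-ish hash
    return (
        "web.requests:1|c",                                 # request rate
        "web.status.$event->{status}:1|c",                  # HTTP status rate
        sprintf("web.page_time:%d|ms", $event->{time_ms}),  # page time
    );
}

my @lines = statsd_lines({ status => 200, time_ms => 320 });
print "$_\n" for @lines;
# web.requests:1|c
# web.status.200:1|c
# web.page_time:320|ms
```

In practice each line would be sent to the statsd daemon over a UDP socket (e.g. via IO::Socket::INET); it is fire-and-forget, which suits the lossy-but-cheap metrics use case.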
    Code • https://metacpan.org/module/Message::Passing • https://github.com/suretec/Message-Passing • #message-passing on irc.perl.org • Examples: git://github.com/2941747.git

Editor's Notes

  • #3 Mention state51 are hiring in London. Mention TIM Group are hiring in London/Boston.
  • #4 But, before I talk about perl at you, I’m going to go off on a tangent...
  • #5 I wrote code. And writing code is never something to be proud of; at least if your code looks like mine it isn’t... So I’d better justify this hubris somehow...
  • #11 Isn’t he cute? And woody! Who knows what this is?
  • #21 MooseX::Storage! This isn’t mandatory - you can just log plain hashes if you’re concerned about performance. SPOT THE TYPO
  • #44 No, really - JSON::XS is lightning fast
  • #50 Most message queues have bindings in most languages... So by abstracting message routing out of your application, and passing JSON hashes - you are suddenly nicely cross-language!
  • #60 Very simple model - input (pluggable), filtering (pluggable by type) in C, output (pluggable). Lots of backends - AMQP and elasticsearch + syslog and many others. Pre-built parser library for various line-based log formats. Comes with a web app for searches... Everything I need!
  • #61 And it has an active community. This is the alternate viewer app.
  • #62 Let’s take a simple case here - I’ll shove my apache logs from N servers into elasticsearch. I run a logstash on each host (writer), and one on each elasticsearch server (reader).
  • #63 First problem...
  • #64 Well then, I’m not going to be running this on the end nodes.
  • #65 Has a whole library of pre-built parsers for common log formats. Also, as noted, it’s faster, and notably it’s multi-threaded, so it’ll use multiple cores.
  • #69 The last point here is most important - ZMQ networking works entirely in a background thread perl knows nothing about, which means that you can asynchronously ship messages with no changes to your existing codebase.
  • #75 Yes, this could still be ‘a script’, in fact I did that at first... But I now have 3 protocols, who’s to say I won’t want a 4th...
  • #76 Note the fact that we have a cluster of ES servers here. And we have two log indexers. You can cluster RabbitMQ also. Highly reliable solution (against machine failure). Highly scalable solution (just add ES servers). We use RabbitMQ as this also allows someone to tap a part of the log stream; could just use ZMQ throughout.
  • #77 At the same time, I want something that can be used for real work (i.e. not just a toy).
  • #82 I had a log shipper script. A log indexer script. An alerting (nagios) script. An irc notification script.
  • #86 By insanely fast, I mean I can generate, encode as JSON, send, receive, and decode as JSON over 25k messages a second. On this 3-year-old macbook...
  • #91 Filters are just a combination of input and output.
  • #95 So the input has an output, that output always has a consume method... TADA!
  • #96 You can build a “chain” of events. This can work either way around. The input can be a log file, the output can be a message queue (publisher). Input can be a message queue, output can be a log file (consumer).
  • #97 The docs still suck, sorry - I have tried ;)
  • #100 All of these are on CPAN already.
  • #101 DSL - Domain specific language. Try to make writing scripts really simple.
  • #102 But you shouldn’t have to write ANY code to play around.
  • #108 Demo 1: Simple demo of the CLI in one process (STDOUT/STDIN).
  • #111 Less simple demo - let’s actually pass messages between two processes. Arrows indicate message flow. ZeroMQ is a lightning bolt as it’s not quite so trivial...
  • #112 Demo PUBSUB and round robin...
  • #113 So, let’s play Jenga with message queues!
  • #115 I would have added ZeroMQ. Except then the diagram doesn’t fit on the page. I’ll leave this as an exercise for the reader!
  • #129 I’ll talk a very little more about webhooks.
  • #130 Error stream.