Message:Passing - lpw 2012
 

  • Mention state51 are hiring in London. Mention Tim Group are hiring in London/Boston.
  • But, before I talk about perl at you, I’m going to go off on a tangent..
  • I wrote code. And writing code is never something to be proud of; at least if your code looks like mine it isn’t... So I’d better justify this hubris somehow..
  • Isn’t he cute? And woody! Who knows what this is?
  • MooseX::Storage! This isn’t mandatory - you can just log plain hashes if you’re concerned about performance. SPOT THE TYPO
  • No, really - JSON::XS is lightning fast
  • Most message queues have bindings in most languages.. So by abstracting message routing out of your application, and passing JSON hashes, you are suddenly nicely cross-language!
  • Very simple model - input (pluggable), filtering (pluggable by type) in C, output (pluggable). Lots of backends - AMQP and elasticsearch + syslog and many others. Pre-built parser library for various line-based log formats. Comes with a web app for searches.. Everything I need!
  • And it has an active community. This is the alternate viewer app..
  • Let’s take a simple case here - I’ll shove my apache logs from N servers into elasticsearch. I run a logstash on each host (writer), and one on each elasticsearch server (reader)..
  • First problem...
  • Well then, I’m not going to be running this on the end nodes.
  • Has a whole library of pre-built parsers for common log formats. Also, as noted, it’s faster, and notably it’s multi-threaded, so it’ll use multiple cores..
  • The last point here is most important - ZMQ networking works entirely in a background thread perl knows nothing about, which means that you can asynchronously ship messages with no changes to your existing codebase.
  • Yes, this could still be ‘a script’; in fact I did that at first... But I now have 3 protocols; who’s to say I won’t want a 4th..
  • Note the fact that we have a cluster of ES servers here. And we have two log indexers. You can cluster RabbitMQ also. Highly reliable solution (against machine failure). Highly scalable solution (just add ES servers). We use RabbitMQ as this also allows someone to tap a part of the log stream; you could just use ZMQ throughout.
  • At the same time, I want something that can be used for real work (i.e. not just a toy).
  • I had a log shipper script. A log indexer script. An alerting (nagios) script. An irc notification script.
  • By insanely fast, I mean I can generate, encode as JSON, send, receive, and decode as JSON over 25k messages a second. On this 3-year-old macbook..
  • Filters are just a combination of input and output.
  • So the input has an output, and that output always has a consume method... TADA!
  • You can build a “chain” of events. This can work either way around. The input can be a log file, the output can be a message queue (publisher). The input can be a message queue, the output can be a log file (consumer).
  • The docs still suck, sorry - I have tried ;)
  • All of these are on CPAN already.
  • DSL - Domain Specific Language. Try to make writing scripts really simple.
  • But you shouldn’t have to write ANY code to play around.
  • Demo 1: simple demo of the CLI in one process (STDOUT/STDIN).
  • Less simple demo - let’s actually pass messages between two processes. Arrows indicate message flow. ZeroMQ is a lightning bolt as it’s not quite so trivial..
  • Demo PUBSUB and round robin..
  • So, let’s play Jenga with message queues!
  • I would have added ZeroMQ, except then the diagram doesn’t fit on the page. I’ll leave this as an exercise for the reader!
  • I’ll talk a very little more about webhooks.
  • Error stream.

Message:Passing - lpw 2012 - Presentation Transcript

  • Messaging, interoperability and log aggregation - a new framework. Tomas Doran (t0m) <bobtfish@bobtfish.net>
  • Sponsored by
    • state51: Pb of mogilefs, 100+ boxes; > 4 million tracks on-demand via API; > 400 reqs/s per server, > 1Gb peak from backhaul
    • Suretec VOIP Systems: UK voice over IP provider; extensive API, including WebHooks for notifications
    • TIM Group: “Alpha capture” applications; Java / Scala / Clojure / ruby / puppet / python / perl
  • What?
    • This talk is about my new perl library: Message::Passing
  • Why?
    • I’d better stop, and explain a specific problem.
    • The solution that grew out of this is more generic.
    • But it illustrates my concerns and design choices well.
    • And everyone likes a story, right?
  • Once upon a time...
    • I was bored of tailing log files across dozens of servers
    • splunk was amazing, but unaffordable
  • Logstash
  • Centralised logging
    • Syslog isn’t good enough
      • UDP is lossy, TCP not much better
      • Limited fields
      • No structure to the actual message
      • RFC3164: “This document describes the observed behaviour of the syslog protocol”
    • Structured app logging
      • We want to log data, rather than text, from our application
      • E.g. HTTP request: vhost, path, time to generate, N db queries, etc.
    • Post-process log files to re-structure
      • Cases we do not control (e.g. apache)
      • SO MANY DATE FORMATS. ARGHH!!
  • Apache: [27/Jun/2012:23:57:03 +0000]
  • ElasticSearch: [2012-06-26 02:08:26,879]
  • RabbitMQ: 26-Jun-2012::16:18:30
  • MongoDB: Thu Jun 28 01:02:29
  • Syslog: Jun 28 00:17:26
  • .Net ‘tick’: 634763158360000000 - 100 ns intervals from 1st Jan 1AD (except those that are from 3rd Jan)
  • MySQL: 120404 12:31:04
  • Aaaaaaannnnyyyway...
    • Please use ISO8601
    • or epoch seconds
    • or epoch microseconds
    • In UTC!
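Both suggested formats need nothing beyond core Perl. A minimal sketch (the epoch value is just an example, chosen to match the Apache timestamp above):

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::HiRes ();

# ISO8601 in UTC from an epoch value - this one is the Apache
# example above, [27/Jun/2012:23:57:03 +0000].
my $iso8601 = strftime('%Y-%m-%dT%H:%M:%SZ', gmtime(1340841423));
print "$iso8601\n";    # 2012-06-27T23:57:03Z

# Or just log epoch microseconds directly:
my $epoch_us = int(Time::HiRes::time() * 1_000_000);
```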
    • Publish logs as JSON to a message queue
      • JSON is fast, and widely supported
      • Great for arbitrary structured data!
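A sketch of what “logs as JSON” looks like in practice with JSON::XS (the field names here are purely illustrative, not a schema the library imposes):

```perl
use strict;
use warnings;
use JSON::XS qw(encode_json decode_json);

# One structured log event for an HTTP request - a plain hash,
# encoded once, then shipped as an opaque string.
my $event = {
    epoch      => 1340841423,
    vhost      => 'www.example.com',
    path       => '/index.html',
    status     => 200,
    time_ms    => 42,
    db_queries => 3,
};
my $json = encode_json($event);

# Any consumer, in any language, gets the structure back intact.
my $copy = decode_json($json);
print "status: $copy->{status}\n";    # status: 200
```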
  • Message queue
    • Flattens load spikes!
    • Only have to keep up with average message volume, not peak volume.
    • Logs are bursty! (Peak rate 1000x average.)
    • Easy to scale - just add more consumers
    • Allows smart routing
    • Great as a common integration point.
  • elasticsearch
    • Just tip JSON documents into it
    • Figures out the type for each field, indexes appropriately.
    • Free sharding and replication
    • Histograms!
  • Logstash
    • In JRuby, by Jordan Sissel
    • Simple pipeline: Input, Filter, Output
    • Flexible, extensible, plays well with others
    • Nice web interface
  • Logstash IS MASSIVE
    • 440Mb IDLE!
  • Logstash on each host is totally out...
    • Running it on the elasticsearch servers, which are already dedicated to this, is fine..
    • I’d still like to reuse all of its parsing
    • How about I just log to AMQP from my app?
    • Doooom!
  • ZeroMQ has the correct semantics
    • Pub/Sub sockets
    • Never, ever blocking
    • Lossy! (If needed)
    • Buffer sizes / locations configurable
    • Arbitrary message size
    • IO done in a background thread
  • On-host log collector
    • ZeroMQ SUB socket: app logs, pre-structured
    • Syslog listener: forward rsyslogd
    • Log file tailer
    • Ship to AMQP
  • This talk
    • Is about my new library: Message::Passing
    • The clue is in the name...
    • Hopefully really simple
    • Maybe even useful!
    • Definitely small - you can replace / rewrite it easily.
  • Let’s make it generic!
    • So, I wanted a log shipper
    • I ended up with a framework for messaging interoperability
    • Whoops!
    • Got sick of writing scripts..
  • Does this actually work?
    • YES - In production at four sites for me.
    • Some of the adaptors are partially complete
    • Dumber than logstash - no multiple threads/cores
    • ZeroMQ is insanely fast
  • Other people are using it in production! Two people I know of have already written adaptors!
  • Events - my model for message passing
    • An event is a hash {}
    • Output consumes events: method consume ($event) { ...
    • Input produces events: has output_to => (..
    • Filter does both
  • Simplifying assumption: $self->output_to->consume($message)
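The two halves of that contract can be sketched in a few lines of plain Perl (stand-ins for the Moo classes the real library uses; the class names here are made up):

```perl
use strict;
use warnings;

# An output is anything with a consume method.
package My::Output::Print;
sub new { bless { seen => [] }, shift }
sub consume {
    my ($self, $event) = @_;
    push @{ $self->{seen} }, $event;    # keep for inspection
    print "got: $event->{message}\n";
}

# An input just holds something to output_to, and calls
# consume on it for each event it produces.
package My::Input::List;
sub new {
    my ($class, %args) = @_;
    bless { output_to => $args{output_to} }, $class;
}
sub run {
    my $self = shift;
    $self->{output_to}->consume({ message => $_ }) for qw(hello world);
}

package main;
My::Input::List->new(output_to => My::Output::Print->new)->run;
```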
  • Events
  • That’s it.
    • No, really - that’s all the complexity you have to care about!
    • Except for the complexity introduced by the inputs and outputs you use.
    • Unified attribute names / reconnection model, etc.. This helps, somewhat..
  • Inputs and outputs
    • ZeroMQ In / Out
    • AMQP (RabbitMQ) In / Out
    • STOMP (ActiveMQ) In / Out
    • elasticsearch Out
    • Redis PubSub In / Out
    • Syslog In
    • MongoDB Out
    • Collectd In / Out
    • HTTP POST (“WebHooks”) Out
    • UDP packets In / Out (e.g. statsd)
  • DSL
    • Building more complex chains is easy!
    • Multiple inputs
    • Multiple outputs
    • Multiple independent chains
  • CLI
    • 1 Input
    • 1 Output
    • 1 Filter (default Null)
    • For simple use, or testing.
  • CLI
    • Encode / Decode step is just a Filter
    • JSON by default
    • Supply command line, or config file
    • Daemon features
  • The dist: Message::Passing
    • Core dist supplies CLI, DSL, roles for reuse.
    • Adaptors for most protocols in other modules.
    • Moo based - small footprint, can be fatpacked (no XS dependencies).
    • Moose compatible.
  • Example?
    message-pass --input STDIN --output STDOUT
    {}
    {}
  • Less trivial example
    message-pass --input ZeroMQ --input_options '{"socket_bind":"tcp://*:5222"}' --output STDOUT
    message-pass --output ZeroMQ --output_options '{"connect":"tcp://127.0.0.1:5222"}' --input STDIN
  • Across the network:
  • Multiple subscribers
  • Jenga?
  • Jenga:
    message-pass --input STDIN --output STOMP --output_options '{"destination":"/queue/foo","hostname":"localhost","port":"6163","username":"guest","password":"guest"}'
    message-pass --input STOMP --output Redis --input_options '{"destination":"/queue/foo","hostname":"localhost","port":"6163","username":"guest","password":"guest"}' --output_options '{"topic":"foo","hostname":"127.0.0.1","port":"6379"}'
    message-pass --input Redis --output AMQP --input_options '{"topics":["foo"],"hostname":"127.0.0.1","port":"6379"}' --output_options '{"hostname":"127.0.0.1","username":"guest","password":"guest","exchange_name":"foo"}'
    message-pass --input AMQP --output STDOUT --input_options '{"hostname":"127.0.0.1","username":"guest","password":"guest","exchange_name":"foo","queue_name":"foo"}'
  • Jenga!
  • Example 4?
    • The last example wasn’t silly enough!
    • How could I top that?
    • Plan - Re-invent mongrel2
    • Badly
  • PSGI
    • PSGI $env is basically just a hash.
    • (With a little fiddling), you can serialize it as JSON
    • A PSGI response is just an array.
    • Ignore streaming responses!
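A sketch of the fiddling involved (the key-stripping approach and the example fields are illustrative; the real Plack::App::Message::Passing does more):

```perl
use strict;
use warnings;
use JSON::XS qw(encode_json decode_json);

# A PSGI $env is a hash, but the psgi.* / psgix.* keys hold
# filehandles and objects, so strip them before serializing.
my $env = {
    REQUEST_METHOD => 'GET',
    PATH_INFO      => '/hello',
    SERVER_NAME    => 'localhost',
    'psgi.input'   => \*STDIN,    # not JSON-serializable
};
my %wire = map  { $_ => $env->{$_} }
           grep { !/^psgix?\./ } keys %$env;
my $request_json = encode_json(\%wire);

# A PSGI response is just an array, so it round-trips as-is.
my $response = decode_json('[200,["Content-Type","text/plain"],["hello"]]');
print "status: $response->[0]\n";    # status: 200
```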
  • Demo?
    plackup -E production -s Twiggy -MPlack::App::Message::Passing -e 'Plack::App::Message::Passing->new(return_address => "tcp://127.0.0.1:5555", send_address => "tcp://127.0.0.1:5556")->to_app'
    plackup -E production -s Message::Passing testapp.psgi --host 127.0.0.1 --port 5556
  • PUSH socket does fanout between multiple handlers. Reply-to address embedded in the request. Run multiple ‘handler’ processes. Hot restarts, hot add / remove workers.
  • Other applications
    • Anywhere an asynchronous event stream is useful!
    • Monitoring
    • Metrics transport
    • Queued jobs - worker pool
  • Other applications (Web stuff)
    • User activity (ajax ‘what are your users doing’)
    • WebSockets / MXHR
    • HTTP Push notifications - “WebHooks”
  • WebHooks
    • HTTP PUSH notification
    • E.g. Paypal IPN
    • Shopify API
  • What about logstash?
    • Use my lightweight code on end nodes.
    • Use logstash for parsing/filtering on the dedicated hardware (elasticsearch boxes)
    • Filter to change my hashes to logstash-compatible hashes, for use with MooseX::Storage and/or Log::Message::Structured
  • Interoperating - a real example
    • Log JSON events out of apps (in multiple languages) to ZMQ
    • Collect and munge with the Message::Passing script ‘logcollector’
    • Send to central logstash
    • Send on to statsd to aggregate
    • Graphs in graphite
  • Standard log message
  • Standard event message
  • TimedWebRequest
    • A standard event
    • Page generation time, URI, HTTP status
  • statsd
    • Rolls up counters and timers into metrics
    • One bucket per stat, emits values every 10 seconds
    • Counters: Request rate, HTTP status rate
    • Timers: Total page time, mean page time, min/max page times
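statsd’s wire format is trivial: one UDP datagram per metric, name:value|c for counters and name:value|ms for timers. A fire-and-forget sketch (host and port are the statsd defaults; the metric names are made up):

```perl
use strict;
use warnings;
use IO::Socket::INET;

# UDP is connectionless, so this never blocks the request path -
# if statsd is down, the metrics just vanish.
my $statsd = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => 8125,
    Proto    => 'udp',
) or die "cannot create socket: $!";

$statsd->send('web.requests:1|c');       # counter: one more request
$statsd->send('web.status.200:1|c');     # counter: HTTP status rate
$statsd->send('web.page_time:42|ms');    # timer: page generation time
```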
  • Code
    • https://metacpan.org/module/Message::Passing
    • https://github.com/suretec/Message-Passing
    • #message-passing on irc.perl.org
    • Examples: git://github.com/2941747.git