Messaging, interoperability and log aggregation - a new framework


This talk covers why log files are horrible, how to log both plain log lines and more structured performance metrics from large scale production applications, and how to build reliable, scaleable and flexible large scale software systems in multiple languages.

I will explain why (almost) all log formats are horrible and why JSON is a good solution for logging, and discuss a number of message queuing, middleware and network transport technologies, including STOMP, AMQP and ZeroMQ.
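As a rough illustration of the JSON argument, the sketch below logs the same two HTTP requests as free-text lines and as JSON documents, then looks for server errors in each. The field names (vhost, path, status, duration_ms) and the text layout are made up for this example; they are not a format defined by the talk.

```python
import json

# Hypothetical request events; the field names are illustrative only.
events = [
    {"vhost": "example.com", "path": "/items/500", "status": 200, "duration_ms": 12},
    {"vhost": "example.com", "path": "/checkout", "status": 500, "duration_ms": 840},
]

# Free-text lines, loosely in the style of a web server access log.
text_lines = ["GET {path} {status} {duration_ms}ms".format(**e) for e in events]

# Structured lines: one JSON document per event.
json_lines = [json.dumps(e, sort_keys=True) for e in events]


def text_errors(lines):
    """Naive substring search: matches any line containing '500'."""
    return [line for line in lines if "500" in line]


def json_errors(lines):
    """Decode each line and compare the status field itself."""
    return [e for e in map(json.loads, lines) if e["status"] == 500]
```

The substring search also picks up the request whose path happens to contain 500, while the structured query returns only the request that actually failed.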

The Message::Passing framework will be introduced, along with the project which the perl code is interoperable with. These are pluggable frameworks in ruby/java/jruby and perl with pre-written sets of inputs, filters and outputs for many many different systems, message formats and transports.

They were initially designed to be aggregators and filters of data for logging. However they are flexible enough to be used as part of your messaging middleware, or even as a replacement for centralised message queuing systems.

You can have your cake and eat it too - an architecture which is flexible, extensible, scaleable and distributed. Build discrete, loosely coupled components which just pass messages to each other easily.
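The slides describe the model behind this as outputs that consume events and inputs that hold an output_to. A minimal sketch of that idea follows, written in Python rather than perl; the class names (Collect, DecodeJSON, LinesInput) are hypothetical and this is not the Message::Passing API itself.

```python
import json


class Collect:
    """Output: consumes events by appending them to a list."""

    def __init__(self):
        self.seen = []

    def consume(self, event):
        self.seen.append(event)


class DecodeJSON:
    """Filter: consumes a JSON string and passes the decoded hash on."""

    def __init__(self, output_to):
        self.output_to = output_to

    def consume(self, message):
        self.output_to.consume(json.loads(message))


class LinesInput:
    """Input: produces one event per line and hands it to its output."""

    def __init__(self, output_to):
        self.output_to = output_to

    def run(self, lines):
        for line in lines:
            self.output_to.consume(line)


# Wire the chain: input -> filter -> output.
sink = Collect()
chain = LinesInput(output_to=DecodeJSON(output_to=sink))
chain.run([
    '{"level": "info", "msg": "started"}',
    '{"level": "error", "msg": "boom"}',
])
```

Each component only knows that whatever it outputs to has a consume method, which is what keeps the pieces loosely coupled and easy to rearrange.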

Integrate and interoperate with your existing code and code bases easily, consume from or publish to any existing message queue, logging or performance metrics system you have installed.

Simple examples using common input and output classes will be demonstrated using the framework, as will easily adding your own custom filters. A number of common messaging middleware patterns will be shown to be trivial to implement.
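To give a feel for how small a custom filter and two of the common patterns can be under this model, here is a hedged sketch in Python. OnlyErrors, FanOut and RoundRobin are hypothetical names chosen for the example, not classes shipped with the framework.

```python
from itertools import cycle


class Collect:
    """Output used for the demo: remembers every event it consumes."""

    def __init__(self):
        self.seen = []

    def consume(self, event):
        self.seen.append(event)


class OnlyErrors:
    """Custom filter: pass on events whose level is 'error', drop the rest."""

    def __init__(self, output_to):
        self.output_to = output_to

    def consume(self, event):
        if event.get("level") == "error":
            self.output_to.consume(event)


class FanOut:
    """Publish/subscribe: every output receives every event."""

    def __init__(self, outputs):
        self.outputs = list(outputs)

    def consume(self, event):
        for output in self.outputs:
            output.consume(event)


class RoundRobin:
    """Work distribution: each event goes to exactly one output, in turn."""

    def __init__(self, outputs):
        self._next_output = cycle(list(outputs))

    def consume(self, event):
        next(self._next_output).consume(event)


# Round robin: four jobs shared between two workers.
worker_a, worker_b = Collect(), Collect()
workers = RoundRobin([worker_a, worker_b])
for n in range(4):
    workers.consume({"job": n})

# Fan-out behind a filter: both subscribers see only the error event.
sub_a, sub_b = Collect(), Collect()
errors = OnlyErrors(output_to=FanOut([sub_a, sub_b]))
errors.consume({"level": "info", "msg": "fine"})
errors.consume({"level": "error", "msg": "boom"})
```

In a real deployment the fan-out or round robin would normally be done by the message queue itself; the sketch only shows the shape of the patterns.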

Some higher level use-cases will also be explored, demonstrating log indexing in ElasticSearch and how to build a responsive platform API using webhooks.

Interoperability is also an important goal for messaging middleware. The project will be highlighted and we'll discuss crossing the single language barrier, allowing us to have full integration between java, ruby and perl components, and to easily write bindings into libraries we want to reuse in any of those languages.

Published in: Technology, Education
  • Mention JFDI, and I really don’t care what language it’s in
  • Mention state51 are hiring in London. Mention Tim Group are hiring in London/Boston.
  • But, before I talk about perl at you, I’m going to go off on a tangent..
  • I wrote code. And writing code is never something to be proud of; at least if your code looks like mine it isn’t... So I’d better justify this hubris somehow..
  • Isn’t he cute? And woody! Who knows what this is?
  • Ok, so logstash is an open source project, in ruby. Before I talk about it in detail, I’ll go through some of the design choices for supporting technologies. Does anyone need convincing why centralised logging is something you want?
  • MooseX::Storage! This isn’t mandatory - you can just log plain hashes if you’re concerned about performance. SPOT THE TYPO
  • Every language has a JSON library. This makes passing hashes of JSON data around a great way to interoperate. No, really - JSON::XS is lightning fast
  • There are a whole pile of different queue products. Why would you want to use one (for logging to)? Average volume is really important! A solution with hosts polling the database server has (at least) a cost of O(n). A message queue can (at least theoretically) perform as O(1), no matter how many consumers. By ‘smart routing’, I mean you can publish a ‘firehose’ message stream. Most message queue products allow you to get a subset of that stream. Most message queues have bindings in most languages.. So by abstracting message routing out of your application, and passing JSON hashes - you are suddenly nicely cross language!
  • If you haven’t yet heard of elasticsearch, I recommend you check it out. It’s big, it’s Java, it needs some care and feeding, but! You can just throw data into it. elasticsearch is smart - and works out the field types for you. Given you do things sensibly, elasticsearch is pretty amazing for scaleability and replication - you can just add more boxes to your cluster and it all goes faster! Ponies and unicorns for everyone.
  • These deserve a little of their own description! You can query across an arbitrary set of JSON documents, fast! And then get stats about the documents out. Like averages, sums, counts, max/min etc. If you think about this for a bit, you can re-implement all your RRDs in elasticsearch quite easily. Ponies and unicorns for everyone. You may not actually want to re-invent RRD, especially given you have no (native) way of collapsing data points down... However it’s brilliant for making up metrics you may want an RRD for, and asking elasticsearch to generate you a graph to see if it might be useful!
  • Very simple model - input (pluggable), filtering (pluggable by type) in C, output (pluggable). Lots of backends - AMQP and elasticsearch + syslog and many others. Pre-built parser library for various line based log formats. Comes with web app for searches.. Everything I need!
  • And it has an active community. This is the alternate viewer app..
  • Let’s take a simple case here - I’ll shove my apache logs from N servers into elasticsearch. I run a logstash on each host (writer), and one on each elasticsearch server (reader)..
  • So, that has 2 logstashes - one reading files and writing AMQP, one reading AMQP and writing to elasticsearch. However, my raw apache log lines need parsing (in the filter stage) - to be able to do things like ‘all apache requests with 500 status’, rather than ‘all apache requests containing the string 500’
  • So, the ‘filter’ step, for example - is the parsing of apache logs and re-structuring them.
  • Red indicates the filtering
  • Except I could instead do the filtering here, if I wanted to. Doesn’t really matter - depends what’s best for me.. Right, so... Let’s try that then?
  • First problem...
  • Well then, I’m not going to be running this on the end nodes.
  • And it’s not tiny, even on machines dedicated to log parsing / filtering / indexing
  • But sure, I spun it up on a couple of spare machines...
  • It works really well as advertised.
  • The JVM giveth (lots of awesome software), the JVM taketh away (any RAM you had). ruby is generally slower than perl. jruby is generally faster than perl. jruby trounces perl at (pure ruby) AMQP decoding. MRI 30% slower than perl. JRuby 30% faster than perl! So I’m not actually knocking the technology here - just saying it won’t work in this situation for me.
  • So, anyway, I’m totally stuffed... The previous plan is a non-starter. So I need something to collect logs from each host and ship them to AMQP. Ok, cool, I can write that in plain ruby or plain perl and it’s gotta be slimmer, right? I still plan to reuse logstash - just not on end nodes! Has a whole library of pre-built parsers for common log formats. Also, as noted, it’s faster, and notably it’s multi-threaded, so it’ll use multiple cores..
  • Ok, so hopefully I’ve explained one of the problems I want to solve. And I’ve maybe explained why I have the hubris to solve it myself. I’ve tried to keep things (at least conceptually) as simple as possible. At the same time, I want something that can be used for real work (i.e. not just a toy)
  • Good question!
  • But wait a second... I just want to get something ‘real’ running here... So, I’m already tipping stuff into AMQP..
  • ZeroMQ looked like the right answer. I played with it. It works REALLY well. I’d recommend you try it. The last point here is most important - ZMQ networking works entirely in a background thread perl knows nothing about, which means that you can asynchronously ship messages with no changes to your existing codebase.
  • Yes, this could still be ‘a script’, in fact I did that at first... But I now have 3 protocols, who’s to say I won’t want a 4th..
  • Note the fact that we have a cluster of ES servers here. And we have two log indexers. You can cluster RabbitMQ also. Highly reliable solution (against machine failure). Highly scaleable solution (just add ES servers)
  • This is where I went crazy. This isn’t how I started. I am blaming AMQP! Too complex for simple cases. I had a log shipper script. A log indexer script. An alerting (nagios) script. An irc notification script.
  • I mean, solving this in the simple case has got to be easy, right? I stole logstash’s terminology! And here’s the API, we have Outputs, which consume messages. We have inputs, which output messages. Filters are just a combination of input and output
  • So the input has an output, that output always has a consume method... TADA!
  • You can build a “chain” of events. This can work either way around. The input can be a log file, the output can be a message queue (publisher). Input can be a message queue, output can be a log file (consumer)
  • STOMP is very different to AMQP is very different to RabbitMQ. I can’t really help much here, except for trying to make the docs not suck. The docs still suck, sorry - I have tried ;)
  • All of these are on CPAN already.
  • DSL - Domain specific language. Try to make writing scripts really simple.
  • But you shouldn’t have to write ANY code to play around.
  • How are we doing for time? I can do some demos, or we can have some questions, or both! (Remember to click the next slides as people ask questions)
  • Demo1. Simple demo of the CLI in one process (STDOUT/STDIN)
  • Less simple demo - let’s actually pass messages between two processes. Arrows indicate message flow. ZeroMQ is a lightning bolt as it’s not quite so trivial..
  • By insanely fast, I mean I can generate, encode as JSON, send, receive, decode as JSON over 25k messages a second. On this 3 year old macbook..
  • I’ll talk a very little more about webhooks
  • Error stream
  • Demo PUBSUB and round robin..
  • So, let’s play Jenga with message queues!
  • I would have added ZeroMQ. Except then the diagram doesn’t fit on the page. I’ll leave this as an exercise for the reader!

    1. Messaging, interoperability and log aggregation - a new framework. Tomas Doran (t0m)
    2. Who are you?
        • Perl Developer
            • Been paid to write perl code for ~14 years
        • Open Source hacker
            • Catalyst core team
            • >160 CPAN dists
        • Also C, Javascript, ruby, etc..
    3. Sponsored by
        • state51
            • Pb of mogilefs, 100+ boxes.
            • > 4 million tracks on-demand via API
            • > 400 reqs/s per server, >1Gb peak from backhaul
        • Suretec VOIP Systems
            • UK voice over IP provider
            • Extensive API, including WebHooks for notifications
        • TIM Group
            • International Financial apps
            • Java / ruby / puppet
    4. What?
        • This talk is about my new perl library: Message::Passing
    5. Why?
        • I’d better stop, and explain a specific problem.
        • The solution that grew out of this is more generic.
        • But it illustrates my concerns and design choices well.
        • And everyone likes a story, right?
    6. Once upon a time...
        • I was bored of tailing log files across dozens of servers
        • splunk was amazing, but unaffordable
    7. Logstash
    8. Centralised logging
        • Syslog isn’t good enough
            • UDP is lossy, TCP not much better
            • Limited fields
            • No structure to actual message
            • RFC3164 - “This document describes the observed behaviour of the syslog protocol”
    9. Centralised logging
        • Syslog isn’t good enough
        • Structured app logging
            • We want to log data, rather than text from our application
            • E.g. HTTP request - vhost, path, time to generate, N db queries etc..
    10. Centralised logging
        • Syslog isn’t good enough
        • Structured app logging
    11. Centralised logging
        • Syslog isn’t good enough
        • Structured app logging
        • Post-process log files to re-structure
            • We can do this in cases we don’t control
            • Apache logs, etc..
            • SO MANY DATE FORMATS. ARGHH!!
    12. Centralised logging
        • Syslog isn’t good enough
        • Structured app logging
        • Post-process log files to re-structure
        • Publish logs as JSON to a message queue
            • JSON is fast, and widely supported
            • Great for arbitrary structured data!
    13. Message queue
        • Flattens load spikes!
        • Only have to keep up with average message volume, not peak volume.
        • Logs are bursty! (Peak rate 1000x average.)
        • Easy to scale - just add more consumers
        • Allows smart routing
        • Great as a common integration point.
    14. elasticsearch
        • Just tip JSON documents into it
        • Figures out type for each field, indexes appropriately.
        • Free sharding and replication
        • Histograms!
    15. Histograms!
        • elasticsearch does ‘big data’, not just text search.
        • Ask arbitrary questions
        • Get back aggregate metrics / counts
        • Very powerful.
    16. Logstash. In JRuby, by Jordan Sissel. Simple: Input, Filter, Output. Flexible. Extensible. Plays well with others. Nice web interface.
    17. Logstash
    18. Logstash INPUT FILTER OUTPUT
    19. Logstash INPUT FILTER OUTPUT
    20. Logstash
    21. Logstash
    22. Logstash IS MASSIVE
    23. 440Mb IDLE!
    24. 2+Gb working
    25. 440Mb IDLE!
    26. OH HAI JVM
    27. Java (JRuby) decoding AMQP is, however, much much faster than perl doing that... JVM +-
    28. Logstash on each host is totally out...
        • Running it on elasticsearch servers which are already dedicated to this is fine..
        • I’d still like to reuse all of its parsing
    29. This talk
        • Is about my new library: Message::Passing
        • The clue is in the name...
        • Hopefully really simple
        • Maybe even useful!
    30. Wait a second!
        • My app logs are already structured!
        • Why don’t I just publish AMQP from the app
    31. Good question!
        • I tried that.
        • App logging relies on RabbitMQ being up
        • Adds a single point of failure.
        • Logging isn’t that important!
        • ZeroMQ to the rescue
    32. ZeroMQ has the correct semantics
        • Pub/Sub sockets
        • Never, ever blocking
        • Lossy! (If needed)
        • Buffer sizes / locations configurable
        • Arbitrary message size
        • IO done in a background thread
    33. On host log collector
        • ZeroMQ SUB socket
            • App logs - pre structured
        • Syslog listener
            • Forward rsyslogd
        • Log file tailer
        • Ship to AMQP
    34. On host log collector
    35. Let’s make it generic!
        • So, I wanted a log shipper
        • I ended up with a framework for messaging interoperability
        • Whoops!
        • Got sick of writing scripts..
    36. Events - my model for message passing
        • a hash {}
        • Output consumes events:
            • method consume ($event) { ...
        • Input produces events:
            • has output_to => (..
        • Filter does both
    37. Simplifying assumption: $self->output_to->consume($message)
    38. Events
    39. That’s it.
        • No, really - that’s all the complexity you have to care about!
        • Except for the complexity introduced by the inputs and outputs you use.
        • Unified attribute names / reconnection model, etc.. This helps, somewhat..
    40. Inputs and outputs
        • ZeroMQ In / Out
        • AMQP (RabbitMQ) In / Out
        • STOMP (ActiveMQ) In / Out
        • elasticsearch Out
        • Redis PubSub In/Out
        • Syslog In
        • HTTP POST (“WebHooks”) Out
    41. DSL
        • Building more complex chains easy!
        • Multiple inputs
        • Multiple outputs
        • Multiple independent chains
    42. CLI
        • 1 Input
        • 1 Output
        • 1 Filter (default Null)
        • For simple use, or testing.
    43. CLI
        • Encode / Decode step is just a Filter
        • JSON by default
    44. Questions?
    45. Questions? I can build my log shipper, without using 1/2 Gb of RAM.
    46. Questions? I built my log shipper.
    47. Questions? 24Mb
    48. Demo?
    49. Demo?
    50. Does this actually work?
        • YES - In production at two sites.
    51. Does this actually work?
        • YES - In production at two sites.
        • Some of the adaptors are partially complete
    52. Does this actually work?
        • YES - In production at two sites.
        • Some of the adaptors are partially complete
        • Dumber than logstash - no multiple threads/cores
    53. Does this actually work?
        • YES - In production at two sites.
        • Some of the adaptors are partially complete
        • Dumber than logstash - no multiple threads/cores
        • ZeroMQ is insanely fast
    54. Other people are using it in production! Two people I know of already writing adaptors!
    55. What about logstash?
        • Use my lightweight code on end nodes.
        • Use ‘proper’ logstash for parsing/filtering on the dedicated hardware (elasticsearch boxes)
        • Filter to change my hashes to logstash compatible hashes
            • For use with MooseX::Storage and/or Log::Message::Structured
    56. Other applications
        • Anywhere an asynchronous event stream is useful!
        • Monitoring
        • Metrics transport
        • Queued jobs
    57. Other applications (Web stuff)
        • User activity (ajax ‘what are your users doing’)
        • WebSockets / MXHR
        • HTTP Push notifications - “WebHooks”
    58. WebHooks
        • HTTP PUSH notification
        • E.g. Paypal IPN
        • Shopify API
    59. Messaging patterns
        • Pub / Sub (AMQP / STOMP / Redis / ZMQ)
        • Round robin (AMQP / STOMP / Redis / ZMQ)
        • Partial subscribe - ‘routing keys’
            • AMQP - Best at this, wildcards anywhere
            • Redis - wildcards as suffix
            • ZMQ - Exact match
    60. Demo?
    61. Jenga?
    62. Jenga!
    63. Demo?
        • The last demo wasn’t silly enough!
        • How could I top that?
        • Plan - Re-invent mongrel2
        • Badly
    64. PSGI
        • PSGI $env is basically just a hash.
        • (With a little fiddling), you can serialize it as JSON
        • PSGI response is just an array.
        • Ignore streaming responses!
    65. PUSH socket does fanout between multiple handlers. Reply to address embedded in request
    66. Code
        • Message::Passing
        • #message-passing on
        • Demo examples:
            • git://
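The “partial subscribe” bullet on the messaging patterns slide is easier to picture with a matcher in hand. The sketch below implements AMQP topic-exchange style matching, where routing keys are dot-separated words, `*` stands for exactly one word and `#` for zero or more words. It is an illustration written for this page, not code from the talk or from any broker, and the routing keys used are invented.

```python
def topic_match(pattern, key):
    """Return True if a dot-separated routing key matches an AMQP-style
    topic pattern ('*' = exactly one word, '#' = zero or more words)."""
    p = pattern.split(".")
    k = key.split(".")

    def match(i, j):
        if i == len(p):
            # Pattern exhausted: succeed only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            # Key exhausted but the pattern still needs a word.
            return False
        if p[i] == "*" or p[i] == k[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


def exact_match(pattern, key):
    """The simplest subscription model: the key must equal the pattern."""
    return pattern == key
```

With this, a consumer bound to `logs.*.error` would receive `logs.web1.error` but not `logs.web1.db.error`, while one bound to `logs.#` would receive both.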