Real-time system performance monitoring
     AMQP and CatalystX::JobServer

     Cooking a
     Rabbit pie
                Tomas (t0m) Doran
                São Paulo.pm perl workshop 2010
                London perl workshop 2010
I accidentally wrote
         software

• I try to avoid doing this
• I’m not very good at it ;)
Aka the “my code still
doesn’t work yet” talk
• This is the 2nd time I’ve talked about
  AMQP stuff this year.
• Half the code still isn’t properly production
  ready.
• It’s an order of magnitude better than last
  time round ;)
Message queueing

• Let’s not talk about the code just yet...
• This talk is about Message Queueing and
  specifically AMQP.
• I’m going to assume if you knew nothing,
  you went to the earlier talk - not going to
  duplicate too much.
AMQP Concepts
              RabbitMQ


                vhost



  Publisher   Exchange




               Queue
  Consumer
AMQP

• All your clients know (at least half) of the
  wiring.
• Different topologies depending on routing
  configuration.
• Can specify other options such as durability
• Nice when your server dies - no ‘current
  config’
Routing keys
• Each message sent to an exchange has a
  routing key.
• Each queue can be bound to exchanges
  with routing keys
• E.g. you can subscribe to thingsilike.meat
  and thingsilike.beer on an exchange, but not
  subscribe to thingsilike.classicalmusic
• Wildcards - # and *
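
A rough sketch of how the wildcards behave, written in Python for illustration rather than the talk's Perl. It assumes the usual AMQP topic-exchange rules: routing keys are dot-separated words, * matches exactly one word and # matches zero or more. The function name topic_matches is made up for this example.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if a topic binding pattern matches a routing key."""

    def match(p, k):
        if not p:
            # Pattern exhausted: match only if the key is exhausted too.
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may swallow zero or more words of the key.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            # '*' matches exactly one word; a literal must be equal.
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))
```

So a queue bound with thingsilike.* would see thingsilike.beer but not thingsilike.beer.lager, while thingsilike.# would see both.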
AMQP Delivery modes
• Two modes of consuming messages
 • Get - Gets a single message from the
    queue
 • Subscribe - server sends messages from
    the queue to the client until cancelled
• Durability - exchange and queue must
  agree
• Explicit ACK possible but not required
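
To make the difference between the two consuming modes concrete, here is a toy in-memory queue in Python. It is only a sketch: ToyQueue and its methods are invented for this example, there is no network, no ACK handling, and with several subscribers a real broker would share messages between them rather than always picking the first.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a broker queue (illustration only)."""

    def __init__(self):
        self._messages = deque()
        self._consumers = {}  # consumer tag -> callback
        self._next_tag = 0

    def publish(self, message):
        if self._consumers:
            # Subscribe mode: the "server" pushes the message straight away.
            callback = next(iter(self._consumers.values()))
            callback(message)
        else:
            # Nobody subscribed: keep it until someone asks.
            self._messages.append(message)

    def get(self):
        """Get mode: pull a single message, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None

    def subscribe(self, callback):
        """Register a consumer; the backlog is delivered first."""
        self._next_tag += 1
        tag = self._next_tag
        self._consumers[tag] = callback
        while self._messages:
            callback(self._messages.popleft())
        return tag

    def cancel(self, tag):
        """Stop pushing messages to the consumer with this tag."""
        self._consumers.pop(tag, None)
```

With get the client decides when to ask; with subscribe the callback keeps firing until cancel is called.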
Non-trivial message
    queueing
• Flexible topologies.
• Each queue can bind to multiple
  exchanges, with multiple routing keys.
• Routing can be dynamic.
• E.g. one client can ‘tail’ a log, but then re-
  bind with a different routing key to get a
  different subset of messages.
Custom Exchanges

• RabbitMQ allows pluggable exchange types
• Simplest and most useful example is the
  ‘emit last message on bind’ exchange
• New consumers get the last message seen
  on the exchange
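
The idea behind an ‘emit last message on bind’ exchange can be sketched in a few lines of Python. This is not RabbitMQ's plugin code, just an in-memory model with invented names (LastValueExchange, plain lists standing in for queues).

```python
class LastValueExchange:
    """Toy fan-out exchange that replays its last message to new bindings."""

    def __init__(self):
        self._queues = []
        self._last = None
        self._has_last = False

    def publish(self, message):
        # Remember the message, then fan it out to every bound queue.
        self._last = message
        self._has_last = True
        for queue in self._queues:
            queue.append(message)

    def bind(self, queue):
        # A newly bound queue immediately receives the last message seen.
        self._queues.append(queue)
        if self._has_last:
            queue.append(self._last)
```

This is handy for things like status values: a consumer that connects late starts with the current state instead of waiting for the next update.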
I wanted to play with
      RabbitMQ
• I blame Net::RabbitFoot
• It’s written using AnyEvent
• Which I hadn’t used before, but looked
  good for this type of thing.
• I felt my way around writing simple things
  standalone.
• Got all the web servers logging into Rabbit
Logging into message
       queuing
• Good example of broadcast
• Want to aggregate logs to files
• And be able to ‘tail’ them
• Logging directly from the application
• Also tailing (normal) log files to message
  queue
Practical example:
Getting the logs back
Ok, there is some setup
Dumping them to a file

• That’s pretty simple after that..
• Except:
 • Log rotation
 • Not flushing to disk once per message
 • etc...
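
One possible shape for the file-dumping side, sketched in Python under some assumptions: messages are buffered and written in batches rather than flushed once per message, and an external tool rotates the file and then asks the writer to reopen it (for example from a SIGHUP handler). BufferedLogWriter and flush_every are names invented for this sketch.

```python
class BufferedLogWriter:
    """Append log lines to a file, flushing in batches."""

    def __init__(self, path, flush_every=100):
        self.path = path
        self.flush_every = flush_every
        self._pending = []
        self._fh = open(path, "a", encoding="utf-8")

    def write(self, line):
        # Buffer the line; only touch the disk once a batch has built up.
        self._pending.append(line.rstrip("\n") + "\n")
        if len(self._pending) >= self.flush_every:
            self.flush()

    def flush(self):
        if self._pending:
            self._fh.writelines(self._pending)
            self._pending.clear()
        self._fh.flush()

    def reopen(self):
        """Call after the file has been rotated (renamed) externally."""
        self.flush()  # pending lines still go to the old, renamed file
        self._fh.close()
        self._fh = open(self.path, "a", encoding="utf-8")

    def close(self):
        self.flush()
        self._fh.close()
```

A real logger would probably also flush on a timer so that a quiet queue does not leave lines sitting in memory.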
Viewing them live

• Someone wrote an AMQP client in flash
• AMQP security model not useful publicly
• Cute prototype
• (Sorry no live demo - it hated me when I
  tried to make it work again)
Queueing Jobs

• New skool scripts - MX::Getopt and a
  ->run method
• Add MooseX::Storage
• You can flatten a script as JSON, send it
  over the wire, re-inflate it, call the ->run
  method.
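
The flatten / send / re-inflate / run cycle looks roughly like this. The sketch is in Python, not Perl, and is not MooseX::Storage itself; it only borrows the idea of storing the class name under a __CLASS__ key next to the attributes. ResizeImage, freeze and thaw are hypothetical names for this example.

```python
import json

# Only classes registered here may be re-inflated from the wire.
JOB_CLASSES = {}


def job(cls):
    """Class decorator: register a job class by name."""
    JOB_CLASSES[cls.__name__] = cls
    return cls


@job
class ResizeImage:
    def __init__(self, path, width):
        self.path = path
        self.width = width

    def run(self):
        return f"resized {self.path} to {self.width}px"


def freeze(obj):
    """Flatten an object to JSON: class name plus its attributes."""
    return json.dumps({"__CLASS__": type(obj).__name__, **vars(obj)})


def thaw(payload):
    """Re-inflate a JSON payload into an instance of a registered class."""
    data = json.loads(payload)
    cls = JOB_CLASSES[data.pop("__CLASS__")]
    return cls(**data)
```

The publisher calls freeze and puts the string on a queue; the worker calls thaw on whatever arrives and then run on the result.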
Message Queueing
      Framework?
• I now have several scripts, all doing bits
  with queuing. All duplicating code.
• I want to run batch jobs
• I want to aggregate log messages (e.g.
  average web requests per min).
• I want to log messages to file(s)
• I want to broadcast (and log) aggregates
• Need something more generic
Most urgent problem:
     Job Server
• Wrote a simple abstraction on getting
  channels, making exchanges and queues, etc
• Was still going to end up with a load of
  scripts/jobs that needed running and
  managing. Yuk.
• Inspecting the state of each job ‘service’
  hard / boring..
But wait
• Each ‘thing’ (timer, listener, logger, emitter,
  worker etc) an instance with state in
  attributes - traits => [‘Serialize’]
• Construct instances from config file.
• One process managing each machine’s
  tasks.
• Fewer processes to manage, win. Working
  out the state just got harder still, FAIL.
But wait
• Make all the classes use MooseX::Storage
• Catalyst makes instances of things from
  config...
• Build Catalyst app from config file at boot
• 2 Trivial controllers introspect entire app
• State monitoring (nagios, munin, debugging)
  now trivial
Really really trivial
Hubris
• I could have just used Gearman for my
  jobs.
• I could have come up with a simpler
  solution for logging and aggregation.
• I was having fun.

• Until I tried running it in production.
Hubris
• I could have just used Gearman for my
  jobs.
• I could have come up with a simpler
  solution for logging and aggregation.
• I was having fun.

• It crashed and burned in production.
• sys programming, async, pluggable, FUUUU
I learnt a lot
• I knew a lot in theory before. Doing it is
  somewhat harder :)


• It doesn’t crash any more.
• Or leak fds
• Still not perfect (e.g. doesn’t reconnect
  right if mq falls over)
Why laziness didn’t win.
• Brad’s code is great, until you try not being
  Livejournal.
• My day job is nothing like Livejournal.

• I was still very excited about AMQP
• I had a talk to write for a conference (and I
  could avoid writing it by writing software)
• Hippie rocked my world
Web 2.0
• Lots of Javascript, update pages dynamically
• Messages already JSON from MX::Storage
• comet - long poll, multipart xhr
• Joose.Storage - inflate your objects in
  Javascript
• Present data from message queues to the
  user as it becomes available.
• Hippie - painless comet - rocked my world
Web::Hippie
• Async pipe to the browser.
• Abstracts all the nasty ajax details (also
  does long poll, or websockets)
• Applications (I had a practical use for):
  •   Interactive log tail. Realtime systems graphs.

  •   Instant feedback from long running batch jobs

  •   ‘Social’ features by broadcasting/aggregating data
Job Statuses

• We inflate JSON data to $job
• $job->run($status_cb);
• $status_cb->( CompletionEstimate->new(
  percent => 50 ) );
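
The same callback pattern, sketched in Python for illustration: the job knows nothing about where its statuses go, it just calls whatever it was handed. CountingJob is a made-up job; CompletionEstimate mirrors the status class named on the slide, with percent as its only attribute.

```python
from dataclasses import dataclass


@dataclass
class CompletionEstimate:
    percent: int


class CountingJob:
    """Hypothetical job that reports progress after each step."""

    def __init__(self, steps):
        self.steps = steps

    def run(self, status_cb):
        for done in range(1, self.steps + 1):
            # ... one unit of real work would happen here ...
            status_cb(CompletionEstimate(percent=100 * done // self.steps))
```

Whoever runs the job decides what status_cb does: print the status, collect it in a list, or publish it to an exchange.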
Job Statuses
• All jobs have a UUID
• Job statuses - ‘Running/Complete/StatusLine/
  CompletionEstimate/RunJob’
• Ask for a pipe to some UUID(s)
• Draw nice progress indicators
• Further jobs can be included (RunJob magic)
• Custom statuses trivial - perl class with
  attributes, Javascript class with display logic
Wait a second
•   I don’t want the unwashed masses making HTTP
    connections to machines running jobs.

•   Ergo: Send all the job statuses to an exchange, use
    UUID as routing key.

•   Optional ‘Hippies’ controller - client produces a
    set of keys, these sprintf => routing keys.

•   Hippie pipe => one queue per client, bound
    to the keys they want.

•   ‘RunJob’ messages automatically bind an extra key,
    so you see things triggered by things you are
    watching.
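
A toy model of that wiring, in Python and entirely in memory, so it is a sketch rather than how CatalystX::JobServer does it. Statuses are published with the job UUID as routing key, each client pipe is bound only to the UUIDs it asked for, and a RunJob status makes the pipe bind to the spawned job as well. The dict fields type and child_uuid are assumptions made for this example.

```python
import uuid


def new_job_id():
    return str(uuid.uuid4())


class StatusExchange:
    """Direct-style exchange: routing key (job UUID) -> bound pipes."""

    def __init__(self):
        self._bindings = {}

    def bind(self, pipe, key):
        self._bindings.setdefault(key, []).append(pipe)

    def publish(self, key, status):
        # Copy the list, since delivering may add new bindings.
        for pipe in list(self._bindings.get(key, [])):
            pipe.deliver(status)


class ClientPipe:
    """One per browser client; stands in for the per-client queue."""

    def __init__(self, exchange, job_uuids):
        self.exchange = exchange
        self.received = []
        for job_uuid in job_uuids:
            exchange.bind(self, job_uuid)

    def deliver(self, status):
        self.received.append(status)
        if status["type"] == "RunJob":
            # Follow jobs triggered by a job this client is watching.
            self.exchange.bind(self, status["child_uuid"])
```

The client never talks to the machine running the job; it only sees what is routed to its own queue.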
How it all hangs together
Useable?
• Running jobs works.
• Running it in production at work. It doesn’t
  crash any more.
• Status pipe stuff not deployed to clients yet.
• If you need a high volume, simple,
  production ready job scheduler, right
  now, use Gearman.
Demo?
Demo 1

• Simple job server
• Enqueues 10 jobs at start
• 1 worker process
• JSON status of app
• Add workers dynamically
Demo 2

• Publish component status to queue
  regularly
• Simple CLI script tailing queue
• Jobs indexed by UUID - allows Hippie
Demo 3

• Status updates from ‘Job Server’ published.
• 2nd ‘/hippies’ process binds queue to
  exchange for some UUIDs.
• Gets ‘RunJob’ notifications, and statuses
  when they run.
Next steps?
• Reliability - recover from errors better.
• Expose more stats about MQ use.
• Better (some!) logging.
• Docs would be good...
• Hippie::Pipe for bidirectional communication
• More traits / plugins / components
• More / better stock Javascript
What to go look at

• http://www.rabbitmq.com
• Docs, setup info, FAQ
• Slideshare - lots of good and more detailed
  AMQP introductions.
What to go look at
       (CPAN)
• Net::RabbitFoot
• App::RabbitTail
• Plack / Twiggy
• Web::Hippie
• CatalystX::JobServer
  (github.com/bobtfish/CatalystX-JobServer)
Other solutions

• Gearman
• STOMP
 • ActiveMQ (or Rabbit has a STOMP
    adaptor - good luck)
 • Catalyst::Engine::STOMP
Thanks!
• Questions?
• Anyone interested in playing, feel free to
  grab me on irc (t0m @ magnet &
  Freenode)
• Happy to hand-hold as I need more people
  using this.
• I’ll love you forever if you write docs.
