Type 1: Logs automatically generated by a service, for example apache2.log or mail.log. Usually a huge amount of structured but raw data.
jira.graylog2.org:80 x.x.x.x - - [29/May/2011:01:47:38 +0200] "GET /browse/WEBINTERFACE-21?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel HTTP/1.1" 200 7639 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
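Because Type 1 lines follow a fixed layout, they can be split into named fields before they are shipped anywhere. A minimal Ruby sketch, assuming the vhost-prefixed combined format of the sample line above; the pattern, the field names, and the helper parse_access_line are illustrative and not taken from any particular tool:

```ruby
# Hypothetical pattern for the vhost-prefixed combined log format of the sample line.
ACCESS_LOG = /\A(?<vhost>\S+) (?<client>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+) (?<protocol>[^"]+)" (?<status>\d{3}) (?<bytes>\d+|-) "(?<referer>[^"]*)" "(?<agent>[^"]*)"\z/

# Returns a Hash of named fields, or nil if the line does not match.
def parse_access_line(line)
  match = ACCESS_LOG.match(line.strip)
  return nil unless match

  fields = match.named_captures
  fields["status"] = fields["status"].to_i
  fields
end

sample = 'jira.graylog2.org:80 x.x.x.x - - [29/May/2011:01:47:38 +0200] "GET /browse/WEBINTERFACE-21?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel HTTP/1.1" 200 7639 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
fields = parse_access_line(sample)
puts fields["vhost"], fields["method"], fields["status"]
```

Lines that do not match return nil, so a shipper can forward them unparsed instead of dropping them.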
Type 2: Logs sent directly from within your application, triggered for example by a log.error() call or an exception handler. Can be sent in structured form, for example via GELF.
2011-05-29 18:55:51 +0200 [payment] Could not validate credit card: Got HTTP 404 from example.org
How to send your logs: Don't store the logs in flat files. Send them somewhere to get more value out of them.
Syslog: Syslog adapters for Rails are available and work pretty well.
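To show what actually goes over the wire, here is a minimal sketch that builds a BSD-style (RFC 3164) syslog line by hand and sends it as a single UDP datagram. The helper names, the local0 facility, the tag, and the host and port are assumptions for illustration; a real Rails setup would more likely use one of the existing adapters.

```ruby
require "socket"

# Numeric codes from the syslog protocol: PRI = facility * 8 + severity.
FACILITY_LOCAL0 = 16
SEVERITY = { error: 3, warning: 4, info: 6, debug: 7 }.freeze

# Builds an RFC 3164 style line: <PRI>TIMESTAMP HOSTNAME TAG: MESSAGE
def syslog_line(severity, tag, message, hostname: Socket.gethostname, time: Time.now)
  pri = FACILITY_LOCAL0 * 8 + SEVERITY.fetch(severity)
  "<#{pri}>#{time.strftime('%b %e %H:%M:%S')} #{hostname} #{tag}: #{message}"
end

# Sends one datagram; host and port are placeholders for your syslog server.
def send_syslog(line, host: "127.0.0.1", port: 514)
  socket = UDPSocket.new
  socket.send(line, 0, host, port)
ensure
  socket&.close
end
```

Calling send_syslog(syslog_line(:error, "payment", "Could not validate credit card")) would emit one datagram and return immediately, whether or not anything is listening.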
GELF (Graylog Extended Log Format): Lets you structure your logs. Also check out structured syslog. A Ruby library, a Rack exception notifier, and a Ruby logger are available (www.graylog2.org).
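As a rough illustration of what "structured" means here, the sketch below builds a GELF-style JSON payload by hand. It assumes the GELF 1.1 field names (version, host, short_message, timestamp, level, and custom fields prefixed with an underscore); gelf_payload is a hypothetical helper, not part of the Ruby library mentioned above, and the version string should be checked against what your server expects.

```ruby
require "json"
require "socket"

# Builds a GELF-style JSON payload. Custom fields are prefixed with "_".
def gelf_payload(short_message, level:, host: Socket.gethostname, time: Time.now, **extra)
  payload = {
    "version" => "1.1",
    "host" => host,
    "short_message" => short_message,
    "timestamp" => time.to_f,
    "level" => level # syslog severity number, e.g. 3 for error
  }
  extra.each { |key, value| payload["_#{key}"] = value }
  JSON.generate(payload)
end

puts gelf_payload("Could not validate credit card: Got HTTP 404 from example.org",
                  level: 3, component: "payment", http_status: 404)
```

The point of the custom fields is that the server can filter on component or http_status directly instead of searching inside the message text.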
AMQP: Guaranteed and ordered delivery. Very flexible. Easily subscribe to the flow. Use routing keys to encode the origin of the logs. Hell yeah, use this if you have an AMQP bus available (or build one). Check out https://github.com/paukul/amqp_logging
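One possible way to encode origin in a routing key is a dot-separated path, which works with AMQP topic exchanges. The segment order (app, environment, severity), the "logs" prefix, and the helper log_routing_key are assumptions for illustration only:

```ruby
# Builds a dot-separated routing key such as "logs.shop.production.error".
# Characters outside a-z, 0-9, "_" and "-" are replaced so that a segment
# can never contain a dot and accidentally add a level to the key.
def log_routing_key(app:, environment:, severity:)
  segments = ["logs", app, environment, severity].map do |segment|
    segment.to_s.downcase.gsub(/[^a-z0-9_-]/, "_")
  end
  segments.join(".")
end

puts log_routing_key(app: "shop", environment: "production", severity: :error)
```

With a topic exchange, a consumer could then bind with a pattern like logs.shop.*.error to see only errors from one app, or logs.# to receive everything.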
Throw the messages out of your app like a hot potato: Loose coupling! Your logs should always leave the application without interfering with it. Prefer UDP over TCP, and decouple AMQP log transports. Catch all exceptions in the logging path and get back into the app flow.
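A minimal sketch of the hot-potato idea: send over UDP and swallow every error, so a logging failure can never break a request. The class name HotPotatoLogger and the host and port defaults are placeholders, not an existing library:

```ruby
require "socket"

# Fire-and-forget sender: a logging failure must never reach the caller.
class HotPotatoLogger
  def initialize(host: "127.0.0.1", port: 12201, socket: UDPSocket.new)
    @host = host
    @port = port
    @socket = socket
  end

  # Returns true if the datagram was handed to the OS, false on any error.
  def deliver(message)
    @socket.send(message, 0, @host, @port)
    true
  rescue StandardError
    false
  end
end
```

The socket is injectable so the failure path can be exercised without a network. The caller may ignore the return value entirely; it exists only so that failures can be counted if you care.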
Add more value to your logs: For example, pre-generate geo information for IP addresses or integrate the time_bandits gem.
https://github.com/skaes/time_bandits
Completed in 680.378ms (View: 28.488, DB: 5.111(2,0), MC: 5.382(6r,0m), GC: 120.100(1), HP: 0(2000000,546468,18682541,934967)) | 200 OK [ http://127.0.0.1/jobs/info ]
Gives you deep insight into your application's performance when used with LogJam: https://github.com/alpinegizmo/logjam
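If you want these numbers as fields rather than text, a completion line like the sample above can be picked apart with two regular expressions. This is only a sketch: parse_completed_line is a hypothetical helper, it reads just the leading number of each component, and it leaves the values in the inner parentheses uninterpreted.

```ruby
# Extracts the total time and the leading number of each component
# (View, DB, MC, GC, HP) from a "Completed in ..." line.
def parse_completed_line(line)
  total = line[/Completed in ([\d.]+)ms/, 1]
  details = line[/Completed in [\d.]+ms \((.*)\) \|/, 1]
  return nil unless total && details

  components = details.scan(/(\w+): ([\d.]+)/).to_h { |name, value| [name, value.to_f] }
  { "total_ms" => total.to_f, "components" => components }
end

sample = "Completed in 680.378ms (View: 28.488, DB: 5.111(2,0), MC: 5.382(6r,0m), " \
         "GC: 120.100(1), HP: 0(2000000,546468,18682541,934967)) | 200 OK [ http://127.0.0.1/jobs/info ]"
result = parse_completed_line(sample)
puts result["total_ms"]
puts result["components"].inspect
```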
Where to send your logs: There are a lot of tools available.
Hosted services: Loggly (www.loggly.com)
Dynamic pricing based on your usage
Free for 200MB/day with 1 week retention time
UDP/TCP/HTTP API as input for syslog
Hosted services: Splunk (www.splunk.com)
Two license types: Free / Enterprise
Supports any raw input
Two more hosted services: www.papertrailapp.com www.logentries.com
Open source tools: Logstash (www.logstash.net)
Collect, parse and store logs for later use
Input -> Filter -> Output
Plays very well with Graylog2