Take My Logs. Please!

Details on how we capture application data in our access and error logs, as well as how to generate quick reports and graphs from these logs.

This talk was presented at O'Reilly's Velocity Online Conference on October 26, 2011.

Take My Logs. Please! Presentation Transcript

  • Take my logs. Please. Mike Brittain, Director of Engineering, Infrastructure, Etsy.com. mike@etsy.com, @mikebrittain
  • (hello?)
  • This sounds boooooorrrrring... No, no... hang in there!
  • 25 MM uniques/month. 150 countries. $300 MM+ sales last year.
  • Apache, PHP, MySQL, PostgreSQL, Memcache, Gearman, Solr, etc.
  • What’s working? Performance. Operability. Simplicity.
  • Logging + Trending
  • App logging (Apache access and error logs)
  • “Common”: LogFormat "%h %l %u %t \"%r\" %>s %b"
  • “Combined”: LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\""
  • mod_log_config
    %f: Filename requested
    %k: # of keepalive requests served on this connection
    %T: Time taken to serve the request, in seconds
    %D: Time taken to serve the request, in microseconds
    %{foobar}n: Contents of “note” foobar from another module
  • apache_note(): apache_note("foobar", $whatever);
  • “Steroids”: LogFormat "%{True-Client-IP}i %l %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %V %{user_id}n %{shop_id}n %{uaid}n %{ab_selections}n %{request_uid}n %{api_consumer_key}n %{api_method_name}n %{php_bytes}n %{php_microsec}n %D"
  • $GLOBALS['timer'] = microtime(true) * 1000000;
    register_shutdown_function('pageStats');
    function pageStats() {
        $timer_end = microtime(true) * 1000000;
        $diff = $timer_end - $GLOBALS['timer'];
        apache_note('php_microsec', $diff);
        apache_note('php_bytes', memory_get_peak_usage());
    }
  • What about “%D”?
  • “Steroids”: LogFormat "%{True-Client-IP}i %l %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %V %{user_id}n %{shop_id}n %{uaid}n %{ab_selections}n %{request_uid}n %{api_consumer_key}n %{api_method_name}n %{php_bytes}n %{php_microsec}n %D"
  • “Steroids”: LogFormat "%{True-Client-IP}i %l %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %V %{user_id}n %{shop_id}n %{uaid}n %{ab_selections}n ...
    %{ab_selections}n, for example: easy_reg=1; personalize_widget=0; icon_in_cornflower_blue=1;
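A logged `ab_selections` value like the one on this slide can be split back into named flags when segmenting requests. The following is a minimal sketch, assuming the field is a semicolon-separated list of `name=value` pairs as the slide suggests; the function name `parse_ab_selections` is hypothetical and not part of Etsy's tooling.

```python
def parse_ab_selections(raw):
    """Split 'name=value; name=value;' into a dict of strings.

    Assumption: pairs are separated by ';' and may be padded with spaces,
    as in the example value shown on the slide.
    """
    selections = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip the empty piece after a trailing ';'
        name, _, value = pair.partition("=")
        selections[name] = value
    return selections


# Example value copied from the slide.
example = "easy_reg=1; personalize_widget=0;icon_in_cornflower_blue=1;"
print(parse_ab_selections(example))
```

Values are kept as strings so that non-numeric variants would survive unchanged.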
  • Coming soon... %{locale}n (i18n), %{platform}n (desktop vs. mobile). OPS-1805, OPS-1827. etsy.com/careers
  • Using something else? Time, HTTP method, request URI, response code, referer, user-agent, response time, response memory, custom segmentation fields...
  • Quick averages
    grep "GET /listing/" access.log | awk '{sum=sum+$(NF-1)} END {print sum/NR}'
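The same quick average can be sketched in Python for cases where the logic needs to grow beyond a one-liner. This assumes, as in the awk command, that the value of interest is the second-to-last whitespace-separated field (`%{php_microsec}n` in the “Steroids” format). The function name `average_field` and the sample log lines are invented for illustration.

```python
def average_field(lines, needle="GET /listing/", field_from_end=2):
    """Average the Nth-from-last field of every line containing `needle`."""
    total = 0.0
    count = 0
    for line in lines:
        if needle not in line:
            continue
        fields = line.split()
        if len(fields) < field_from_end:
            continue
        try:
            total += float(fields[-field_from_end])
        except ValueError:
            continue  # e.g. "-" when the note was never set for the request
        count += 1
    return total / count if count else None


# Made-up, shortened sample lines; real input would come from access.log.
sample = [
    '10.0.0.1 - [04/Mar/2011:18:37:02 -0500] "GET /listing/42 HTTP/1.1" 200 1024 250000 260000',
    '10.0.0.2 - [04/Mar/2011:18:37:09 -0500] "GET /listing/43 HTTP/1.1" 200 2048 252000 263000',
    '10.0.0.3 - [04/Mar/2011:18:37:10 -0500] "GET /shop/7 HTTP/1.1" 200 512 90000 95000',
]
print(average_field(sample))
```

Unlike the awk version, which divides by every line grep passes through, this sketch divides only by the lines that carried a numeric value.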
  • Quick graphs
    grep "GET /listing/" access.log | perl -pe "s/.*\[.*\d{4}:(\d{2}):(\d{2}):\d{2}.*\]/\1:\2/" | awk '{print $1, $(NF-1)}' > /tmp/pagetimes.dat
    gives you...
  • Quick graphs
    # /tmp/pagetimes.dat
    18:37 251.0
    18:38 252.1
    18:39 253.5
    18:40 251.0
    18:45 250.0
    and then...
  • Quick graphs
    # GNUPLOT
    set terminal png
    set output "listings.png"
    set yrange [0:2000]
    set xdata time
    set timefmt "%d/%B/%Y:%H:%M:%S"
    set format x "%H:%M"
    plot "/tmp/pagetimes.dat" using 1:2 with points
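The extraction step that feeds the plot (pull `HH:MM` out of the bracketed timestamp and pair it with the second-to-last field) could also be written in Python. This is a sketch under the assumption that timestamps use Apache's standard `%t` layout, `[dd/Mon/yyyy:HH:MM:SS zone]`; the name `page_times` and the sample lines are hypothetical.

```python
import re

# Matches Apache's %t timestamp and captures the hour and minute.
TIMESTAMP = re.compile(r"\[\d{2}/\w{3}/\d{4}:(\d{2}):(\d{2}):\d{2}[^\]]*\]")


def page_times(lines, needle="GET /listing/"):
    """Return 'HH:MM value' rows, one per matching request."""
    rows = []
    for line in lines:
        if needle not in line:
            continue
        match = TIMESTAMP.search(line)
        fields = line.split()
        if match is None or len(fields) < 2:
            continue
        rows.append("%s:%s %s" % (match.group(1), match.group(2), fields[-2]))
    return rows


# Made-up, shortened sample lines standing in for access.log.
sample = [
    '10.0.0.1 - [04/Mar/2011:18:37:02 -0500] "GET /listing/42 HTTP/1.1" 200 1024 251000 260000',
    '10.0.0.2 - [04/Mar/2011:18:38:15 -0500] "GET /listing/43 HTTP/1.1" 200 2048 252100 263000',
]
for row in page_times(sample):
    print(row)
```

The rows have the same two-column shape as /tmp/pagetimes.dat, so they can be written to a file and handed to a plotting tool.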
  • Quick graphs
  • Error logs: PHP + Apache errors in one file. Simple logging interface.
  • Error logs. Levels: error, info, debug. Namespace: perf, sql, __class__
  • Logger::error("Query exceeded 5 sec: $query", "sql_long_query");
  • web0054 [Fri Mar 04 16:27:48 2011] [error] [sql_long_query] [mk04gw1p71] Query exceeded 5 sec: SELECT * FROM ...
  • $ grep "16:27:48" access.log | wc -l
    1527
  • web0054 [Fri Mar 04 16:27:48 2011] [error] [sql_long_query] [mk04gw1p71] Query exceeded 5 sec: SELECT * FROM ...
  • In other words: error.log -> request_uid -> access.log, which gives you the request URI, A/B selections, user id, locale, platform, API key, etc.
  • Filtering: tail -f error.log | grep -v "sql_long_query" | ...
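The error.log to access.log hop can be sketched as a small lookup: pull the request uid out of an error line, then keep the access-log lines that carry the same uid. This assumes the error-line layout shown in the earlier slide (host, timestamp, level, namespace, request uid, message); the names `request_uid_of` and `requests_for` and the shortened access-log line are invented for illustration.

```python
import re

# Layout assumed from the slide:
# host [timestamp] [level] [namespace] [request_uid] message
ERROR_LINE = re.compile(
    r"^(?P<host>\S+) \[(?P<time>[^\]]+)\] \[(?P<level>[^\]]+)\]\s*"
    r"\[(?P<namespace>[^\]]+)\] \[(?P<request_uid>[^\]]+)\] (?P<message>.*)$"
)


def request_uid_of(error_line):
    """Return the request uid embedded in an error-log line, or None."""
    match = ERROR_LINE.match(error_line)
    return match.group("request_uid") if match else None


def requests_for(error_line, access_lines):
    """Return the access-log lines whose fields include the error's request uid."""
    uid = request_uid_of(error_line)
    if uid is None:
        return []
    return [line for line in access_lines if uid in line.split()]


error = ("web0054 [Fri Mar 04 16:27:48 2011] [error] [sql_long_query] "
         "[mk04gw1p71] Query exceeded 5 sec: SELECT * FROM ...")
# Made-up, shortened access-log lines.
access = [
    '10.0.0.1 - [04/Mar/2011:16:27:48 -0500] "GET /listing/42 HTTP/1.1" 200 1024 mk04gw1p71 2048 251000 260000',
    '10.0.0.2 - [04/Mar/2011:16:27:48 -0500] "GET /shop/7 HTTP/1.1" 200 512 zz91ab33cd 1024 90000 95000',
]
print(request_uid_of(error))
print(requests_for(error, access))
```

Matching on the uid rather than the timestamp avoids wading through every request logged in the same second.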
  • web0001 [04:28:54 2011] [error] [client 10.101.x.x] Help me, Rhonda.
    web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
    web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!
    web0001 [04:28:54 2011] [error] [client 10.101.x.x] Heeeeeeellllllllllllllppp
    web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
    web0001 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
    web0201 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
    web0034 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
    web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
    web1101 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
    web0201 [04:28:54 2011] [error] [client 10.101.x.x] Youve been eaten by a gr
    web0055 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!!!
    web0002 [04:28:54 2011] [warning] [client 10.101.x.x] Sky is falling.
    web0089 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
    web0020 [04:28:54 2011] [error] [client 10.101.x.x] Sky is falling.
    web1101 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
    web0055 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
    web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
    web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
    web0034 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
    web0087 [04:28:54 2011] [fatal] [client 10.101.x.x] Sky is falling.
    web0002 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
    web0201 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
    web0077 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
    web0355 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
    web0052 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
    web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
    web0003 [04:28:54 2011] [error] [client 10.101.x.x] Youve been eaten by a gr
    web0066 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!!!
    web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Sky is falling
    web0020 [04:28:54 2011] [error] [client 10.101.x.x] Sky is falling.
    web1101 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
    web0055 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
    web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
    web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
  • Trending: fatals, errors, warnings
  • Logster: Run by cron. Maintains a cursor on log files. Simple parsing & aggregation. Output to Ganglia or Graphite. github.com/etsy
  • web0054 [Fri Mar 04 16:27:48 2011] [error] [login] [mk04gw1p71] User login failed. Reason: wrong password for ...
  • ^.+ \[.+\] \[(?P<log_level>.+)\]
  • if (fields['log_level'] == "fatal"):
        self.fatals += 1
    elif (fields['log_level'] == "error"):
        self.errors += 1
    elif (fields['log_level'] == "warning"):
        self.warnings += 1
    ...
  • MetricObject("fatals", (self.fatals / self.duration), "per sec")
    MetricObject("errors", (self.errors / self.duration), "per sec")
    MetricObject("warnings", (self.warnings / self.duration), "per sec")
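The count-then-divide idea behind these parser fragments can be tried without Logster installed. The following is a standalone sketch, not Logster's actual API: the class `LevelCounter` and its methods are hypothetical, and the pattern uses `[^\]]+` instead of the slides' `.+` so that the capture stops at the first bracketed field after the timestamp.

```python
import re

# host [timestamp] [log_level] ...; each bracketed field stops at its own ']'.
LEVEL = re.compile(r"^\S+ \[[^\]]+\] \[(?P<log_level>[^\]]+)\]")


class LevelCounter:
    """Count fatal/error/warning lines and report them as per-second rates."""

    def __init__(self):
        self.counts = {"fatal": 0, "error": 0, "warning": 0}

    def parse_line(self, line):
        match = LEVEL.match(line)
        if match and match.group("log_level") in self.counts:
            self.counts[match.group("log_level")] += 1

    def rates(self, duration_seconds):
        # Dividing by the sampling window turns raw counts into rates,
        # which stay comparable even if the polling interval changes.
        return {level: count / duration_seconds
                for level, count in self.counts.items()}


counter = LevelCounter()
for line in [
    "web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!",
    "web0001 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!",
    "web0002 [04:28:54 2011] [error] [client 10.101.x.x] Sky is falling.",
]:
    counter.parse_line(line)
print(counter.rates(duration_seconds=60))
```

In a cron-driven setup the duration would be the time since the previous run, and the resulting rates would be sent on to Ganglia or Graphite.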
  • fatals errors warnings
  • Logster: Signed-in vs. Signed-out
  • github.com/etsy
  • Log a plethora of data. Don’t be afraid to use one file.
  • Use custom fields to segment data.
  • Correlate errors to specific requests.
  • Make f#@k!ng graphs.
  • Convert rates to trend lines.
  • Take my logs. Please!
  • Thank you. codeascraft.etsy.com, github.com/etsy. Mike Brittain, Director of Engineering, Infrastructure, Etsy.com. mike@etsy.com, @mikebrittain