Log analysis with Hadoop in livedoor 2013

  1. Log analysis system with Hadoop in livedoor 2013 Winter. 2013/01/20, Hadoop Conference Japan 2013 Winter. TAGOMORI Satoshi (@tagomoris), NHN Japan Corp. (Slide footer date: Monday, January 21, 2013)
  2. TAGOMORI Satoshi (@tagomoris): NHN Japan Corp., Web Service Business Division, Development Department 2 (in Jan 2012, livedoor -> NHN Japan)
  3. (no transcribed text)
  4. (no transcribed text)
  5. livedoor in NHN Japan
  6. (no transcribed text)
  7. Large scale web services: 400+ web servers; 5 Gbps @ Aug 2009, 15 Gbps @ Aug 2011, 20+ Gbps @ Jan 2013 (direct outbound + CDN)
  8. Giant access log traffic at Aug 2011 (HCJ2011): from 96 servers, 580 GB/day
  9. Giant access log traffic NOW (at Jan 2013, HCJ2013W): from 320+ servers; 1.5+ TB/day (raw); 5,300,000,000+ lines/day; 120,000+ lines/sec (peak time); 400 Mbps log traffic
  10. What we want to do: COUNT PV, UU and others (daily); COUNT service metrics (daily/hourly); FIND unexpected errors [4xx, 5xx] (immediately); CHECK response times (immediately); SEARCH logs during troubles (hourly/immediately)
  11. Batches and streams. Hadoop is for batches: high-performance batch is important, and HDFS has good performance. Stream log writing and calculations are also VERY VERY IMPORTANT. Hybrid system: stream processing + batch
  12. System Overview [architecture diagram; labels: Web Servers, Fluentd Cluster, Archive Storage, "(scribed)", STREAM, Fluentd Watchers, Notifications (IRC), Graph Tools, webhdfs, BATCH, SCHEDULED BATCH, Hadoop Cluster, "(HDFS, YARN)", hive server, Shib, ShibUI, Huahin Manager]
  13. Hadoop in livedoor 2013: 18 nodes (Master 3 + Slave 15); 120 cores, 180 GB RAM, 100 TB HDFS; CDH4.1.2; NameNode HA (QJM), WebHDFS; YARN, Hive + HiveServer1
  14. Fluentd in livedoor 2013: 16 nodes (Deliver 4 + Worker 10 + Watcher 2); Fluentd (latest release / trunk); Ruby-based message transfer daemon; many plugins from rubygems.org
  15. Hadoop/Fluentd engineer in livedoor 2013: 1 person.
  16. Processes overview: log collection / archiving; parse / transform / add flags; load into Hive tables; on-demand queries; scheduled queries; stream aggregations + notifications
  17. Past and present. 1st gen: fully batch (late 2011), Scribed + Hadoop. 2nd gen: partially stream processing (early 2012), Fluentd + Hadoop. 3rd gen: fully stream processing (late 2012), Fluentd + Hadoop + Graph Tools. 4th gen: new cluster with CDH4 (early 2013)
  18. BREAK.
  19. 1st gen: First impl. [architecture diagram; labels: Web Servers, Scribed, Archive Storage, "(scribed)", STREAM (LIBHDFS), BATCH, "(Hadoop Streaming)", Hadoop Cluster, CDH3b2, hive server, Shib]
  20. Shib: Hive web client. https://github.com/tagomoris/shib
  21. 1st gen: fully batch. Log collection / archiving: Scribed (libhdfs). Parse / transform / add flags: Hadoop Streaming. Load into Hive tables and on-demand queries: HiveServer + Shib. Scheduled queries: (none listed). Stream aggregations + notifications: (none listed)
  22. 1st gen: fully batch. Simplicity: easy to implement. Shib: easy to run on-demand queries. Latency: hourly rotation + import batch. Performance: import batch needs CPU. Scribed: libhdfs dependency problem
  23. 2nd gen: +Fluentd [architecture diagram; labels: Web Servers, Fluentd Cluster, Archive Storage, "(scribed)", STREAM, Cloudera Hoop, BATCH, Hadoop Cluster, CDH3u2, hive server, Shib, Huahin Manager, "(Hive)"]
  24. Fluentd stream processing. out_exec_filter: any filter program with STDIN/STDOUT, compatible with Hadoop Streaming! out_hoop: output plugin to write to HDFS over Hoop. Hoop: a.k.a. HttpFs in Hadoop 2.0.x
  25. Fluentd stream processing [data-flow diagram; labels: Web Servers, Fluentd deliver (x3), Fluentd worker (x6), Hoop Server, HDFS]
  26. Huahin Manager: REST API for JobTracker (MRv1), ResourceManager (YARN), and HiveServer. http://huahinframework.org/huahin-manager/
  27. 2nd gen: +Fluentd. Log collection / archiving: Fluentd. Parse / transform / add flags: Fluentd. Load into Hive tables and on-demand queries: HiveServer + Shib. Scheduled queries: (none listed). Stream aggregations + notifications: (none listed)
  28. 2nd gen: +Fluentd. Compatibility: RPC-based HDFS/JobTracker access. Performance: import needs no CPU (load only). Latency: hourly rotation only. Latency: hourly rotation for any queries. Hoop Server: SPOF / traffic bottleneck
  29. 3rd gen: ++++++ [architecture diagram; labels: Web Servers, Fluentd Cluster, Archive Storage, "(scribed)", STREAM, Fluentd Watchers, Notifications (IRC), Graph Tools, webhdfs, BATCH, SCHEDULED BATCH, Hadoop Cluster, CDH3u5, hive server, Shib, ShibUI, Huahin Manager, "(Hive)"]
  30. WebHDFS (CDH3u5 or CDH4) [comparison diagram. HttpFs (Hoop): Client, httpfs server, NameNode, DataNode (x3); links labeled HTTP and Java Native. WebHDFS: Client, NameNode, DataNode (x3); links labeled HTTP]
  31. Fluentd online aggregation. Semi-realtime aggregation to: count errors in HTTP responses; calculate avg/percentiles of response time; draw graphs immediately. Many plugins for real-time aggregation
  32. Graph tools: GrowthForecast / HRForecast. Graph drawing tools whose values are updated over very simple HTTP requests. GrowthForecast: real-time values. HRForecast: summarized (past) values
  33. HTTP status / response time on GrowthForecast. HTTP status: 2xx (blue), 3xx (green), 4xx (orange), 5xx (red). HTTP response times: avg, [90, 95, 98, 99] percentile. http://kazeburo.github.com/GrowthForecast/
  34. ShibUI
  35. ShibUI https://github.com/kazeburo/hrforecast
  36. 3rd gen: +++++++. Log collection / archiving: Fluentd. Parse / transform / add flags: Fluentd. Load into Hive tables and on-demand queries: HiveServer + Shib. Scheduled queries: ShibUI. Stream aggregations + notifications: Fluentd
  37. 3rd gen: +++++++. NO SPOF: for data stream. Real-time monitoring. Queries for services: scheduled queries, visualization. Latency: hourly rotation for any queries. SPOF: NameNode (VIP & DRBD is xxxx...)
  38. 4th gen: NOW [architecture diagram; labels: Web Servers, Fluentd Cluster, Archive Storage, "(scribed)", STREAM, Fluentd Watchers, Notifications (IRC), Graph Tools, webhdfs, BATCH, SCHEDULED BATCH, Hadoop Cluster, CDH4, "(HDFS, YARN)", hive server, Shib, ShibUI, Huahin Manager]
  39. 4th gen: CDH4.1.2. NO SPOF: QJM-based NameNode HA. Performance: YARN (?). Latency: multiple rotations in an hour with Hive table schema change. NONE should be improved!
  40. Good parts for a solo engineer. RPC: loosely coupled architecture, high compatibility / low maintenance cost. Open source: all components are OSS. Open knowledge: well blogged / presented
  41. OUR DRIVER IS "OPENNESS". Thanks to crouton & @kbysmnr!
  42. Software list: https://ccp.cloudera.com/display/SUPPORT/Downloads http://fluentd.org/ http://fluentd.org/plugin/ https://github.com/tagomoris/fluent-agent-lite https://github.com/tagomoris/shib https://github.com/tagomoris/shibui http://huahinframework.org/huahin-manager/ http://kazeburo.github.com/GrowthForecast/ http://github.com/kazeburo/hrforecast
  43. See also: Hadoop and Subsystem in livedoor (2011) (http://www.slideshare.net/tagomoris/hadoop-and-subsystems-in-livedoor-hcj11f); Distributed message stream processing on Fluentd (http://www.slideshare.net/tagomoris/distributed-stream-processing-on-fluentd-fluentd); Hive Tools in NHN Japan (http://www.slideshare.net/tagomoris/hive-tools-in-nhn-japan-hadoopreading); OSS based large scale log aggregation in livedoor (http://www.slideshare.net/tagomoris/oss-nhntech); Fluentd and WebHDFS (http://www.slideshare.net/tagomoris/fluentd-and-webhdfs)
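
The traffic figures on slide 9 can be cross-checked with a little arithmetic. The sketch below derives average rates from the daily totals, assuming "TB" means decimal terabytes (10^12 bytes) and that the "+" lower bounds are taken at face value; the variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted on slide 9 (all are lower bounds, marked "+" on the slide).
lines_per_day = 5_300_000_000
bytes_per_day = 1.5e12          # assumption: 1 TB = 10**12 bytes
peak_lines_per_sec = 120_000

# Daily totals spread evenly over 24 hours.
avg_lines_per_sec = lines_per_day / SECONDS_PER_DAY
avg_mbps = bytes_per_day * 8 / SECONDS_PER_DAY / 1e6
avg_bytes_per_line = bytes_per_day / lines_per_day
peak_to_avg = peak_lines_per_sec / avg_lines_per_sec

print(f"average lines/sec : {avg_lines_per_sec:,.0f}")
print(f"average Mbps (raw): {avg_mbps:.1f}")
print(f"bytes per line    : {avg_bytes_per_line:.0f}")
print(f"peak / average    : {peak_to_avg:.2f}")
```

Under these assumptions the average is roughly 61,000 lines/sec and about 139 Mbps of raw log data, so the quoted peak of 120,000+ lines/sec is close to twice the daily average.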
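
Slide 24 says out_exec_filter accepts any filter program that reads STDIN and writes STDOUT, the same contract as Hadoop Streaming. A minimal sketch of such a filter follows. The record layout (tab-separated, HTTP status in the first column) and the `ok`/`error` flag are assumptions made for illustration, not the format used at livedoor.

```python
import io


def add_flags(line: str) -> str:
    """Append a flag column to one tab-separated log record.

    Hypothetical layout: the first column is the HTTP status code.
    """
    fields = line.rstrip("\n").split("\t")
    status = int(fields[0])
    flag = "error" if status >= 400 else "ok"
    return "\t".join(fields + [flag])


def run(instream, outstream) -> None:
    """Read records line by line, write the transformed records."""
    for line in instream:
        if line.strip():  # skip blank lines
            outstream.write(add_flags(line) + "\n")


# Demonstration on in-memory streams instead of real STDIN/STDOUT.
demo_in = io.StringIO("200\t/index\n503\t/api\n")
demo_out = io.StringIO()
run(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

As a standalone filter script, the same function would be called as `run(sys.stdin, sys.stdout)`. Because it only depends on line-oriented input and output, one script can serve both a streaming daemon and a batch job.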
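
Slide 30 contrasts HttpFs, where every request passes through one httpfs server, with WebHDFS, where the client speaks HTTP to the NameNode and DataNodes directly. The sketch below only builds a WebHDFS REST URL, which has the form `/webhdfs/v1/<path>?op=...`. The host name, port 50070, and user name are placeholder assumptions, and nothing is sent over the network.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, path: str, op: str, **params: str) -> str:
    """Build a WebHDFS REST URL for an absolute HDFS path and an operation."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical NameNode address and user; adjust for a real cluster.
url = webhdfs_url(
    "namenode.example.com", 50070, "/log/access.log", "APPEND",
    **{"user.name": "fluentd"},
)
print(url)
```

For write operations such as `CREATE` and `APPEND`, the NameNode answers with an HTTP redirect to a DataNode and the client then sends the data there. That is why WebHDFS removes the single proxy that slide 28 lists as a SPOF and traffic bottleneck.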
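
The aggregations named on slide 31 (error counts by HTTP status, average and percentiles of response time) can be sketched for one time window as below. This is not the Fluentd plugin code. The `(status, response_time)` record shape and the nearest-rank percentile method are assumptions chosen for simplicity.

```python
from collections import Counter


def status_class(status: int) -> str:
    """Map 200 -> '2xx', 404 -> '4xx', and so on."""
    return f"{status // 100}xx"


def percentile(sorted_values, pct: int):
    """Nearest-rank percentile of a non-empty, ascending list."""
    rank = (pct * len(sorted_values) + 99) // 100  # integer ceiling
    return sorted_values[max(rank, 1) - 1]


def aggregate(records):
    """Summarize one window of (status, response_time) records."""
    counts = Counter(status_class(status) for status, _ in records)
    times = sorted(t for _, t in records)
    return {
        "counts": dict(counts),
        "avg": sum(times) / len(times),
        "percentiles": {p: percentile(times, p) for p in (90, 95, 98, 99)},
    }


# Ten hypothetical records: eight 200s, one 404, one 500; times 10..100 ms.
window = [(200, t) for t in range(10, 90, 10)] + [(404, 90), (500, 100)]
summary = aggregate(window)
print(summary)
```

Running this per window (for example once a minute) yields the kind of values that slide 33 shows plotted as status counts and response-time percentiles.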
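
Slide 32 describes GrowthForecast as updated "over very simple HTTP request". A hedged sketch of such an update: GrowthForecast's documented API takes a POST to `/api/<service>/<section>/<graph>` with a `number` form field. The base URL, port 5125, and the graph names here are placeholders, and the request is only constructed, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def growthforecast_request(base_url: str, service: str, section: str,
                           graph: str, number: int) -> Request:
    """Build (but do not send) a POST that updates one graph value."""
    url = f"{base_url}/api/{quote(service)}/{quote(section)}/{quote(graph)}"
    body = urlencode({"number": number}).encode("ascii")
    return Request(url, data=body, method="POST")


# Hypothetical server and graph path.
req = growthforecast_request(
    "http://gf.example.com:5125", "blog", "httpstatus", "5xx_count", 12
)
print(req.get_method(), req.full_url, req.data)
```

Passing the returned object to `urllib.request.urlopen` would send it. Since one graph update is one small POST, any aggregation step can publish a value without a client library.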
