Log analysis with Hadoop in livedoor 2013

Transcript

  • 1. Log analysis system with Hadoop in livedoor 2013 Winter. 2013/01/20, Hadoop Conference Japan 2013 Winter. TAGOMORI Satoshi (@tagomoris), NHN Japan Corp. (slide footer date: Monday, January 21, 2013)
  • 2. TAGOMORI Satoshi (@tagomoris), NHN Japan Corp., Web Service Business Division, Development Department 2 (in Jan 2012, livedoor -> NHN Japan)
  • 5. livedoor in NHN Japan
  • 7. Large-scale web services: 400+ web servers; 5 Gbps @ Aug 2009, 15 Gbps @ Aug 2011, 20+ Gbps @ Jan 2013 (direct outbound + CDN)
  • 8. Giant access log traffic, as of Aug 2011 (HCJ2011): from 96 servers, 580 GB/day
  • 9. Giant access log traffic NOW (Jan 2013, HCJ2013W): from 320+ servers; 1.5+ TB/day (raw); 5,300,000,000+ lines/day; 120,000+ lines/sec (peak time); 400 Mbps log traffic
  • 10. What we want to do: COUNT PV, UU and others (daily); COUNT service metrics (daily/hourly); FIND unexpected errors [4xx, 5xx] (immediately); CHECK response times (immediately); SEARCH logs when trouble occurs (hourly/immediately)
  • 11. Batches and Streams: Hadoop is for batches. High-performance batch processing is important, and HDFS has good performance. Stream log writing and calculations are also VERY VERY IMPORTANT. Hybrid system: stream processing + batch
  • 12. System Overview (diagram). Components: Web Servers (scribed); Fluentd Cluster; Archive Storage; Fluentd Watchers; Notifications (IRC); Graph Tools; Hadoop Cluster (HDFS, YARN); hive server; Shib; ShibUI; Huahin Manager. Flow labels: STREAM (webhdfs), BATCH, SCHEDULED BATCH
  • 13. Hadoop in livedoor 2013: 18 nodes (Master 3 + Slave 15); 120 cores, 180 GB RAM, 100 TB HDFS; CDH4.1.2; NameNode HA (QJM), WebHDFS; YARN, Hive + HiveServer1
  • 14. Fluentd in livedoor 2013: 16 nodes (Deliver 4 + Worker 10 + Watcher 2); Fluentd (latest release / trunk); Ruby-based message transfer daemon; many plugins from rubygems.org
  • 15. Hadoop/Fluentd engineer in livedoor 2013: 1 person.
  • 16. Processes Overview: Log collection / Archiving; Parse / Transform / Add flags; Load into Hive tables; On-demand queries; Scheduled queries; Stream aggregations + Notifications
  • 17. Past and present. 1st gen: Fully batch (late 2011), Scribed + Hadoop; 2nd gen: Partially stream processing (early 2012), Fluentd + Hadoop; 3rd gen: Fully stream processing (late 2012), Fluentd + Hadoop + Graph Tools; 4th gen: New cluster with CDH4 (early 2013)
  • 18. BREAK.
  • 19. 1st gen: First impl. (diagram). Components: Web Servers (scribed); Scribed; Archive Storage; Hadoop Cluster, CDH3b2 (Hadoop Streaming); hive server; Shib. Flow labels: STREAM (LIBHDFS), BATCH
  • 20. Shib: Hive Web Client https://github.com/tagomoris/shib
  • 21. 1st gen: Fully batch. Log collection / Archiving: Scribed (libhdfs); Parse / Transform / Add flags: Hadoop Streaming; Load into Hive tables / On-demand queries: HiveServer + Shib; Scheduled queries: (no tool listed); Stream aggregations + Notifications: (no tool listed)
  • 22. 1st gen: Fully batch. Simplicity: easy to implement; Shib: easy to run on-demand queries; Latency: hourly rotation + import batch; Performance: import batch needs CPU; Scribed: libhdfs dependency problem
  • 23. 2nd gen: +Fluentd (diagram). Components: Web Servers (scribed); Fluentd Cluster; Archive Storage; Cloudera Hoop; Hadoop Cluster, CDH3u2 (Hive); hive server; Shib; Huahin Manager. Flow labels: STREAM, BATCH
  • 24. Fluentd stream processing. out_exec_filter: any filter program with STDIN/STDOUT, compatible with Hadoop Streaming! out_hoop: output plugin to write to HDFS over Hoop. Hoop: a.k.a. HttpFs in Hadoop 2.0.x
  • 25. Fluentd stream processing (diagram). Components: Web Servers; Fluentd deliver (3 boxes); Fluentd worker (6 boxes); Hoop Server; HDFS
  • 26. Huahin Manager: REST API for JobTracker (MRv1), ResourceManager (YARN), and HiveServer. http://huahinframework.org/huahin-manager/
  • 27. 2nd gen: +Fluentd. Log collection / Archiving: Fluentd; Parse / Transform / Add flags: Fluentd; Load into Hive tables / On-demand queries: HiveServer + Shib; Scheduled queries: (no tool listed); Stream aggregations + Notifications: (no tool listed)
  • 28. 2nd gen: +Fluentd. Compatibility: RPC-based HDFS/JobTracker access; Performance: import needs no CPU (load only); Latency: hourly rotation only; Latency: hourly rotation for any queries; Hoop Server: SPOF / traffic bottleneck
  • 29. 3rd gen: ++++++ (diagram). Components: Web Servers (scribed); Fluentd Cluster; Archive Storage; Fluentd Watchers; Notifications (IRC); Graph Tools; Hadoop Cluster, CDH3u5 (Hive); hive server; Shib; ShibUI; Huahin Manager. Flow labels: STREAM (webhdfs), BATCH, SCHEDULED BATCH
  • 30. WebHDFS (CDH3u5 or CDH4) (diagram). HttpFs (Hoop): Client -> HTTP -> httpfs server -> Java Native -> NameNode / DataNodes. WebHDFS: Client -> HTTP -> NameNode / DataNodes
  • 31. Fluentd online aggregation. Semi-realtime aggregation to: count HTTP response errors; calculate avg/percentiles of response time; draw graphs immediately. Many plugins for real-time aggregation
  • 32. Graph Tools: GrowthForecast / HRForecast. Graph drawing tools whose values are updated over very simple HTTP requests. GrowthForecast: real-time values; HRForecast: summarized (past) values
  • 33. HTTP Status/Response Time on GrowthForecast. HTTP STATUS: 2XX (BLUE), 3XX (GREEN), 4XX (ORANGE), 5XX (RED). HTTP RESPONSE TIMES: AVG, [90, 95, 98, 99] PERCENTILE. http://kazeburo.github.com/GrowthForecast/
  • 34. ShibUI
  • 35. ShibUI https://github.com/kazeburo/hrforecast
  • 36. 3rd gen: +++++++. Log collection / Archiving: Fluentd; Parse / Transform / Add flags: Fluentd; Load into Hive tables / On-demand queries: HiveServer + Shib; Scheduled queries: ShibUI; Stream aggregations + Notifications: Fluentd
  • 37. 3rd gen: +++++++. NO SPOF: for data stream; Real-time monitoring; Queries for services: scheduled queries, visualization; Latency: hourly rotation for any queries; SPOF: NameNode (VIP & DRBD is xxxx...)
  • 38. 4th gen: NOW (diagram). Components: Web Servers (scribed); Fluentd Cluster; Archive Storage; Fluentd Watchers; Notifications (IRC); Graph Tools; Hadoop Cluster, CDH4 (HDFS, YARN); hive server; Shib; ShibUI; Huahin Manager. Flow labels: STREAM (webhdfs), BATCH, SCHEDULED BATCH
  • 39. 4th gen: CDH4.1.2. NO SPOF: QJM-based NameNode HA; Performance: YARN (?); Latency: multiple rotations in an hour with Hive table schema change; NONE should be improved!
  • 40. Good parts for a solo engineer. RPC: loosely coupled architecture, high compatibility / low maintenance cost. Open Source: all components are OSS. Open knowledge: well blogged / presented
  • 41. OUR DRIVER IS "OPENNESS" thanks to crouton & @kbysmnr !
  • 42. Software list: https://ccp.cloudera.com/display/SUPPORT/Downloads http://fluentd.org/ http://fluentd.org/plugin/ https://github.com/tagomoris/fluent-agent-lite https://github.com/tagomoris/shib https://github.com/tagomoris/shibui http://huahinframework.org/huahin-manager/ http://kazeburo.github.com/GrowthForecast/ http://github.com/kazeburo/hrforecast
  • 43. See also: Hadoop and Subsystem in livedoor (2011): http://www.slideshare.net/tagomoris/hadoop-and-subsystems-in-livedoor-hcj11f ; Distributed message stream processing on Fluentd: http://www.slideshare.net/tagomoris/distributed-stream-processing-on-fluentd-fluentd ; Hive Tools in NHN Japan: http://www.slideshare.net/tagomoris/hive-tools-in-nhn-japan-hadoopreading ; OSS based large scale log aggregation in livedoor: http://www.slideshare.net/tagomoris/oss-nhntech ; Fluentd and WebHDFS: http://www.slideshare.net/tagomoris/fluentd-and-webhdfs
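
The traffic figures on slide 9 can be cross-checked by converting the daily totals into average rates. The sketch below is only arithmetic on the numbers quoted there; it assumes decimal units (1 TB = 10^12 bytes) and a uniform 86,400-second day, and the constant and function names are illustrative rather than anything from the slides.

```python
# Figures quoted on slide 9 (all are lower bounds, "+" in the original).
LINES_PER_DAY = 5_300_000_000
BYTES_PER_DAY = 1.5e12          # 1.5 TB/day raw; decimal TB is an assumption
PEAK_LINES_PER_SEC = 120_000
SECONDS_PER_DAY = 86_400


def average_lines_per_sec(lines_per_day: float) -> float:
    """Average line rate if traffic were spread evenly over the day."""
    return lines_per_day / SECONDS_PER_DAY


def average_mbps(bytes_per_day: float) -> float:
    """Average bit rate in megabits per second (10^6 bits)."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


def average_bytes_per_line(bytes_per_day: float, lines_per_day: float) -> float:
    """Mean raw size of one log line."""
    return bytes_per_day / lines_per_day


if __name__ == "__main__":
    avg_lines = average_lines_per_sec(LINES_PER_DAY)
    print(f"average lines/sec : {avg_lines:,.0f}")
    print(f"peak / average    : {PEAK_LINES_PER_SEC / avg_lines:.2f}")
    print(f"average Mbps (raw): {average_mbps(BYTES_PER_DAY):.1f}")
    print(f"bytes per line    : {average_bytes_per_line(BYTES_PER_DAY, LINES_PER_DAY):.0f}")
```

Under these assumptions the average rate is roughly half the quoted peak of 120,000+ lines/sec, and the average raw volume is well below the 400 Mbps figure. The slides do not say how the 400 Mbps value was measured, so the gap should not be read as an inconsistency.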
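
Slide 24 describes out_exec_filter as running any filter program over STDIN/STDOUT, the same contract Hadoop Streaming uses. A minimal sketch of such a filter is shown below. It assumes tab-separated input lines whose first field is the HTTP status code; that field layout, the two appended flag columns, and the function names are hypothetical, and the Fluentd-side plugin configuration is not shown.

```python
import sys


def add_flags(line: str) -> str:
    """Append a status class (e.g. "4xx") and an error flag to one TSV line.

    Assumes the first tab-separated field is the HTTP status code
    (a hypothetical layout chosen for this sketch).
    """
    fields = line.rstrip("\n").split("\t")
    status = fields[0]
    if len(status) == 3 and status.isdigit():
        status_class = status[0] + "xx"
    else:
        status_class = "unknown"
    is_error = "1" if status_class in ("4xx", "5xx") else "0"
    return "\t".join(fields + [status_class, is_error])


def filter_lines(lines):
    """Transform an iterable of input lines, skipping blank ones."""
    for line in lines:
        if line.strip():
            yield add_flags(line)


def main() -> None:
    # One record in, one record out, line by line.
    for out in filter_lines(sys.stdin):
        sys.stdout.write(out + "\n")
        # Flush so a long-running parent process sees each record promptly.
        sys.stdout.flush()


if __name__ == "__main__":
    main()
```

Because the program only reads lines and writes lines, the same script could in principle be run as a Hadoop Streaming mapper or behind a streaming daemon, which is the compatibility the slide points out.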
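
Slide 30 contrasts HttpFs (Hoop), where the client talks HTTP to a single httpfs server, with WebHDFS, where the client talks HTTP to the NameNode and DataNodes directly. Both expose the same `/webhdfs/v1` REST path, so from the client's side the difference is mainly which host and port the request goes to. The sketch below only builds such URLs; the host names are hypothetical, and ports 50070 (NameNode web) and 14000 (HttpFs) are assumed defaults that a real cluster may override.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, path: str, op: str, params=None) -> str:
    """Build a WebHDFS-style REST URL: /webhdfs/v1/<path>?op=<OP>&..."""
    query = {"op": op}
    query.update(params or {})
    # quote() keeps "/" unescaped, so the HDFS path stays readable.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(query)}"


if __name__ == "__main__":
    hdfs_path = "/log/access/20130120.log"  # hypothetical path

    # WebHDFS: the first request goes to the NameNode, which answers a
    # CREATE with a redirect to a DataNode that receives the data.
    print(webhdfs_url("namenode.example.com", 50070, hdfs_path, "CREATE",
                      {"user.name": "fluentd"}))

    # HttpFs (Hoop): every request goes to the one httpfs server, which
    # is why slide 28 lists it as a SPOF and traffic bottleneck.
    print(webhdfs_url("httpfs.example.com", 14000, hdfs_path, "CREATE",
                      {"user.name": "fluentd"}))
```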
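
The aggregation described on slide 31 (counting HTTP responses by status class, plus the average and 90/95/98/99th percentiles of response time shown on slide 33) can be sketched for a single time window as follows. This is not the Fluentd plugins' implementation: the record shape `(status, response_time)`, the nearest-rank percentile method, and all function names are assumptions made for illustration.

```python
from collections import Counter


def status_class(status: int) -> str:
    """Map 200 -> "2xx", 404 -> "4xx", and so on."""
    return f"{status // 100}xx"


def percentile(sorted_values, p: int):
    """Nearest-rank percentile of an ascending, non-empty list (p in 1..100)."""
    rank = max(1, (p * len(sorted_values) + 99) // 100)  # ceil(p * n / 100)
    return sorted_values[rank - 1]


def summarize(records, percentiles=(90, 95, 98, 99)):
    """Aggregate one window of (status, response_time) records."""
    records = list(records)
    counts = Counter(status_class(status) for status, _ in records)
    times = sorted(t for _, t in records)
    summary = {"counts": dict(counts)}
    if times:
        summary["avg"] = sum(times) / len(times)
        for p in percentiles:
            summary[f"p{p}"] = percentile(times, p)
    return summary


if __name__ == "__main__":
    window = [(200, 120.0), (200, 95.0), (304, 20.0), (404, 15.0), (500, 2300.0)]
    print(summarize(window))
```

In a streaming setup a function like `summarize` would be called once per flush interval, and its output sent to a graph tool or a notifier.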
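
Slide 32 says the graph tools accept value updates over very simple HTTP requests. For GrowthForecast, an update is a POST to `/api/<service>/<section>/<graph>` with a `number` form parameter. The sketch below only constructs such a request and does not send it; the base URL, port, and graph names are hypothetical, and other parameters the tool accepts are left out.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_update_request(base_url: str, service: str, section: str,
                         graph: str, number: int) -> Request:
    """Build (but do not send) a POST that updates one graph value."""
    # Escape each path segment separately so names cannot add extra slashes.
    path = "/".join(quote(part, safe="") for part in (service, section, graph))
    body = urlencode({"number": int(number)}).encode("ascii")
    url = f"{base_url.rstrip('/')}/api/{path}"
    return Request(url, data=body, method="POST")


if __name__ == "__main__":
    # Hypothetical host and graph names, for illustration only.
    req = build_update_request("http://gf.example.com:5125",
                               "web", "access", "5xx_count", 12)
    print(req.get_method(), req.full_url, req.data)
    # To actually send it: urllib.request.urlopen(req)
```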