1. Log analysis system
with Hadoop
in livedoor 2013 Winter
2013/01/20
Hadoop Conference Japan 2013 Winter
TAGOMORI Satoshi (@tagomoris)
NHN Japan Corp.
Monday, January 21, 2013
2. TAGOMORI SATOSHI (@TAGOMORIS)
NHN JAPAN CORP.
WEB SERVICE BUSINESS DIVISION DEVELOPMENT DEPARTMENT 2
(IN JAN 2012, LIVEDOOR -> NHN JAPAN)
7. large scale web services
400+ Web Servers
5Gbps @ Aug 2009
15Gbps @ Aug 2011
20+Gbps @ Jan 2013
(direct outbound + CDN)
8. giant access log traffic
At Aug 2011 (HCJ2011)
From 96 servers
580GB/day
9. giant access log traffic
NOW (At Jan 2013 HCJ2013W)
From 320+ servers
1.5+ TB/day (raw)
5,300,000,000+ lines/day
120,000+ lines/sec (peak time)
400Mbps log traffic
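As a rough sanity check on the figures above, the daily totals can be converted into averages. This is only back-of-the-envelope arithmetic: it assumes decimal terabytes, and because the slide gives lower bounds ("+"), the results are lower bounds too. The variable names are illustrative.

```python
# Back-of-the-envelope averages derived from the figures on this slide.
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_day = 1.5e12        # 1.5+ TB/day (raw), assuming decimal terabytes
lines_per_day = 5.3e9         # 5,300,000,000+ lines/day
peak_lines_per_sec = 120_000  # 120,000+ lines/sec at peak time

avg_lines_per_sec = lines_per_day / SECONDS_PER_DAY
avg_line_bytes = bytes_per_day / lines_per_day
avg_mbps = bytes_per_day * 8 / SECONDS_PER_DAY / 1e6
peak_to_avg = peak_lines_per_sec / avg_lines_per_sec

print(f"average lines/sec : {avg_lines_per_sec:,.0f}")
print(f"average line size : {avg_line_bytes:.0f} bytes")
print(f"average bandwidth : {avg_mbps:.0f} Mbps")
print(f"peak / average    : {peak_to_avg:.2f}")
```

On these assumptions the average works out to roughly 61,000 lines/sec, about 283 bytes per line and about 139 Mbps, so the peak line rate is close to twice the daily average. The 400Mbps figure on the slide is not derived here.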
10. What we want to do
COUNT PV, UU and others (daily)
COUNT Service metrics (daily/hourly)
FIND Unexpected Errors [4xx, 5xx] (immediately)
CHECK Response Times (immediately)
SEARCH Logs during troubles (hourly/immediately)
11. Batches and Streams
Hadoop is for batches
High performance batch is important
HDFS has good performance
Stream log writing and calculations
are also VERY VERY IMPORTANT
Hybrid System:
Stream processing + Batch
24. Fluentd stream processing
out_exec_filter
any filter program that uses STDIN/STDOUT
compatible with Hadoop Streaming!
out_hoop
output plugin to write HDFS over Hoop
Hoop: a.k.a. HttpFs in Hadoop 2.0.x
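A minimal sketch of the kind of STDIN/STDOUT filter that out_exec_filter (or Hadoop Streaming) can run. The record layout is an assumption made for illustration: tab-separated fields with the HTTP status in the second column. The function name `add_status_class` is hypothetical and is not part of Fluentd.

```python
import io
import sys


def add_status_class(src, dst):
    """Copy TSV records from src to dst, appending a status-class column.

    Assumed (hypothetical) layout: path <TAB> status <TAB> response_time
    """
    for line in src:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed records instead of crashing the stream
        status = fields[1]
        # "200" -> "2xx", "503" -> "5xx"; anything non-numeric -> "unknown"
        status_class = status[0] + "xx" if status[:1].isdigit() else "unknown"
        dst.write("\t".join(fields + [status_class]) + "\n")
    dst.flush()


# In-memory demonstration; a real filter would pass sys.stdin and sys.stdout.
sample = io.StringIO("/index\t200\t1200\n/api\t503\t900000\n")
add_status_class(sample, sys.stdout)
```

Because the function only reads lines from one stream and writes lines to another, calling `add_status_class(sys.stdin, sys.stdout)` turns the same code into a filter process that either kind of runner can drive.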
31. Fluentd online aggregation
Semi-realtime aggregation to:
count HTTP response errors
calculate avg/percentiles of response times
draw graphs immediately
Many plugins for real time aggregation
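The aggregation described above can be sketched for a single time window as follows. This is not the code of any Fluentd plugin; it is an illustration that assumes each record is a `(status_code, response_time)` pair and uses the nearest-rank definition of a percentile. The names `aggregate` and `percentile` are hypothetical.

```python
from collections import Counter


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    # Integer ceiling of pct * n / 100, to avoid floating-point rounding.
    rank = max(1, (pct * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def aggregate(records):
    """Summarize (status_code, response_time) pairs from one time window."""
    counts = Counter()
    times = []
    for status, response_time in records:
        counts[f"{status // 100}xx"] += 1  # 404 -> "4xx", 503 -> "5xx"
        times.append(response_time)
    times.sort()
    summary = {"counts": dict(counts)}
    if times:
        summary["avg"] = sum(times) / len(times)
        for pct in (90, 95, 98, 99):
            summary[f"p{pct}"] = percentile(times, pct)
    return summary


window = [(200, 120), (200, 80), (404, 15), (503, 2500)]
print(aggregate(window))
```

Running this once per short window and emitting the summary each time yields the semi-realtime series that can then be drawn as graphs.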
32. Graph Tools:
GrowthForecast / HRForecast
Graph drawing tools whose values are updated
over a very simple HTTP request
GrowthForecast: Real-time values
HRForecast: Summarized (past) values
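To illustrate how simple such an update request can be, here is a sketch that builds (but does not send) one. It assumes GrowthForecast's documented endpoint shape, a POST to `/api/<service>/<section>/<graph>` with a `number` form parameter. The host, port and graph names below are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_update_request(base_url, service, section, graph, number):
    """Build a POST request that sets one graph value.

    Assumes the endpoint shape /api/<service>/<section>/<graph>.
    """
    path = "/".join(quote(part, safe="") for part in (service, section, graph))
    body = urlencode({"number": int(number)}).encode("ascii")
    return Request(f"{base_url.rstrip('/')}/api/{path}", data=body, method="POST")


# Hypothetical host and graph names, for illustration only.
req = build_update_request(
    "http://gf.example.com:5125", "blog", "httpstatus", "5xx_count", 42
)
print(req.get_method(), req.full_url, req.data)
```

Passing the request to `urllib.request.urlopen(req)` would actually send it. That call is left out so the sketch runs without a server.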
33. HTTP Status/Response Time
on GrowthForecast
HTTP STATUS: 2XX(BLUE),3XX(GREEN),4XX(ORANGE), 5XX(RED)
HTTP RESPONSE TIMES: AVG, [90, 95, 98, 99]PERCENTILE
http://kazeburo.github.com/GrowthForecast/
37. 3rd gen: +++++++
NO SPOF: for data stream
Real time monitoring
Queries for services:
Scheduled queries, Visualization
Latency: hourly rotation for any queries
SPOF: NameNode (VIP & DRBD is xxxx...)
39. 4th gen: CDH4.1.2
NO SPOF: QJM based NameNode HA
Performance: YARN (?)
Latency: multiple rotation in an hour
with hive table schema change
NONE should be improved!
40. Good parts for solo engineer:
RPC: Loosely-coupled architecture
High compatibility / Low maintenance cost
Open Source
All components are OSS
Open knowledge
Well blogged / well presented
41. OUR DRIVER IS
"OPENNESS"
thanks to crouton & @kbysmnr !
43. See also:
Hadoop and Subsystems in livedoor (2011)
http://www.slideshare.net/tagomoris/hadoop-and-subsystems-in-livedoor-hcj11f
Distributed message stream processing on Fluentd
http://www.slideshare.net/tagomoris/distributed-stream-processing-on-fluentd-fluentd
Hive Tools in NHN Japan
http://www.slideshare.net/tagomoris/hive-tools-in-nhn-japan-hadoopreading
OSS based large scale log aggregation in livedoor
http://www.slideshare.net/tagomoris/oss-nhntech
Fluentd and WebHDFS
http://www.slideshare.net/tagomoris/fluentd-and-webhdfs