Prdc2012
    Prdc2012 Presentation Transcript

    • Method for Monitoring and Profiling of Hadoop using AspectJ
      Yusuke Shimizu, Kouhei Sakurai, Satoshi Yamane
      Graduate School of Natural Science & Technology, Kanazawa University
      PRDC2012 @ TOKIMESSE
    • Introduction
      Large-scale distributed systems are being used in more and more settings.
      A large-scale distributed system is a “Flexible and available architecture for large scale computation and data processing on a network of commodity hardware” [-- P. Julio, 2009]
      - e.g. Apache Hadoop
    • For Dependable Distributed Systems
      We have to consider and deal with:
      - Non-deterministic networks
      - Fault tolerance
      - Incomprehensible users
      Relying only on advance static analysis or verification is difficult.
      We also need runtime monitoring and analysis.
    • How to monitor and debug
      The general methods of debugging or monitoring Hadoop are:
      • logging text messages
      • checking metrics via Web interfaces, Ganglia, etc.
    • There are difficulties and requirements
      The general methods of debugging or monitoring Hadoop are:
      • logging text messages
        → difficult because of the huge number of nodes
      • checking metrics via Web interfaces, Ganglia, etc.
        → aimed at operators, not sufficient for developers
    • Proposal
      1. The Method Level Monitor
      2. The Adaptive Profiling
      - Provide effective information for development
      - Help developers to understand system behaviors and specifications
    • Outline of Talk
      Introduction
      - Distributed system’s difficulty
      Proposal
      - Monitor
      - Profile Method
      Experimental Results & Conclusion
    • 2. PROPOSALS
      The Runtime Monitor & The Adaptive Profiling Method
    • Outline of Proposed System
      [Diagram: Hadoop → Monitor → Profile]
      - Hadoop: MapReduce, HDFS, RPC
      - Monitor: record a trace using AspectJ
      - Profile: count up the frequency of instructions
    • Monitor
      • observes the system behavior at runtime
      • passively logs the executed instructions, producing a “Trace”
        ‣ uses AspectJ
          - AspectJ is an implementation of aspect-oriented programming for Java
        ‣ no modification of the applications is needed
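
The idea on the Monitor slide, recording each executed method without editing the application's own source, can be illustrated with a loose analogy. The sketch below is Python, not AspectJ, and it is not the authors' implementation: it wraps the public methods of an existing class at runtime so that every call appends a trace record. The names `TRACE`, `attach_monitor`, and the stand-in `NameNode` class are all hypothetical.

```python
import functools
import time

# Hypothetical in-memory trace; a real monitor would write to a log file.
TRACE = []


def _wrap(func, qualified_name, process):
    """Return a wrapper that records the call, then runs the original method."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        TRACE.append((time.time(), process, qualified_name))
        return func(*args, **kwargs)
    return wrapper


def attach_monitor(cls, process):
    """Wrap every public plain method of cls; the class body stays unchanged."""
    for name, attr in list(vars(cls).items()):
        if callable(attr) and not name.startswith("_"):
            setattr(cls, name, _wrap(attr, f"{cls.__name__}.{name}", process))
    return cls


class NameNode:
    """Stand-in application class (an assumption, not Hadoop's real NameNode)."""

    def getFileInfo(self, path):
        return {"path": path}

    def sendHeartbeat(self):
        return "ok"


# The monitor is attached from outside the application code.
attach_monitor(NameNode, "namenodetrace")

node = NameNode()
node.getFileInfo("/tmp/a")
node.sendHeartbeat()
print([(process, method) for _, process, method in TRACE])
```

The application class carries no logging code of its own; the recording is added from outside, which is the property the slide attributes to AspectJ.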
    • Architecture of Hadoop & Monitor
      [Diagram: Hadoop master and slaves with Monitors]
      - Master: NameNode, JobTracker, RPC
      - Slaves: DataNode (data blocks), TaskTracker (Map and Reduce tasks), RPC
      - Master’s Trace: ‣ NameNode Trace ‣ JobTracker Trace ‣ RPC Trace
      - Slaves’ Trace: ‣ DataNode Trace ‣ TaskTracker Trace ‣ RPC Trace
    • Method of Profiling
      • based on the frequency of instructions
      • counts the instructions recorded in the “Trace”
      • counts at each granularity
        ➡ each node
        ➡ each process
        ➡ each method
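
Counting at each granularity can be sketched as three frequency tables built from the same trace. This is a minimal illustration, assuming each trace record is a `(node, process, method)` tuple; the record layout and the sample records are hypothetical, although the method and process names are taken from the profiling output shown later in the talk.

```python
from collections import Counter

# Hypothetical trace records: (node, process, method).
trace = [
    ("sirius", "namenodetrace", "hdfs.server.namenode.NameNode.getFileInfo"),
    ("sirius", "namenodetrace", "hdfs.server.namenode.NameNode.getFileInfo"),
    ("sirius", "rpctrace", "ipc.Client.call"),
    ("192.168.1.11", "rpctrace", "ipc.Client.call"),
]

# Coarsest granularity: one count per node.
per_node = Counter(node for node, _, _ in trace)

# Middle granularity: one count per process within each node.
per_process = Counter((node, process) for node, process, _ in trace)

# Finest granularity: one count per method within each process and node.
per_method = Counter(trace)

print(per_node)
print(per_process)
print(per_method)
```

Each coarser table is the sum of the finer one beneath it, so a node's count equals the sum of its process counts.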
    • Outline of Talk
      Introduction
      - Distributed system’s difficulty
      Proposal
      - Monitor
      - Profile Method
      Experimental Results & Conclusion
    • 3. EXPERIMENT
      Benchmark the impact of the Monitor, run the profiling, and visualize the profiling results
    • Benchmark - the impact of Monitor
      Throughput [MB/sec] = Data size / Elapsed time

      Data size [GB]  Monitor  Elapsed time            Throughput [MB/sec]  Trace size [MB]  Throughput ratio (on/off)
      1               on       2m 25s (145 sec)        6.9                  2.4              84.1%
      1               off      2m 2s (122 sec)         8.2                  0
      10              on       8m 45s (525 sec)        19.0                 3.6              88.3%
      10              off      7m 45s (465 sec)        21.5                 0
      100             on       1h 21m 54s (4,914 sec)  20.4                 31.6             96.2%
      100             off      1h 18m 37s (4,717 sec)  21.2                 0

      Workload: “terasort”, a sample sorting program using MapReduce
      Trace size increases by 6.43 KB/sec
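
The throughput figures on the benchmark slide can be re-derived from the data sizes and elapsed times. The sketch below assumes decimal units (1 GB = 1000 MB, 1 MB = 1000 KB); that convention is an assumption, chosen because it reproduces the slide's rounded values. The helper name `throughput_mb_per_sec` is hypothetical.

```python
# (data size [GB], monitor enabled?, elapsed time [sec]) from the benchmark slide.
runs = [
    (1, True, 145),
    (1, False, 122),
    (10, True, 525),
    (10, False, 465),
    (100, True, 4914),
    (100, False, 4717),
]


def throughput_mb_per_sec(data_gb, elapsed_sec):
    """Throughput = data size / elapsed time, assuming 1 GB = 1000 MB."""
    return data_gb * 1000 / elapsed_sec


for data_gb, monitor_on, elapsed_sec in runs:
    label = "on" if monitor_on else "off"
    mbps = throughput_mb_per_sec(data_gb, elapsed_sec)
    print(f"{data_gb:>3} GB, monitor {label:<3}: {mbps:.1f} MB/sec")

# Trace growth for the 100 GB run: 31.6 MB written over 4,914 sec,
# assuming 1 MB = 1000 KB.
trace_growth_kb_per_sec = 31.6 * 1000 / 4914
print(f"trace growth: {trace_growth_kb_per_sec:.2f} KB/sec")
```

Under these assumptions the reported 6.43 KB/sec growth rate matches the 100 GB run specifically; the two smaller runs produce different rates.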
    • A Part of Profiling
      The statistics of the last 10 seconds, about the master:

        Tue Nov 13 12:30:08 JST 2012
        from 1352777408766 until 10000 after
        HOSTNAME ::> DAEMON & PROCESS = { METHODS }
        --------------------------
        sirius:177 ::>> [namenodetrace : 23, jobtrackertrace : 41, datanodetrace : 0, tasktrackertrace : 0, rpctrace : 113] = {
          hdfs.server.namenode.CorruptReplicasMap.numCorruptReplicas=5
          hdfs.server.namenode.FSNamesystem.getBlockLocations=3
          hdfs.server.namenode.FSNamesystem.getDatanode=1
          hdfs.server.namenode.NameNode.getBlockLocations=4
          hdfs.server.namenode.NameNode.getFileInfo=2
          hdfs.server.namenode.NameNode.sendHeartbeat=2
          hdfs.server.namenode.NameNode.verifyVersion=3
          hdfs.server.namenode.UnderReplicatedBlocks.BlockIterator.hasNext=2
          hdfs.server.namenode.UnderReplicatedBlocks.BlockIterator.next=1
          ipc.Client.Connection.PingInputStream.read=4
          ipc.Client.Connection.sendParam=2
          ipc.Client.call=1
          ipc.ConnectionHeader.readFields=4
    • Node Level Profiling
      Node-level profiling aggregates the instruction frequencies within each node per unit time.
      [Chart: number of occurrences (0 to 800) over time (s), one series per node: 192.168.1.10, 192.168.1.11, 192.168.1.12, 192.168.1.13, 192.168.1.14, 192.168.1.15]
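
Aggregating per node and per unit time amounts to bucketing trace timestamps into fixed windows. The sketch below is one possible way to do it, assuming records of the form `(timestamp_ms, node)` and a 10-second window, as in the “last 10 seconds” statistics shown earlier. The function name and the sample records are hypothetical; the start timestamp is the one printed in the profiling output.

```python
from collections import Counter

WINDOW_MS = 10_000  # 10-second window


def node_level_profile(records, start_ms, window_ms=WINDOW_MS):
    """Count trace records per (window index, node), starting at start_ms."""
    counts = Counter()
    for timestamp_ms, node in records:
        if timestamp_ms < start_ms:
            continue  # ignore records from before the profiling period
        window = (timestamp_ms - start_ms) // window_ms
        counts[(window, node)] += 1
    return counts


start = 1352777408766
# Hypothetical records: (timestamp in ms, node address).
records = [
    (start + 100, "192.168.1.10"),
    (start + 9_999, "192.168.1.10"),
    (start + 10_000, "192.168.1.10"),
    (start + 12_500, "192.168.1.11"),
    (start - 1, "192.168.1.11"),
]

profile = node_level_profile(records, start)
for (window, node), count in sorted(profile.items()):
    print(f"window {window}: {node} -> {count}")
```

Plotting the count for each node against the window index gives one time series per node, which is the shape of the node-level chart.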
    • Process Level Profiling about Master
      Process-level profiling aggregates the instruction frequencies of each process within each node per unit time.
      [Chart: Master, number of occurrences (0 to 400) over time (s), one series per process: rpc, jobtracker, namenode]
    • Process Level Profiling about Slaves
      [Charts: number of occurrences over time (s) for rpctrace, tasktrackertrace, and datanodetrace; one panel per slave, 192.168.1.11 to 192.168.1.15, with the Map phase and Reduce phase marked]
      - There are free resources, so speculative execution should be used.
      - Imbalance of RPC
    • Conclusion
      Summary
      • Proposal
        - a lightweight method-level monitor using AspectJ
        - a profiling method based on the frequency of instructions
      • Provides effective information for development
      • Helps developers to understand system behaviors and specifications
      Future work
      • Create an algorithm that uses the profiling results to determine the degree of deviation and so indicate the possibility of failure.
    • Thank you for your kind attention