Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Hadoop fault tolerance


Published on

Published in: Technology, Education

Hadoop fault tolerance

  1. 1. A survey on Hadoop: Fault Tolerance and Optimization of Fault Tolerance Model Group-11 Project Guide : Mr. R Patgiri Members: Pallav(10-1-5-023) Prabhakar Barua(10-1-5-017) Prabodh Hend(10-1-5-053) Prem Chandra(09-1-5-062) Jugal Assudani(10-1-5-068)
  2. 2. What is Apache Hadoop? • Large scale, open source software framework ▫ Yahoo! has been the largest contributor to date • Dedicated to scalable, distributed, data-intensive computing • Handles thousands of nodes and petabytes of data • Supports applications under a free license • 2 Hadoop subprojects: ▫ HDFS: Hadoop Distributed File System with high throughput access to application data ▫ MapReduce: A software framework for distributed processing of large data sets on computer clusters
  3. 3. Hadoop MapReduce • MapReduce is a programming model and software framework first developed by Google (Google’s MapReduce paper submitted in 2004) • Intended to facilitate and simplify the processing of vast amounts of data in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner ▫ Petabytes of data ▫ Thousands of nodes • Computational processing occurs on both: ▫ Unstructured data : filesystem ▫ Structured data : database
  4. 4. Hadoop Distributed File System (HDFS) • Inspired by Google File System • Scalable, distributed, portable filesystem written in Java for Hadoop framework ▫ Primary distributed storage used by Hadoop applications • HFDS can be part of a Hadoop cluster or can be a stand-alone general purpose distributed file system • An HFDS cluster primarily consists of ▫ NameNode that manages file system metadata ▫ DataNode that stores actual data • Stores very large files in blocks across machines in a large cluster ▫ Reliability and fault tolerance ensured by replicating data across multiple hosts • Has data awareness between nodes • Designed to be deployed on low-cost hardware
  5. 5. Typical Hadoop cluster integrates MapReduce and HFDS • Master/slave architecture • Master node contains ▫ Job tracker node (MapReduce layer) ▫ Task tracker node (MapReduce layer) ▫ Name node (HFDS layer) ▫ Data node (HFDS layer) • Multiple slave nodes contain ▫ Task tracker node (MapReduce layer) ▫ Data node (HFDS layer) • MapReduce layer has job and task tracker nodes • HFDS layer has name and data nodes
  6. 6. Hadoop simple cluster graphic
  7. 7. MapReduce framework • Per cluster node: ▫ Single JobTracker per master 1. Responsible for scheduling the jobs component tasks on the slaves 2. Monitors slave progress 3. Re-executing failed tasks ▫ Single TaskTracker per slave 1. Execute the tasks as directed by the master
  8. 8. MapReduce core functionality • Code usually written in Java- though it can be written in other languages with the Hadoop Streaming API • Two fundamental pieces: ▫ Map step 1. Master node takes large problem input and slices it into smaller sub problems; distributes these to worker nodes. 2.Worker node may do this again; leads to a multi-level tree structure 3.Worker processes smaller problem and hands back to master ▫ Reduce step 1. Master node takes the answers to the sub problems and combines them in a predefined way to get the output/answer to original problem
  9. 9. MapReduce core functionality (II) • Data flow beyond the two key pieces (map and reduce): ▫ Input reader – divides input into appropriate size splits which get assigned to a Map function ▫ Map function – maps file data to smaller, intermediate <key, value> pairs ▫ Compare function – input for Reduce is pulled from the Map intermediate output and sorted according to ths compare function ▫ Reduce function – takes intermediate values and reduces to a smaller solution handed back to the framework ▫ Output writer – writes file output
  10. 10. MapReduce core functionality (III) • A MapReduce Job controls the execution ▫ Splits the input dataset into independent chunks ▫ Processed by the map tasks in parallel • The framework sorts the outputs of the maps • A MapReduce Task is sent the output of the framework to reduce and combine • Both the input and output of the job are stored in a filesystem • Framework handles scheduling ▫ Monitors and re-executes failed tasks
  11. 11. MapReduce input and output • MapReduce operates exclusively on <key, value> pairs • Job Input: <key, value> pairs • Job Output: <key, value> pairs
  12. 12. Input and Output (II)
  13. 13. example: WordCount Two input files: file1: “hello world hello moon” file2: “goodbye world goodnight moon” Three operations: map combine reduce
  14. 14. Output per step MAP First map: < hello, 1 > < world, 1 > < hello, 1 > < moon, 1 > MAP Second map: < goodbye, 1 > < world, 1 > < goodnight, 1 > < moon, 1 >
  15. 15. Output per step • • • • • COMBINE First map: < moon, 1 > < world, 1 > < hello, 2 > • • • • • Second map: < goodbye, 1 > < world, 1 > < goodnight, 1> < moon, 1 >
  16. 16. Map class (II) Two maps are generated (1 per file) Second map emits: First map emits: < hello, 1 > < world, 1 > < hello, 1 > < moon, 1 > < goodbye, 1 > < world, 1 > < goodnight, 1 > < moon, 1 >
  17. 17. Hadoop MapReduce Word Count Source public static class MapClass extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> { private final static IntWritable one = new IntWritable(1); private Text word = new Text(); public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException { String line = value.toString(); StringTokenizer itr = new StringTokenizer(line); while (itr.hasMoreTokens()) { word.set(itr.nextToken()); output.collect(word, one); } } }
  18. 18. /** * A reducer class that just emits the sum of the input values. */ public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> { public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException { int sum = 0; while (values.hasNext()) { sum +=; } output.collect(key, new IntWritable(sum)); } }
  19. 19. Hadoop MapReduce Word Count Driver public void run(String inputPath, String outputPath) throws Exception { JobConf conf = new JobConf(WordCount.class); conf.setJobName("wordcount"); // the keys are words (strings) conf.setOutputKeyClass(Text.class); // the values are counts (ints) conf.setOutputValueClass(IntWritable.class); conf.setMapperClass(MapClass.class); conf.setReducerClass(Reduce.class); FileInputFormat.addInputPath(conf, new Path(inputPath)); FileOutputFormat.setOutputPath(conf, new Path(outputPath)); JobClient.runJob(conf); }
  21. 21. 1.Addition of one more “Backup” state in Hadoop Pipeline • Motive In a Hadoop pipeline when system failure occurs then then the whole process of MapReduce is computed again, although it is a straight forward and the most plausible remedy, However when the system turns faulty after the mapping process the whole process of MapReduce is done again. If the Intermediate key value pairs can be backedup then even after the faults in the system the intermediate key value pair data can be fed to another cluster of reducers.
  22. 22. 1.Addition of one more “Backup” state in Hadoop Pipeline
  23. 23. 1.Addition of one more “Backup” state in Hadoop Pipeline
  24. 24. 1.Addition of one more “Backup” state in Hadoop Pipeline • Once this backup system is installed in the pipeline then the adaptivity of hadoop in terms of fault tolerance will increase. As the intermediate data is preserved for the reducer, even if the earlier cluster is non functional, the preserved data could be fed into a new reducer. Although the scheduling decisions to be taken by the master remains unchanged i.e. O(M+R), where M and R are the mappers and Reducers, but time required to allocate a new cluster for computing MapReduce again will be saved. Once the Reducer completes the process the indermediate data is removed from the backup device.
  25. 25. 1.Addition of one more “Backup” state in Hadoop Pipeline Advantages & Disadvantages Advantages Unnecessary computation of Mapping is subsided. BackUp is created as a new state in the pipeline. Disadvantage Cost overhead There will be some delay in the reducer function as backup also will be accessing the data.
  26. 26. 2. Protocol for Supernode • Motive Hadoop usually does not have any communication between its slave nodes in which control signals regarding the status is shared. This proposal tries to estalblish a differnt type of communication between the slave nodes so that when any node turns faulty, the neighbouring nodes try to do its job. In this new infrastructure configuration, neighboring nodes will behave as a supernode, and each node will know the other nodes in the supernode and the tasks they have assigned. Thus, in case one of the nodes fails, another node in the supernode can take the role of the failed one, without needing the JobTracker to know about it or take any action.
  27. 27. 2. Protocol for Supernode
  28. 28. 2. Protocol for Supernode
  29. 29. 2. Protocol for Supernode
  30. 30. 2. Protocol for Supernode
  31. 31. 2. Protocol for Supernode • Detection of Fault Every neighbouring nodes (Node 1, Node 2, Node 3, Node 4, Node 5 ) will ping Node 1 periodically after time T(T<heartbeat) and keep track of it. If there is no response from Node 1 for a certain time interval it will be assumed as a non functional node. • Info Node 1 sends to its neighbour during normal execution 1. 2. 3. Task Information Location of Data in Shared Memory Progress of Task(Checkpoint) • Failure of Node Nodes 1,2,3,4,5 ping each other periodically, These nodes should get a response from other nodes, however if there comes no response from other nodes for a period of three handshake time, it will be considered as a faulty node.
  32. 32. 2. Protocol for Supernode • Re assingment of tasks to Node 2 Any of the neighbour 2,3,4,5 which has completed its job or it is about to complete i.e. it has free resources, will assign itself this task and its task tracker will schedule it. • Revival of Node 1 If Node 1 starts working again and tries to gain access to the shared memory where Node 2 is already performing the task allocated to Node 1,the Control Unit of the shared memory will prevent the Node 1 from accessing the shared memory location.
  33. 33. 2. Protocol for Supernode • Control Unit Control unit is present in the shared memory. Its job is to handle all theshared memory in the cluster. It also prevents simultaneous access of more than 1 node to a particular memory segment. • Client App request Before this structure was theorised, ie. the current system in use, the client app requests the main node for the address of the reducers where it may find the required answers, now will request the CU of the shared memory associated with the task related to the client app via name node, the address of the shared memory which will be kept track of by the main node.
  34. 34. 2. Protocol for Supernode • Advantages 1. More fault tolerant When any node becomes non-functional, then the node present nearby (ie. Supernode), which is near completion or has already completed its task reassigns itself to the task of that faulty node, The description of which is present in the shared memory. Therefore a faulty node does not have to wait for the Master node to notice about its non-functionality and hence reduce execution time in case any of the node gets faulty. 2. Shared Memory By use of shared memory, the memory of a particular node does not get blocked in case it gets faulty, it is available to other nodes, with a controlled access to it. 3. Control Unit Control Unit prevents simultaneous access of the same memory block, thereby providing data integrity. It also provides the client application about the data it requires.
  35. 35. 2. Protocol for Supernode • Disadvantages • Cost overhead Extra hardware is needed to implement this structure. • Bandwidth consumption There will be some bandwidth consumption during the interaction between the nodes.
  36. 36. System Of Slaves Motive In Hadoop every node is a single commodity hardware. However a vision of system of nodes has never been implemented. If a node turns non functional, its task is reassigned by the master to another node. To decrease the time in assigning this task and all the overhead caused during this process is reduced by making a system of multiple nodes as a unit.
  37. 37. System Of Slaves
  38. 38. System Of Slaves Theorization Unlike conventional Hadoop Mapreduce each slave is now considered as a system of slaves which contain multiple nodes. Athough similar to the process of task assignment in Hadoop here also master distributes the task among several system of slaves.Name node informs each system about the location of data in the shared memory.System seeks for the data and after it has got the access permission N1 and N2 acquires the data and they start computation. If N1 gets nonfunctional N2 resumes N1’s functioning by completing its own task and after moving forwad to N1’s task. N2 gets the information of N1 just by going through the checkpoints and output data. If the whole system fails supernode algorithm will come into play.
  39. 39. System Of Slaves
  40. 40. System Of Slaves Division of data:If chunk size is 64 MB, N1 and N2 each will get 32MB of data. Although N1 and N2 cannot access the data simultaneously. Firstly when the map-reduce job will be assigned to the system, N1 will acquire the data. Here N1 is given higher priority. After N1 takes its data, N2 will proceed. However if N1 turns faulty, after some time, N2 will start processing its data, and while submitting the checkpoint it will notice the inavailablity of N1.
  41. 41. THANK YOU