IJSRD - International Journal for Scientific Research & Development| Vol. 3, Issue 10, 2015 | ISSN (online): 2321-0613
All rights reserved by www.ijsrd.com 999
Fault Tolerance in Big Data Processing using Heartbeat Messages and
Data Replication
T. Cowsalya1, N. Gomathi2, R. Arunkumar3
1,2,3Assistant Professor
1,2,3Department of Computer Science and Engineering
1,2,3SVS College of Engineering, Coimbatore, Tamil Nadu, India, Pincode-642109
Abstract: Big data is a popular term for the exponential
growth and availability of data, both structured and
unstructured. The rapidly growing demand for big data
processing places a heavy burden on computation,
communication and storage in geographically distributed
data centers. It is therefore necessary to minimize the cost
of big data processing, which includes the cost of fault
tolerance. Big data processing involves two types of faults:
node failure and data loss. Both faults can be detected and
recovered from with the help of heartbeat messages, which
act as acknowledgement messages between two servers.
This paper studies node failure and recovery, data
replication and heartbeat messages.
Key words: Big data, Fault Tolerance, Heartbeat Messages,
Node Recovery, Data Replication
I. INTRODUCTION
Big data is a term used to describe a volume of structured
and unstructured data so large that it is difficult to process
using traditional database architectures. Because of this
explosive growth, the demand for big data processing places
a heavy burden on computation, communication and storage
in geographically distributed data centers. An incoming
large data set is broken up into multiple chunks, and the
individual chunks are placed in different data centers with
the help of the Volley system. The Volley system [2] makes
use of logs to submit jobs to the data centers. Cloud users
rely on the Volley system for automatic data placement.
A. Geo-Distributed Data Center:
Data centers distributed across multiple geographical regions
are known as geographically distributed data centers [1]. For
example, Google has 13 data centers across 8 countries and 4
continents.
Fig. 1: Data Center Topology
II. FAULT TOLERANCE
The challenges of big data include analysis, capture, search,
sharing, storage, transfer, visualization and privacy
violations. Alongside these, fault tolerance is one of the
main challenges in big data. Two kinds of fault can occur
while processing big data. First, a data chunk may be lost
while it is being transferred to multiple data centers.
Second, a server may fail or slow down.
A. Heartbeat Messages:
Heartbeat messages are the solution to both of the above
problems. A heartbeat message is a message sent from an
originator to a destination so that the destination can
identify if and when the originator fails or is no longer
available. Heartbeat messages are sent continuously, on a
periodic recurring basis, from the originator's startup until
its shutdown. When the destination detects a lack of
heartbeat messages during an anticipated arrival period, it
may determine that the originator has failed, shut down, or
is otherwise no longer available.
This technique relates to fault recovery in
multiprocessor systems, where each processor constantly
monitors heartbeat messages from the other processors and
is capable of taking autonomous recovery action in response
to a failure to receive heartbeat messages, advantageously
without the overall guidance of an executive processor.
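The timeout-based detection described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the mechanism of any particular system: the names HeartbeatMonitor, ManualClock, record_heartbeat and failed_senders, as well as the timeout value, are hypothetical.

```python
import time


class ManualClock:
    """Hypothetical clock that only moves when told to (keeps the demo deterministic)."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class HeartbeatMonitor:
    """Remembers when each sender was last heard from (illustrative only)."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.last_seen = {}  # sender id -> time of the latest heartbeat

    def record_heartbeat(self, sender_id):
        self.last_seen[sender_id] = self.clock()

    def failed_senders(self):
        # A sender is presumed failed once no heartbeat has arrived
        # within the anticipated arrival period.
        now = self.clock()
        return sorted(
            sender for sender, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


clock = ManualClock()
monitor = HeartbeatMonitor(timeout_seconds=3.0, clock=clock)
monitor.record_heartbeat("server-a")
monitor.record_heartbeat("server-b")
clock.advance(2.0)
monitor.record_heartbeat("server-a")  # server-b stays silent
clock.advance(2.0)
suspected = monitor.failed_senders()  # ["server-b"]
```

Each receiver can run such a check on its own, which matches the idea of autonomous recovery without an executive processor.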
B. Data Loss Prevention:
When jobs are transmitted to multiple data centers, there is
a chance of data loss. Data loss may occur due to network
link failure. Links in the network may differ in transmission
rate according to their individual characteristics, for
example the distances and optical fiber facilities between
data centers. Due to capacity constraints, not all tasks can
be placed on the same server on which their corresponding
data reside. It is therefore unavoidable that certain data
must be downloaded from a remote server. In this case, the
routing plan affects the transmission cost.
C. Hadoop Architecture:
Hadoop is a software framework for processing big data in
parallel. It consists of two important components: the
Hadoop Distributed File System and MapReduce.
1) Hadoop Distributed File System:
The Hadoop Distributed File System (HDFS) [3] is a
distributed, highly fault-tolerant file system designed to run
on low-cost commodity hardware. HDFS provides high-throughput
access to application data and is suitable for applications
with large data sets. HDFS consists of two types of node,
called the name node and the data node. The name node manages
file system namespace operations such as opening, closing, and
renaming files and directories. The name node also maps data
blocks to data nodes, which handle read and write requests
from HDFS clients. Data nodes also create, delete, and
replicate data blocks according to instructions from the
governing name node. The name node and the data nodes exchange
messages to prove their identity.
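The two responsibilities of the name node mentioned above, managing the namespace and mapping blocks to data nodes, can be illustrated with a toy model. This is a sketch under assumptions and not HDFS code: the class ToyNameNode, its methods, and the round-robin block placement are hypothetical simplifications.

```python
import itertools
import math


class ToyNameNode:
    """Toy namespace and block map (hypothetical; real HDFS is far richer)."""

    def __init__(self, data_nodes, block_size):
        self.block_size = block_size
        self.namespace = {}   # file path -> list of block ids
        self.block_map = {}   # block id -> data node holding the block
        self._next_block = itertools.count()
        # Assumption: blocks are spread round-robin over the data nodes.
        self._next_node = itertools.cycle(list(data_nodes))

    def create(self, path, size_bytes):
        if path in self.namespace:
            raise FileExistsError(path)
        blocks = []
        for _ in range(math.ceil(size_bytes / self.block_size)):
            block_id = next(self._next_block)
            self.block_map[block_id] = next(self._next_node)
            blocks.append(block_id)
        self.namespace[path] = blocks

    def rename(self, old_path, new_path):
        # A rename only touches the namespace; no block moves.
        self.namespace[new_path] = self.namespace.pop(old_path)

    def locate(self, path):
        # Tells a client which data node to contact for each block.
        return [(b, self.block_map[b]) for b in self.namespace[path]]


name_node = ToyNameNode(data_nodes=["dn1", "dn2"], block_size=64)
name_node.create("/logs/a", size_bytes=150)   # needs three 64-byte blocks
name_node.rename("/logs/a", "/logs/b")
locations = name_node.locate("/logs/b")
```

The point of the sketch is that clients ask the name node only for metadata; the data itself is read from or written to the data nodes.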
Fault Tolerance in Big Data Processing using Heartbeat Messages and Data Replication
(IJSRD/Vol. 3/Issue 10/2015/227)
2) MapReduce:
MapReduce [12] is a programming model, with an associated
implementation, for processing and generating large data sets
with a parallel, distributed algorithm on a cluster. MapReduce
also consists of two types of node, called the job tracker and
the task tracker. The job tracker talks to the name node to
determine the location of the data. The task tracker nodes
execute the assigned tasks on the data nodes.
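The programming model itself can be illustrated with the classic word count, run here in a single Python process. This is a sketch of the map, shuffle and reduce phases only; the function names are hypothetical, and a real MapReduce implementation distributes these phases across the task trackers.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    intermediate = []
    for chunk in chunks:  # each chunk could be processed by a different map task
        intermediate.extend(map_phase(chunk))
    grouped = shuffle(intermediate)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data", "Big cluster"])
# counts == {"big": 2, "data": 1, "cluster": 1}
```

Because each chunk is mapped independently, a failed map task can be rerun on its chunk without affecting the others.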
III. FAILURES
One of the major benefits of using Hadoop is its ability to
handle the failures described below and still allow a job to
complete.
A. Task Failure:
When the user code in a map or reduce task throws a
runtime exception, the child task fails. Another failure
mode is the sudden exit of the child JVM. In this case the
task tracker notices that the process has exited and marks
the attempt as failed. A task may also be killed, which is
different from failing.
B. Task Tracker Failure:
If a task tracker fails by crashing or running very slowly, it
will stop sending heartbeat messages to the job tracker. The
job tracker notices that the task tracker has stopped sending
heartbeats and removes it from its pool of task trackers on
which to schedule tasks.
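The pool maintenance described above can be sketched as follows. This is an illustration under assumptions rather than Hadoop code: the class ToyJobTracker, its methods, and the timeout value are hypothetical, and returning the tasks of an expired tracker for rescheduling is an added assumption that the text above does not spell out.

```python
class ToyJobTracker:
    """Keeps a pool of live task trackers based on their heartbeats (hypothetical)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heartbeat = {}  # task tracker -> time of latest heartbeat
        self.assignments = {}     # task id -> task tracker running it

    def heartbeat(self, tracker, now):
        # A heartbeat adds the tracker to the pool or refreshes its entry.
        self.last_heartbeat[tracker] = now

    def assign(self, task, tracker):
        if tracker not in self.last_heartbeat:
            raise ValueError(f"{tracker} is not in the pool")
        self.assignments[task] = tracker

    def expire_trackers(self, now):
        """Drop silent trackers; return the tasks that need a new tracker."""
        dead = {
            tracker for tracker, seen in self.last_heartbeat.items()
            if now - seen > self.timeout
        }
        for tracker in dead:
            del self.last_heartbeat[tracker]
        orphaned = sorted(
            task for task, tracker in self.assignments.items() if tracker in dead
        )
        for task in orphaned:
            del self.assignments[task]
        return orphaned


job_tracker = ToyJobTracker(timeout=10)
job_tracker.heartbeat("tt1", now=0)
job_tracker.heartbeat("tt2", now=0)
job_tracker.assign("map-0", "tt1")
job_tracker.assign("map-1", "tt2")
job_tracker.heartbeat("tt1", now=8)               # tt2 has gone silent
to_reschedule = job_tracker.expire_trackers(now=12)
```

After the call, only tt1 remains in the pool and map-1 is the one task waiting to be placed on another tracker.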
C. Job Tracker Failure:
Failure of the job tracker is the most serious failure mode,
because the job tracker is a single point of failure. This
failure mode has a low chance of occurring, since the chance
of a particular machine failing is low.
D. Name Node Failure:
The name node is a single point of failure, so if it fails
the cluster becomes unstable. Even the secondary name node
does not help in this case, since it is only used for
checkpoints, not as a backup for the name node. If the name
node fails, someone such as an administrator has to restart
it.
E. Data Node Failure:
A compute node can fail for a variety of reasons, for
example broken node hardware, a broken network, software
bugs, or inadequate hardware resources.
Fig. 2: Name node Schema
When a compute node fails, all jobs running on that node
fail. However, jobs running on other nodes that were not
communicating with jobs on the failed node continue to run
without a problem.
IV. SOLUTION
A. Data Replication:
An application can specify the number of replicas of a file at
the time it is created, and this number can be changed at any
time after that [6]. The name node makes all decisions
concerning block replication. HDFS uses an intelligent
replica placement model for reliability and performance.
Optimized replica placement distinguishes HDFS from
most other distributed file systems, and is facilitated by a
rack-aware replica placement policy that uses network
bandwidth efficiently.
Fig. 3: Data Replication in DFS
The name node makes all decisions regarding the replication
of blocks. It periodically receives a heartbeat and a block
report from each of the data nodes in the cluster. Receipt of
a heartbeat implies that the data node is functioning
properly. A block report contains a list of all blocks on a
data node.
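How heartbeats and block reports together drive replication decisions can be sketched as below. This is a simplified illustration and not HDFS code: the function under_replicated, its parameters, and the sample node and block names are hypothetical. It counts, for each block, the replicas held by data nodes that are still sending heartbeats and reports how many replicas are missing.

```python
def under_replicated(block_reports, live_nodes, replication_factor):
    """Return {block id: number of missing replicas} for blocks below the target.

    block_reports: data node -> set of block ids from its latest block report
    live_nodes: data nodes whose heartbeats are still arriving
    """
    live_replicas = {}
    for node, blocks in block_reports.items():
        for block in blocks:
            live_replicas.setdefault(block, 0)
            if node in live_nodes:
                # Only replicas on nodes that are still alive count.
                live_replicas[block] += 1
    return {
        block: replication_factor - count
        for block, count in live_replicas.items()
        if count < replication_factor
    }


reports = {
    "dn1": {"b1", "b2"},
    "dn2": {"b1"},
    "dn3": {"b1", "b2"},
}
# dn3 has stopped sending heartbeats, so its replicas no longer count.
missing = under_replicated(reports, live_nodes={"dn1", "dn2"}, replication_factor=2)
```

Here block b1 still has two live replicas, while b2 has only one left and would need one new copy on another data node.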
B. Replica Selection:
To minimize global bandwidth consumption and read
latency, HDFS tries to satisfy a read request from the replica
that is closest to the reader. If there is a replica on the
same rack as the reader node, that replica is preferred to
satisfy the read request. If an HDFS cluster spans multiple
data centers, then a replica that is resident in the local
data center is preferred over any remote replica.
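The preference order described above (same rack first, then the local data center, then remote data centers) can be expressed as a small distance function. This is a sketch under assumptions: describing a location as a (data center, rack, node) tuple and the distance values 0 to 3 are illustrative choices, not the actual HDFS topology model.

```python
def distance(reader, replica):
    """Smaller is closer. Locations are (data_center, rack, node) tuples."""
    if reader == replica:
        return 0  # replica is on the reader's own node
    if reader[:2] == replica[:2]:
        return 1  # same rack in the same data center
    if reader[0] == replica[0]:
        return 2  # same data center, different rack
    return 3      # remote data center


def choose_replica(reader, replicas):
    """Pick the replica closest to the reader."""
    return min(replicas, key=lambda replica: distance(reader, replica))


reader = ("dc1", "rack1", "node1")
replicas = [
    ("dc2", "rack1", "node9"),  # remote data center
    ("dc1", "rack2", "node4"),  # local data center, other rack
    ("dc1", "rack1", "node2"),  # same rack as the reader
]
best = choose_replica(reader, replicas)          # ("dc1", "rack1", "node2")
fallback = choose_replica(reader, replicas[:2])  # ("dc1", "rack2", "node4")
```

If the same-rack replica becomes unavailable, the same rule falls back to the replica in the local data center before any remote one.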
V. CONCLUSION
Failures in big data processing have been studied. Data
replication and heartbeat messages are used as fault
tolerance mechanisms. In future work, practical setups can
be built and the computation and communication costs
measured. The results can then be compared with the cost of
data processing without node failures.
REFERENCES
[1] “Data Center Locations,”
http://www.google.com/about/datacenters/inside/locations/index.html.
[2] S. Agarwal, J. Dunagan, N. Jain, S. Saroiu, A. Wolman,
and H. Bhogan, “Volley: Automated Data Placement
for Geo-Distributed Cloud Services,” in The 7th
USENIX Symposium on Networked Systems Design
and Implementation (NSDI), 2010, pp. 17–32.
[3] L. Rao, X. Liu, L. Xie, and W. Liu, “Minimizing
Electricity Cost: Optimization of Distributed Internet
Data Centers in a Multi-Electricity-Market
Environment,” in Proceedings of the 29th International
Conference on Computer Communications
(INFOCOM). IEEE, 2010, pp. 1–9.
[4] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and
B. Maggs, “Cutting the Electric Bill for Internet-scale
Systems,” in Proceedings of the ACM Special Interest
Group on Data Communication (SIGCOMM). ACM,
2009, pp. 123–134.
[5] R. Urgaonkar, B. Urgaonkar, M. J. Neely, and A.
Sivasubramaniam, “Optimal Power Cost Management
Using Stored Energy in Data Centers,” in Proceedings of
International Conference on Measurement and
Modeling of Computer Systems (SIGMETRICS). ACM,
2011, pp. 221–232.
[6] X. Fan, W.-D. Weber, and L. A. Barroso, “Power
Provisioning for A Warehouse-sized Computer,” in
Proceedings of the 34th Annual International
Symposium on Computer Architecture (ISCA). ACM,
2007, pp. 13–23.
[7] S. Govindan, A. Sivasubramaniam, and B. Urgaonkar,
“Benefits and Limitations of Tapping Into Stored
Energy for Datacenters,” in Proceedings of the 38th
Annual International Symposium on Computer
Architecture (ISCA). ACM, 2011, pp. 341–352.
[8] P. X. Gao, A. R. Curtis, B. Wong, and S. Keshav, “It’s
Not Easy Being Green,” in Proceedings of the ACM
Special Interest Group on Data Communication
(SIGCOMM). ACM, 2012, pp. 211–222.
[9] S. A. Yazd, S. Venkatesan, and N. Mittal, “Boosting
energy efficiency with mirrored data block replication
policy and energy scheduler,” SIGOPS Oper. Syst. Rev.,
vol. 47, no. 2, pp. 33–40, 2013.
[10] J. Cohen, B. Dolan, M. Dunlap, J. M. Hellerstein, and
C. Welton, “Mad skills: new analysis practices for big
data,” Proc. VLDB Endow., vol. 2, no. 2, pp. 1481–1492,
2009.
[11] R. Kaushik and K. Nahrstedt, “T*: A data-centric
cooling energy costs reduction approach for Big Data
analytics cloud,” in 2012 International Conference for
High Performance Computing, Networking, Storage
and Analysis (SC), 2012, pp. 1–11.
[12] J. Dean and S. Ghemawat, “MapReduce: Simplified
Data Processing on Large Clusters,” Google, Inc.

More Related Content

What's hot

Exploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed UpExploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed UpIJERD Editor
 
Data Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big DataData Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big Dataijccsa
 
An asynchronous replication model to improve data available into a heterogene...
An asynchronous replication model to improve data available into a heterogene...An asynchronous replication model to improve data available into a heterogene...
An asynchronous replication model to improve data available into a heterogene...Alexander Decker
 
Synchronization and replication through ocmdbs
Synchronization and replication through ocmdbsSynchronization and replication through ocmdbs
Synchronization and replication through ocmdbsIAEME Publication
 
Survey on cloud backup services of personal storage
Survey on cloud backup services of personal storageSurvey on cloud backup services of personal storage
Survey on cloud backup services of personal storageeSAT Journals
 
Drops division and replication of data in cloud for optimal performance and s...
Drops division and replication of data in cloud for optimal performance and s...Drops division and replication of data in cloud for optimal performance and s...
Drops division and replication of data in cloud for optimal performance and s...Pvrtechnologies Nellore
 
Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...IJSRD
 
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTLARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTijwscjournal
 
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical SystemsA DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systemsijseajournal
 
Cloak-Reduce Load Balancing Strategy for Mapreduce
Cloak-Reduce Load Balancing Strategy for MapreduceCloak-Reduce Load Balancing Strategy for Mapreduce
Cloak-Reduce Load Balancing Strategy for MapreduceAIRCC Publishing Corporation
 

What's hot (17)

Exploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed UpExploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed Up
 
Data Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big DataData Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big Data
 
An asynchronous replication model to improve data available into a heterogene...
An asynchronous replication model to improve data available into a heterogene...An asynchronous replication model to improve data available into a heterogene...
An asynchronous replication model to improve data available into a heterogene...
 
Dremel
DremelDremel
Dremel
 
Synchronization and replication through ocmdbs
Synchronization and replication through ocmdbsSynchronization and replication through ocmdbs
Synchronization and replication through ocmdbs
 
Survey on cloud backup services of personal storage
Survey on cloud backup services of personal storageSurvey on cloud backup services of personal storage
Survey on cloud backup services of personal storage
 
Drops division and replication of data in cloud for optimal performance and s...
Drops division and replication of data in cloud for optimal performance and s...Drops division and replication of data in cloud for optimal performance and s...
Drops division and replication of data in cloud for optimal performance and s...
 
Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...
 
Advance DBMS
Advance DBMSAdvance DBMS
Advance DBMS
 
DDBMS Paper with Solution
DDBMS Paper with SolutionDDBMS Paper with Solution
DDBMS Paper with Solution
 
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTLARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
 
Hm2413291336
Hm2413291336Hm2413291336
Hm2413291336
 
Distributed DBMS - Unit 1 - Introduction
Distributed DBMS - Unit 1 - IntroductionDistributed DBMS - Unit 1 - Introduction
Distributed DBMS - Unit 1 - Introduction
 
[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal
[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal
[IJET V2I5P18] Authors:Pooja Mangla, Dr. Sandip Kumar Goyal
 
4 026
4 0264 026
4 026
 
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical SystemsA DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
 
Cloak-Reduce Load Balancing Strategy for Mapreduce
Cloak-Reduce Load Balancing Strategy for MapreduceCloak-Reduce Load Balancing Strategy for Mapreduce
Cloak-Reduce Load Balancing Strategy for Mapreduce
 

Similar to IJSRD Journal Fault Tolerance in Big Data Using Heartbeat Messages

Database 2 ddbms,homogeneous & heterognus adv & disadvan
Database 2 ddbms,homogeneous & heterognus adv & disadvanDatabase 2 ddbms,homogeneous & heterognus adv & disadvan
Database 2 ddbms,homogeneous & heterognus adv & disadvanIftikhar Ahmad
 
BDA Mod2@AzDOCUMENTS.in.pdf
BDA Mod2@AzDOCUMENTS.in.pdfBDA Mod2@AzDOCUMENTS.in.pdf
BDA Mod2@AzDOCUMENTS.in.pdfKUMARRISHAV37
 
New Framework for Improving Bigdata Analaysis Using Mobile Agent
New Framework for Improving Bigdata Analaysis Using Mobile AgentNew Framework for Improving Bigdata Analaysis Using Mobile Agent
New Framework for Improving Bigdata Analaysis Using Mobile AgentMohammed Adam
 
Data Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big DataData Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big Dataneirew J
 
hadoop seminar training report
hadoop seminar  training reporthadoop seminar  training report
hadoop seminar training reportSarvesh Meena
 
Advanced Design and Optimization of Data Center Interconnection Networks.pptx
Advanced Design and Optimization of Data Center Interconnection Networks.pptxAdvanced Design and Optimization of Data Center Interconnection Networks.pptx
Advanced Design and Optimization of Data Center Interconnection Networks.pptxService Solutions Pvt. Ltd. (SSL)
 
Data-Intensive Technologies for Cloud Computing
Data-Intensive Technologies for CloudComputingData-Intensive Technologies for CloudComputing
Data-Intensive Technologies for Cloud Computinghuda2018
 
IRJET- HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...
IRJET-  	  HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...IRJET-  	  HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...
IRJET- HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...IRJET Journal
 
thilaganga journal 1
thilaganga journal 1thilaganga journal 1
thilaganga journal 1thilaganga
 
Cloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control MethodCloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control MethodIRJET Journal
 
A Novel Approach for Workload Optimization and Improving Security in Cloud Co...
A Novel Approach for Workload Optimization and Improving Security in Cloud Co...A Novel Approach for Workload Optimization and Improving Security in Cloud Co...
A Novel Approach for Workload Optimization and Improving Security in Cloud Co...IOSR Journals
 
Software architecture case study - why and why not sql server replication
Software architecture   case study - why and why not sql server replicationSoftware architecture   case study - why and why not sql server replication
Software architecture case study - why and why not sql server replicationShahzad
 
Unit-1 Introduction to Big Data.pptx
Unit-1 Introduction to Big Data.pptxUnit-1 Introduction to Big Data.pptx
Unit-1 Introduction to Big Data.pptxAnkitChauhan817826
 
A Comprehensive Study on Big Data Applications and Challenges
A Comprehensive Study on Big Data Applications and ChallengesA Comprehensive Study on Big Data Applications and Challenges
A Comprehensive Study on Big Data Applications and Challengesijcisjournal
 
Postponed Optimized Report Recovery under Lt Based Cloud Memory
Postponed Optimized Report Recovery under Lt Based Cloud MemoryPostponed Optimized Report Recovery under Lt Based Cloud Memory
Postponed Optimized Report Recovery under Lt Based Cloud MemoryIJARIIT
 
Seminar_Report_hadoop
Seminar_Report_hadoopSeminar_Report_hadoop
Seminar_Report_hadoopVarun Narang
 
Big Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – HadoopBig Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – HadoopIOSR Journals
 

Similar to IJSRD Journal Fault Tolerance in Big Data Using Heartbeat Messages (20)

Database 2 ddbms,homogeneous & heterognus adv & disadvan
Database 2 ddbms,homogeneous & heterognus adv & disadvanDatabase 2 ddbms,homogeneous & heterognus adv & disadvan
Database 2 ddbms,homogeneous & heterognus adv & disadvan
 
BDA Mod2@AzDOCUMENTS.in.pdf
BDA Mod2@AzDOCUMENTS.in.pdfBDA Mod2@AzDOCUMENTS.in.pdf
BDA Mod2@AzDOCUMENTS.in.pdf
 
New Framework for Improving Bigdata Analaysis Using Mobile Agent
New Framework for Improving Bigdata Analaysis Using Mobile AgentNew Framework for Improving Bigdata Analaysis Using Mobile Agent
New Framework for Improving Bigdata Analaysis Using Mobile Agent
 
Data Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big DataData Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big Data
 
hadoop seminar training report
hadoop seminar  training reporthadoop seminar  training report
hadoop seminar training report
 
Advanced Design and Optimization of Data Center Interconnection Networks.pptx
Advanced Design and Optimization of Data Center Interconnection Networks.pptxAdvanced Design and Optimization of Data Center Interconnection Networks.pptx
Advanced Design and Optimization of Data Center Interconnection Networks.pptx
 
Data-Intensive Technologies for Cloud Computing
Data-Intensive Technologies for CloudComputingData-Intensive Technologies for CloudComputing
Data-Intensive Technologies for Cloud Computing
 
IRJET- HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...
IRJET-  	  HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...IRJET-  	  HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...
IRJET- HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...
 
thilaganga journal 1
thilaganga journal 1thilaganga journal 1
thilaganga journal 1
 
H04502048051
H04502048051H04502048051
H04502048051
 
Cloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control MethodCloud Computing Ambiance using Secluded Access Control Method
Cloud Computing Ambiance using Secluded Access Control Method
 
D017212027
D017212027D017212027
D017212027
 
A Novel Approach for Workload Optimization and Improving Security in Cloud Co...
A Novel Approach for Workload Optimization and Improving Security in Cloud Co...A Novel Approach for Workload Optimization and Improving Security in Cloud Co...
A Novel Approach for Workload Optimization and Improving Security in Cloud Co...
 
Software architecture case study - why and why not sql server replication
Software architecture   case study - why and why not sql server replicationSoftware architecture   case study - why and why not sql server replication
Software architecture case study - why and why not sql server replication
 
Unit-1 Introduction to Big Data.pptx
Unit-1 Introduction to Big Data.pptxUnit-1 Introduction to Big Data.pptx
Unit-1 Introduction to Big Data.pptx
 
A Comprehensive Study on Big Data Applications and Challenges
A Comprehensive Study on Big Data Applications and ChallengesA Comprehensive Study on Big Data Applications and Challenges
A Comprehensive Study on Big Data Applications and Challenges
 
Postponed Optimized Report Recovery under Lt Based Cloud Memory
Postponed Optimized Report Recovery under Lt Based Cloud MemoryPostponed Optimized Report Recovery under Lt Based Cloud Memory
Postponed Optimized Report Recovery under Lt Based Cloud Memory
 
Seminar_Report_hadoop
Seminar_Report_hadoopSeminar_Report_hadoop
Seminar_Report_hadoop
 
Hadoop Cluster Analysis and Assessment
Hadoop Cluster Analysis and AssessmentHadoop Cluster Analysis and Assessment
Hadoop Cluster Analysis and Assessment
 
Big Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – HadoopBig Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – Hadoop
 

More from IJSRD

#IJSRD #Research Paper Publication
#IJSRD #Research Paper Publication#IJSRD #Research Paper Publication
#IJSRD #Research Paper PublicationIJSRD
 
Maintaining Data Confidentiality in Association Rule Mining in Distributed En...
Maintaining Data Confidentiality in Association Rule Mining in Distributed En...Maintaining Data Confidentiality in Association Rule Mining in Distributed En...
Maintaining Data Confidentiality in Association Rule Mining in Distributed En...IJSRD
 
Performance and Emission characteristics of a Single Cylinder Four Stroke Die...
Performance and Emission characteristics of a Single Cylinder Four Stroke Die...Performance and Emission characteristics of a Single Cylinder Four Stroke Die...
Performance and Emission characteristics of a Single Cylinder Four Stroke Die...IJSRD
 
Preclusion of High and Low Pressure In Boiler by Using LABVIEW
Preclusion of High and Low Pressure In Boiler by Using LABVIEWPreclusion of High and Low Pressure In Boiler by Using LABVIEW
Preclusion of High and Low Pressure In Boiler by Using LABVIEWIJSRD
 
Prevention and Detection of Man in the Middle Attack on AODV Protocol
Prevention and Detection of Man in the Middle Attack on AODV ProtocolPrevention and Detection of Man in the Middle Attack on AODV Protocol
Prevention and Detection of Man in the Middle Attack on AODV ProtocolIJSRD
 
Comparative Analysis of PAPR Reduction Techniques in OFDM Using Precoding Tec...
Comparative Analysis of PAPR Reduction Techniques in OFDM Using Precoding Tec...Comparative Analysis of PAPR Reduction Techniques in OFDM Using Precoding Tec...
Comparative Analysis of PAPR Reduction Techniques in OFDM Using Precoding Tec...IJSRD
 
Evaluation the Effect of Machining Parameters on MRR of Mild Steel
Evaluation the Effect of Machining Parameters on MRR of Mild SteelEvaluation the Effect of Machining Parameters on MRR of Mild Steel
Evaluation the Effect of Machining Parameters on MRR of Mild SteelIJSRD
 
Filter unwanted messages from walls and blocking nonlegitimate user in osn
Filter unwanted messages from walls and blocking nonlegitimate user in osnFilter unwanted messages from walls and blocking nonlegitimate user in osn
Filter unwanted messages from walls and blocking nonlegitimate user in osnIJSRD
 
Keystroke Dynamics Authentication with Project Management System
Keystroke Dynamics Authentication with Project Management SystemKeystroke Dynamics Authentication with Project Management System
Keystroke Dynamics Authentication with Project Management SystemIJSRD
 
Diagnosing lungs cancer Using Neural Networks
Diagnosing lungs cancer Using Neural NetworksDiagnosing lungs cancer Using Neural Networks
Diagnosing lungs cancer Using Neural NetworksIJSRD
 
A Survey on Sentiment Analysis and Opinion Mining
A Survey on Sentiment Analysis and Opinion MiningA Survey on Sentiment Analysis and Opinion Mining
A Survey on Sentiment Analysis and Opinion MiningIJSRD
 
A Defect Prediction Model for Software Product based on ANFIS
A Defect Prediction Model for Software Product based on ANFISA Defect Prediction Model for Software Product based on ANFIS
A Defect Prediction Model for Software Product based on ANFISIJSRD
 
Experimental Investigation of Granulated Blast Furnace Slag ond Quarry Dust a...
Experimental Investigation of Granulated Blast Furnace Slag ond Quarry Dust a...Experimental Investigation of Granulated Blast Furnace Slag ond Quarry Dust a...
Experimental Investigation of Granulated Blast Furnace Slag ond Quarry Dust a...IJSRD
 
Product Quality Analysis based on online Reviews
Product Quality Analysis based on online ReviewsProduct Quality Analysis based on online Reviews
Product Quality Analysis based on online ReviewsIJSRD
 
Solving Fuzzy Matrix Games Defuzzificated by Trapezoidal Parabolic Fuzzy Numbers
Solving Fuzzy Matrix Games Defuzzificated by Trapezoidal Parabolic Fuzzy NumbersSolving Fuzzy Matrix Games Defuzzificated by Trapezoidal Parabolic Fuzzy Numbers
Solving Fuzzy Matrix Games Defuzzificated by Trapezoidal Parabolic Fuzzy NumbersIJSRD
 
Study of Clustering of Data Base in Education Sector Using Data Mining
Study of Clustering of Data Base in Education Sector Using Data MiningStudy of Clustering of Data Base in Education Sector Using Data Mining
Study of Clustering of Data Base in Education Sector Using Data MiningIJSRD
 
Fault Tolerance in Big Data Processing Using Heartbeat Messages and Data Repl...
Fault Tolerance in Big Data Processing Using Heartbeat Messages and Data Repl...Fault Tolerance in Big Data Processing Using Heartbeat Messages and Data Repl...
Fault Tolerance in Big Data Processing Using Heartbeat Messages and Data Repl...IJSRD
 
Investigation of Effect of Process Parameters on Maximum Temperature during F...
Investigation of Effect of Process Parameters on Maximum Temperature during F...Investigation of Effect of Process Parameters on Maximum Temperature during F...
Investigation of Effect of Process Parameters on Maximum Temperature during F...IJSRD
 
Review Paper on Computer Aided Design & Analysis of Rotor Shaft of a Rotavator
Review Paper on Computer Aided Design & Analysis of Rotor Shaft of a RotavatorReview Paper on Computer Aided Design & Analysis of Rotor Shaft of a Rotavator
Review Paper on Computer Aided Design & Analysis of Rotor Shaft of a RotavatorIJSRD
 
A Survey on Data Mining Techniques for Crime Hotspots Prediction
A Survey on Data Mining Techniques for Crime Hotspots PredictionA Survey on Data Mining Techniques for Crime Hotspots Prediction
A Survey on Data Mining Techniques for Crime Hotspots PredictionIJSRD
 




IJSRD Journal Fault Tolerance in Big Data Using Heartbeat Messages

  • 1. IJSRD - International Journal for Scientific Research & Development | Vol. 3, Issue 10, 2015 | ISSN (online): 2321-0613
All rights reserved by www.ijsrd.com

Fault Tolerance in Big Data Processing using Heartbeat Messages and Data Replication
T. Cowsalya1, N. Gomathi2, R. Arunkumar3
1,2,3 Assistant Professor
1,2,3 Department of Computer Science and Engineering
1,2,3 SVS College of Engineering, Coimbatore, Tamil Nadu, India, Pincode-642109

Abstract: Big data is a popular term for the exponential growth and availability of data, both structured and unstructured. The rapidly growing demand for big data processing imposes a heavy burden on computation, communication and storage in geographically distributed data centers. Hence it is necessary to minimize the cost of big data processing, which includes the cost of fault tolerance. Big data processing involves two types of fault: node failure and data loss. Both faults can be recovered from using heartbeat messages, which act as acknowledgement messages between two servers. This paper studies node failure and recovery, data replication and heartbeat messages.

Key words: Big data, Fault Tolerance, Heartbeat Messages, Node Recovery, Data Replication

I. INTRODUCTION
Big data is a term used to describe a volume of structured and unstructured data so large that it is difficult to process with traditional database architecture. Because of this explosive growth, the rapidly increasing demand for big data processing imposes a heavy burden on computation, communication and storage in geographically distributed data centers. An incoming large data set is broken up into multiple chunks, and the chunks are placed in different data centers with the help of the Volley system. The Volley system [2] makes use of logs to submit jobs to the data centers. Cloud users use the Volley system for automatic data placement.

A. Geo-Distributed Data Center:
Data centers distributed across multiple geographical regions are known as geographically distributed data centers [1]. For example, Google has 13 data centers in 8 countries on 4 continents.

Fig. 1: Data Center Topology

II. FAULT TOLERANCE
The challenges of big data include analysis, capture, search, sharing, storage, transfer, visualization and privacy violations. Among these, fault tolerance is one of the main challenges. Two kinds of fault can occur while processing big data. First, a data chunk may be lost while data is being transferred to multiple data centers. Second, a server may fail or slow down.

A. Heartbeat Messages:
The solution to both problems is heartbeat messages. A heartbeat message is a message sent from an originator to a destination so that the destination can identify if and when the originator fails or is no longer available. Heartbeat messages are sent continuously on a periodic basis from the originator's startup until its shutdown. When the destination detects a lack of heartbeat messages during an anticipated arrival period, it may conclude that the originator has failed, shut down, or is otherwise no longer available. The technique relates to fault recovery in multiprocessor systems, where each processor constantly monitors heartbeat messages from the other processors and can take autonomous recovery action when heartbeat messages fail to arrive, without the overall guidance of an executive processor.

B. Data Loss Prevention:
When jobs are transmitted to multiple data centers, data may be lost. Data loss may occur due to network link failure. The transmission rates of network links vary according to their individual characteristics, for example the distances and optical fiber facilities between data centers. Due to capacity constraints, not all tasks can be placed on the same server as the data they need. It is therefore unavoidable that certain data must be downloaded from a remote server. In this case, the routing plan determines the transmission cost.

C. Hadoop Architecture:
Hadoop is a software framework for processing big data in parallel. It consists of two important components: the Hadoop Distributed File System and MapReduce.

1) Hadoop Distributed File System:
The Hadoop Distributed File System (HDFS) [3] is a distributed, highly fault-tolerant file system designed to run on low-cost commodity hardware. HDFS provides high-throughput access to application data and is suitable for applications with large data sets. HDFS consists of two types of node, called the name node and the data node. The name node manages file system namespace operations such as opening, closing, and renaming files and directories. The name node also maps data blocks to data nodes, which handle read and write requests from HDFS clients. Data nodes also create, delete, and replicate data blocks according to instructions from the governing name node. The name node and data nodes exchange messages to prove their identity.
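The heartbeat mechanism described in Section II.A can be illustrated with a minimal sketch. This is not code from Hadoop or from the paper; the class name `HeartbeatMonitor`, the `timeout` parameter, and the node names are hypothetical and chosen only for illustration. Timestamps are passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Destination-side view: remembers when each originator was last heard from."""

    # Hypothetical parameter: seconds of silence after which an originator
    # is presumed to have failed or shut down.
    timeout: float
    last_seen: dict = field(default_factory=dict)

    def record(self, sender: str, now: float) -> None:
        """Store the arrival time of a heartbeat from `sender`."""
        self.last_seen[sender] = now

    def failed(self, now: float) -> list:
        """Return the originators whose last heartbeat is older than `timeout`."""
        return sorted(
            sender
            for sender, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.record("datanode-1", now=0.0)
monitor.record("datanode-2", now=0.0)
monitor.record("datanode-1", now=2.0)  # datanode-2 stays silent
print(monitor.failed(now=4.0))         # prints ['datanode-2']
```

In this sketch the destination only decides that an originator is unavailable; what recovery action follows (for example rescheduling tasks or re-replicating data) is left to the caller.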
  • 2. Fault Tolerance in Big Data Processing using Heartbeat Messages and Data Replication (IJSRD/Vol. 3/Issue 10/2015/227)

2) MapReduce:
MapReduce [12] is a programming model and associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster. MapReduce also consists of two types of node, called the job tracker and the task tracker. The job tracker talks to the name node to determine the location of the data. The task tracker executes the assigned tasks on the data nodes.

III. FAILURES
One of the major benefits of using Hadoop is its ability to handle the failures described below and still allow a job to complete.

A. Task Failure:
When the user code in a map or reduce task throws a runtime exception, the child task fails. Another failure mode is the sudden exit of the child JVM. In this case the task tracker notices that the process has exited and marks the attempt as failed. A task may also be killed, which is different from failing.

B. Task Tracker Failure:
If a task tracker fails by crashing or running very slowly, it stops sending heartbeat messages to the job tracker. The job tracker notices that the task tracker has stopped sending heartbeats and removes it from the pool of task trackers on which it schedules tasks.

C. Job Tracker Failure:
Failure of the job tracker is the most serious failure mode, because the job tracker is a single point of failure. This failure mode has a low chance of occurring, since the chance of one particular machine failing is low.

D. Name Node Failure:
The name node is also a single point of failure, so if it fails the cluster becomes unstable. The secondary name node does not help in this case, since it is used only for checkpoints, not as a backup for the name node. If the name node fails, an administrator has to restart it.

E. Data Node Failure:
A compute node can fail for a variety of reasons, for example broken node hardware, a broken network, software bugs, or inadequate hardware resources. When a compute node fails, all jobs running on that node fail. Jobs running on other nodes that were not communicating with jobs on the failed node continue to run without a problem.

Fig. 2: Name node Schema

IV. SOLUTION
A. Data Replication:
An application can specify the number of replicas of a file at the time it is created, and this number can be changed at any time after that [6]. HDFS uses an intelligent replica placement model for reliability and performance. Optimized replica placement distinguishes HDFS from most other distributed file systems, and is facilitated by a rack-aware replica placement policy that uses network bandwidth efficiently.

Fig. 3: Data Replication in DFS

The name node makes all decisions regarding the replication of blocks. It periodically receives a heartbeat and a block report from each of the data nodes in the cluster. Receipt of a heartbeat implies that the data node is functioning properly. A block report contains a list of all blocks on a data node.
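As a rough sketch of how block reports and heartbeats can be combined, the function below counts the replicas of each block that sit on data nodes still considered alive and reports which blocks have fallen below a target replica count. It is an illustration under stated assumptions, not the HDFS implementation: the function name `under_replicated`, the node and block identifiers, and the target of 2 replicas are all hypothetical.

```python
from collections import defaultdict


def under_replicated(block_reports, live_nodes, target_replicas):
    """Return {block_id: missing_copies} for blocks short of the target.

    block_reports: {node: [block_id, ...]}, the latest block report per data node.
    live_nodes:    set of nodes whose heartbeats are still arriving.
    """
    live_copies = defaultdict(int)
    for node, blocks in block_reports.items():
        for block in blocks:
            # Adding 0 for dead nodes keeps the block in the tally even when
            # every node that holds it has stopped sending heartbeats.
            live_copies[block] += 1 if node in live_nodes else 0
    return {
        block: target_replicas - count
        for block, count in sorted(live_copies.items())
        if count < target_replicas
    }


# Hypothetical cluster: three data nodes, each block stored on two of them.
reports = {
    "dn1": ["blk_a", "blk_b"],
    "dn2": ["blk_a", "blk_c"],
    "dn3": ["blk_b", "blk_c"],
}
# dn3 has stopped sending heartbeats, so its replicas no longer count.
print(under_replicated(reports, live_nodes={"dn1", "dn2"}, target_replicas=2))
# prints {'blk_b': 1, 'blk_c': 1}
```

A name node following this idea would then schedule one additional copy of each listed block on a live data node; where to place that copy is a separate policy decision.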
  • 3.

B. Replica Selection:
To minimize global bandwidth consumption and read latency, HDFS tries to satisfy a read request from the replica that is closest to the reader. If a replica exists on the same rack as the reader node, that replica is preferred to satisfy the read request. If an HDFS cluster spans multiple data centers, a replica that resides in the local data center is preferred over any remote replica.

V. CONCLUSION
Failures in big data processing have been studied. Data replication and heartbeat messages are used as fault tolerance mechanisms. In future work, practical setups can be built and the computation and communication costs measured. The results can then be compared with the cost of data processing on nodes without failures.

REFERENCES
[1] “Data Center Locations,” http://www.google.com/about/datacenters/inside/locations/index.html.
[2] S. Agarwal, J. Dunagan, N. Jain, S. Saroiu, A. Wolman, and H. Bhogan, “Volley: Automated Data Placement for Geo-Distributed Cloud Services,” in The 7th USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2010, pp. 17–32.
[3] L. Rao, X. Liu, L. Xie, and W. Liu, “Minimizing Electricity Cost: Optimization of Distributed Internet Data Centers in a Multi-Electricity-Market Environment,” in Proceedings of the 29th International Conference on Computer Communications (INFOCOM). IEEE, 2010, pp. 1–9.
[4] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs, “Cutting the Electric Bill for Internet-scale Systems,” in Proceedings of the ACM Special Interest Group on Data Communication (SIGCOMM). ACM, 2009, pp. 123–134.
[5] R. Urgaonkar, B. Urgaonkar, M. J. Neely, and A. Sivasubramaniam, “Optimal Power Cost Management Using Stored Energy in Data Centers,” in Proceedings of International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS). ACM, 2011, pp. 221–232.
[6] X. Fan, W.-D. Weber, and L. A. Barroso, “Power Provisioning for A Warehouse-sized Computer,” in Proceedings of the 34th Annual International Symposium on Computer Architecture (ISCA). ACM, 2007, pp. 13–23.
[7] S. Govindan, A. Sivasubramaniam, and B. Urgaonkar, “Benefits and Limitations of Tapping Into Stored Energy for Datacenters,” in Proceedings of the 38th Annual International Symposium on Computer Architecture (ISCA). ACM, 2011, pp. 341–352.
[8] P. X. Gao, A. R. Curtis, B. Wong, and S. Keshav, “It’s Not Easy Being Green,” in Proceedings of the ACM Special Interest Group on Data Communication (SIGCOMM). ACM, 2012, pp. 211–222.
[9] S. A. Yazd, S. Venkatesan, and N. Mittal, “Boosting energy efficiency with mirrored data block replication policy and energy scheduler,” SIGOPS Oper. Syst. Rev., vol. 47, no. 2, pp. 33–40, 2013.
[10] J. Cohen, B. Dolan, M. Dunlap, J. M. Hellerstein, and C. Welton, “Mad skills: new analysis practices for big data,” Proc. VLDB Endow., vol. 2, no. 2, pp. 1481–1492, 2009.
[11] R. Kaushik and K. Nahrstedt, “T*: A data-centric cooling energy costs reduction approach for Big Data analytics cloud,” in 2012 International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2012, pp. 1–11.
[12] J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” Google, Inc. (jeff@google.com, sanjay@google.com).