Guided By: Dr. Padmaja Naraharisetty
Prepared By: Ms. Pooja Mehta
IT Systems & Network Security,
GTU PG School,
Gandhinagar
Research Paper Review
 Title : Fault Tolerance Techniques in Big Data Tools
 Authors : Manjula Dyavanur, Kavita Kori
 Journal Published: International Journal of Innovative Research in
Computer and Communication Engineering, Vol.2, May 2014
 Abstract: Big Data, Big Data Tools, Fault Tolerance, Hadoop, MongoDB
 Contribution: Strategies including data duplication, checkpointing, and
automatic recovery
 Conclusion: Efficient solution with recovery methods
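
As an illustration of the checkpoint and automatic recovery strategies listed above, here is a minimal Python sketch. It is not taken from the reviewed paper and does not use Hadoop or MongoDB; the function names, the JSON checkpoint file, and the `fail_at` fault-injection parameter are all assumptions made for the example. The job sums a list of numbers, saves its progress after every item, and resumes from the last saved state after a simulated failure.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: write a temp file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_index": 0, "total": 0}


def run_job(items, path, fail_at=None):
    """Sum items, checkpointing after each one; fail_at simulates a crash (hypothetical)."""
    state = load_checkpoint(path)
    for i in range(state["next_index"], len(items)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError(f"simulated failure at item {i}")
        state["total"] += items[i]
        state["next_index"] = i + 1
        save_checkpoint(path, state)
    return state["total"]


def run_with_recovery(items, path, fail_at=None, max_retries=3):
    """Automatic recovery: on failure, restart the job from the last checkpoint."""
    for attempt in range(max_retries):
        try:
            # The fault is injected only on the first attempt.
            return run_job(items, path, fail_at if attempt == 0 else None)
        except RuntimeError:
            continue
    raise RuntimeError("job did not complete within max_retries")
```

Because the checkpoint records both the running total and the next index, the retry does not repeat work that was already completed before the failure.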
Fault Tolerance in Big Data - Pooja Mehta 2
Introduction
 Structured Data & Unstructured Data
 Scalability
 Fault Tolerance
Objective
 New Technology
 Cost Factor
 Less Consumption of Resources
 Backward Compatibility
Literature Review
 Peng Hu, Wei Dai, “Enhancing Fault Tolerance Based on Hadoop Cluster”,
International Journal of Database Theory and Applications, Vol.7, No.1
(2014), pp.37-48
 Abstract : Hadoop, Checkpoint
 T. Cowsalya and S.R. Mugunthan, “Hadoop Architecture And Fault
Tolerance Based Hadoop Clusters In Geographically Distributed Data
Center”, ARPN Journal of Engineering and Applied Sciences, VOL. 10,
NO. 7, APRIL 2015
 Abstract : Hadoop, fault tolerance, HDFS, name node, data node
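
The HDFS keywords above point at data duplication: HDFS keeps several copies of each block on different data nodes (three by default), so a read can still succeed when a node is lost. The Python sketch below is a toy model of that idea only. `ReplicatedStore`, its methods, and the node names are hypothetical and are not the HDFS API.

```python
import random


class ReplicatedStore:
    """Toy block store that copies each block to several nodes (illustrative only)."""

    def __init__(self, node_names, replication_factor=3):
        if replication_factor > len(node_names):
            raise ValueError("replication factor exceeds number of nodes")
        self.nodes = {name: {} for name in node_names}  # node name -> stored blocks
        self.alive = set(node_names)
        self.replication_factor = replication_factor
        self.locations = {}  # block id -> names of nodes holding a replica

    def put(self, block_id, data):
        """Store data on replication_factor distinct live nodes chosen at random."""
        targets = random.sample(sorted(self.alive), self.replication_factor)
        for name in targets:
            self.nodes[name][block_id] = data
        self.locations[block_id] = targets

    def fail_node(self, name):
        """Mark a node as failed; its replicas become unreachable."""
        self.alive.discard(name)

    def get(self, block_id):
        """Read from the first replica whose node is still alive."""
        for name in self.locations[block_id]:
            if name in self.alive:
                return self.nodes[name][block_id]
        raise IOError(f"all replicas of {block_id} are unavailable")
```

With a replication factor of 3, a block remains readable after any two of its replica holders fail; only the loss of all three makes `get` raise an error.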
Proposed Solution
 Fault tolerance using COTS (Commercial-Off-The-Shelf) technology
Conclusion
 Critical components may become unavailable and
impossible to reproduce
 Environmental Constraints
 Special Operating Systems
Thank You !

Fault tolerance in Big Data

Editor's Notes

  • #3 Big Data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process the data within a tolerable elapsed time. A fault-tolerant system is one that can continue to perform its specified tasks correctly in the presence of failures. This paper is a survey of different kinds of fault tolerance techniques in big data tools such as Hadoop and MongoDB.
  • #4 Scalability is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged in order to accommodate that growth.
  • #5 Backward compatible (or sometimes backward-compatible or backwards compatible) refers to a hardware or software system that can successfully use interfaces and data from earlier versions of the system or with other systems.
  • #7 Cheaper (large-quantity production); general purpose (more flexible for different applications); large user base uncovers design defects early; provides current technology solutions; emerging technology tends to be backward compatible with legacy products (allows solutions to advance with technology); avoids binding the solution to a single hardware/software source.