Fault tolerant mechanisms in Big Data

+
Fault-tolerant mechanisms in Big Data
Karan Pardeshi

+
Agenda
 Introduction
 Distributed Fault-tolerant mechanisms in Big Data
 Current Model
 Use of Features to build a better model
 Future Work

+
Introduction
 Cloud computing is everywhere.
 Advantages
 Cost Efficient
 Unlimited storage
 Seamless access
 Importance of Fault Tolerance
 Mass outage at Amazon Web Services
 An availability zone was down for an entire day!
 Time critical systems
 Rocket on a mission
 Bank applications

+
Fault-tolerant mechanisms in Distributed Systems
 Google File System (GFS)
 Focused on storage
 Replication mechanism
 Replicas on different machines in different racks, N = 3
 Shadow masters support the primary master
 Read access
 Checksums for data reliability
 CRC
 Amazon Dynamo
 Focused on High Availability
 Uses vector clocks
 For semantic reconciliation
 Hinted hand-off
 Merkle Tree
 To detect and repair inconsistencies between replicas
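
The vector-clock bullet above can be made concrete with a minimal sketch. It is not Dynamo's actual implementation: the function names (`increment`, `compare`, `merge`) and the dictionary representation are assumptions chosen for illustration. The point is that two versions whose clocks are "concurrent" cannot be ordered automatically, which is the case that needs semantic reconciliation.

```python
from typing import Dict

# A vector clock maps a node name to the number of updates seen from that node.
VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock after `node` performs one more update."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if clock `a` has seen every update that clock `b` has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify `a` relative to `b`: equal, after, before, or concurrent."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "after"
    if b_ge_a:
        return "before"
    return "concurrent"  # conflicting versions: needs reconciliation


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Clock for a reconciled version: the element-wise maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}
```

For example, if two nodes each update the same starting version independently, `compare` reports "concurrent" and the reconciled version carries the `merge` of both clocks.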

+
Distributed Systems (continued)
 Facebook’s Cassandra
 Accrual failure detection with a gossip-based protocol
 First of its kind
 Probabilistic failure rate estimator
 ZooKeeper
 Group of workstations acting as servers
 One master; the other servers provide the service in step with it
 High availability
 Bigtable
 Works on top of GFS
 Chubby service – metadata storage
 Heart of Bigtable
 Primary co-ordinator of Bigtable
 Data persistence
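
The "probabilistic failure rate estimator" idea can be sketched as follows. This is an illustrative simplification, not Cassandra's code: the class name `AccrualFailureDetector`, the window size, and the assumption that heartbeat intervals are exponentially distributed are all choices made here. Instead of a yes/no verdict, the detector returns a suspicion level phi that grows the longer a node stays silent.

```python
import math
from collections import deque


class AccrualFailureDetector:
    """Suspicion level (phi) for one monitored node, based on heartbeat gaps."""

    def __init__(self, window_size: int = 100) -> None:
        # Sliding window of the most recent inter-arrival times.
        self.intervals = deque(maxlen=window_size)
        self.last_heartbeat = None

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now: float) -> float:
        """Return -log10 of the probability that a heartbeat is merely late.

        Assumes exponentially distributed intervals, so
        P(gap > t) = exp(-t / mean) and phi = t / (mean * ln 10).
        """
        if not self.intervals:
            return 0.0  # not enough history to suspect anything
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_heartbeat
        return elapsed / (mean * math.log(10))
```

A caller would compare `phi(now)` against a threshold it chooses itself; a higher threshold means fewer false suspicions but slower detection.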

+
Distributed Systems (continued)
 MapReduce
 Classic Master-Slave configuration
 Example: Hadoop
 Re-execution of the failed operation
 If an operation terminates before it completes
 Stays operational even if some workers fail
 Efficient load balancing
 HDFS
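
The re-execution idea above can be sketched as a master loop that hands a task to another worker when the first one fails. This is a toy model, not Hadoop's scheduler: `WorkerFailure`, `run_with_reexecution`, and the round-robin choice of the next worker are hypothetical names and policies introduced only for illustration.

```python
from typing import Callable, List, Sequence


class WorkerFailure(Exception):
    """Raised by a worker that dies while processing a task (hypothetical)."""


def run_with_reexecution(
    tasks: Sequence[int],
    workers: Sequence[Callable[[int], int]],
    max_attempts: int = 3,
) -> List[int]:
    """Run every task, re-executing it on another worker if one fails."""
    results = []
    for index, task in enumerate(tasks):
        for attempt in range(max_attempts):
            # Pick the next worker in round-robin order for each retry.
            worker = workers[(index + attempt) % len(workers)]
            try:
                results.append(worker(task))
                break  # task finished, move on to the next one
            except WorkerFailure:
                continue  # reschedule the same task on another worker
        else:
            raise RuntimeError(f"task {task!r} failed {max_attempts} times")
    return results
```

The job as a whole still completes as long as at least one healthy worker is reached within `max_attempts` tries for each task.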

+
Existing Fault-tolerant model for Cloud Computing
 Proposed by Anjali Meshram, A. S. Sambare, S. D. Zade
 Input is passed to all VMs
 Accepter
 Tests the result of the algorithm run on every VM
 Timer
 Monitors the time constraint for each VM
 Reliability Assessor (RA)
 Starts with reliability of 100% for every VM
 Updated from the time each VM takes to produce its result
 Decision Maker
 Selects output of node with highest reliability.
 Raises a failure and removes the node if its reliability falls below the minimum.
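
The interplay of the Reliability Assessor and the Decision Maker can be sketched roughly as below. The update rule is an assumption: the fixed `reward` and `penalty` steps, the `min_reliability` threshold, and the names `VMState`, `assess`, and `decide` are hypothetical and are not taken from the authors' model. Only the overall flow (start at 100%, adjust per result, pick the most reliable node, drop nodes below the minimum) follows the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class VMState:
    name: str
    reliability: float = 1.0  # every VM starts at 100% reliability


def assess(vm: VMState, passed: bool, elapsed: float, time_limit: float,
           reward: float = 0.05, penalty: float = 0.2) -> bool:
    """Update a VM's reliability; return True if its result is usable.

    Assumed rule: a correct, on-time result earns `reward`,
    anything else costs `penalty`.
    """
    ok = passed and elapsed <= time_limit
    if ok:
        vm.reliability = min(1.0, vm.reliability + reward)
    else:
        vm.reliability = max(0.0, vm.reliability - penalty)
    return ok


def decide(vms: List[VMState], outputs: Dict[str, int],
           min_reliability: float = 0.5) -> Tuple[int, List[VMState]]:
    """Pick the output of the most reliable VM and drop unreliable VMs.

    `outputs` holds results only for VMs whose result was usable this cycle.
    """
    alive = [vm for vm in vms if vm.reliability >= min_reliability]
    candidates = [vm for vm in alive if vm.name in outputs]
    if not candidates:
        raise RuntimeError("failure: no reliable VM produced a usable result")
    best = max(candidates, key=lambda vm: vm.reliability)
    return outputs[best.name], alive
```

A driver would call `assess` once per VM per cycle, collect the usable outputs, and then call `decide` to obtain the final output and the surviving set of VMs.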

+
Features that can be combined to create a new Fault-tolerant Model
 Master Node
 Co-ordinator
 Built on the ZooKeeper service
 Each job is carried out on three different nodes
 Accrual Fault Detectors
 Probabilistic failure value
 Measured on ping responses from the master
 Decision Maker
 Selects the majority vote to produce the final output
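
The proposed Decision Maker can be sketched as a strict majority vote over the outputs of the replicated job. The function name `majority_vote` and the choice to raise an error when no value wins more than half the votes are assumptions made for this sketch; the slide only states that the majority output is selected.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Hashable:
    """Return the value produced by more than half of the replicas."""
    if not outputs:
        raise ValueError("no outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        # No strict majority, e.g. three replicas with three different answers.
        raise RuntimeError("no majority among replica outputs")
    return value
```

With three replicas, one faulty node is outvoted by the two that agree; if all three disagree, the sketch reports a failure instead of guessing.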

+
Future Work
 Develop a better, more robust fault-tolerant model using the features described in the earlier slides.
