Hadoop Distributed File System

A presentation by Anand L. Kulkarni
28 August 2015

• Need for large data processing
• Challenges at large scale
• What is a Distributed File System (DFS)?

• “Framework for running [distributed] applications on large cluster built of commodity hardware.”
  - From the Hadoop Wiki.
• Originally created by Doug Cutting.
• Named after his son’s toy elephant.
• Inspired by Google’s architecture: MapReduce and the Google File System (GFS).

• The name “Hadoop” has now evolved to cover a family of products, but at its core it is essentially just:
  - the MapReduce programming paradigm, and
  - a distributed file system (HDFS).

• Master/slave architecture: a Name Node (master) and Data Nodes (slaves).
• Fault tolerant via replication.
• Optimized for large files.
• Hardware failures are assumed in the design.
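
As a rough illustration of how large files and replication interact, the sketch below splits a file into fixed-size blocks and reports the raw storage consumed once every block is replicated. The 128 MB block size and the replication factor of 3 are assumed values (both are configurable in HDFS), and the function names are hypothetical.

```python
MB = 1024 * 1024
BLOCK_SIZE = 128 * MB   # assumed block size; configurable per cluster or per file
REPLICATION = 3         # assumed replication factor; also configurable


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block a file would be split into."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last block holds only the leftover bytes
    return sizes


def raw_storage(file_size, replication=REPLICATION):
    """Total bytes stored across the cluster once every block is replicated."""
    return file_size * replication


if __name__ == "__main__":
    blocks = split_into_blocks(300 * MB)
    print([size // MB for size in blocks])   # block sizes in MB
    print(raw_storage(300 * MB) // MB)       # raw storage in MB
```

A 300 MB file therefore occupies three blocks, and the cluster stores three times the file's size in total.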

• Written in Java.
• Focus on streaming data access (high throughput matters more than low latency).
• Designed to run on commodity hardware.
• HDFS is a file system, not a DBMS.

• Block
• Data Node
• Name Node
• Checkpoint Node
• Backup Node

[Figure: HDFS architecture. The Name Node handles namespace and metadata operations; the Backup Node keeps namespace backups; five Data Nodes write to local disks. Replication, heartbeats, and balancing run between the Name Node and the Data Nodes.]
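
The heartbeat and replication duties of the Name Node can be sketched as a small liveness tracker: Data Nodes report in periodically, and any block whose live replica count drops below the target becomes a candidate for re-replication. This is a toy model, not Hadoop code; the class name, the timeout value, and the data layout are all assumptions made for illustration.

```python
class HeartbeatMonitor:
    """Toy model of the Name Node's view of Data Node liveness."""

    def __init__(self, timeout_s=630.0):
        self.timeout_s = timeout_s   # assumed timeout before a node counts as dead
        self.last_seen = {}          # node id -> time of the last heartbeat

    def heartbeat(self, node_id, now):
        """Record that a Data Node reported in at time `now` (seconds)."""
        self.last_seen[node_id] = now

    def live_nodes(self, now):
        """Nodes whose last heartbeat is within the timeout window."""
        return {n for n, t in self.last_seen.items() if now - t <= self.timeout_s}

    def under_replicated(self, block_locations, now, target=3):
        """Blocks that have fewer than `target` replicas on live nodes."""
        live = self.live_nodes(now)
        return sorted(
            block for block, nodes in block_locations.items()
            if len(nodes & live) < target
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=10)
    for node in ("dn1", "dn2", "dn3", "dn4"):
        monitor.heartbeat(node, now=0)
    for node in ("dn1", "dn2", "dn4"):   # dn3 stops reporting
        monitor.heartbeat(node, now=8)
    blocks = {"blk_1": {"dn1", "dn2", "dn3"}, "blk_2": {"dn1", "dn2", "dn4"}}
    print(monitor.under_replicated(blocks, now=15))
```

Times are passed in explicitly rather than read from a clock, which keeps the sketch deterministic.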

[Figure: an HDFS client holding a file. Labeled flows: file locations, block size, and file system operations (metadata); data transfer.]
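
The split between metadata operations and data transfer can be sketched with two toy classes: the Name Node records only which blocks make up a file and where they live, while the bytes themselves go to Data Nodes. Everything here is hypothetical (class names, round-robin placement, the default replication of 2). Real HDFS forwards each block along a pipeline of Data Nodes; for brevity the client below writes to each target directly.

```python
class DataNode:
    def __init__(self, name):
        self.name = name
        self.blocks = {}   # block id -> bytes, standing in for the local disk

    def store(self, block_id, data):
        self.blocks[block_id] = data


class NameNode:
    """Holds metadata only: the blocks of each file and their locations."""

    def __init__(self, datanodes, replication=2):
        if replication > len(datanodes):
            raise ValueError("replication cannot exceed the number of Data Nodes")
        self.datanodes = datanodes
        self.replication = replication
        self.files = {}       # path -> list of block ids
        self.locations = {}   # block id -> list of Data Node names
        self._next_id = 0

    def allocate_block(self, path):
        """Register a new block for `path` and pick its target Data Nodes."""
        block_id = f"blk_{self._next_id}"
        start = self._next_id % len(self.datanodes)   # round-robin, for illustration only
        targets = [
            self.datanodes[(start + i) % len(self.datanodes)]
            for i in range(self.replication)
        ]
        self._next_id += 1
        self.files.setdefault(path, []).append(block_id)
        self.locations[block_id] = [dn.name for dn in targets]
        return block_id, targets


def write_file(namenode, path, data, block_size):
    """Client side: one metadata call per block, then the data transfer."""
    for offset in range(0, len(data), block_size):
        block_id, targets = namenode.allocate_block(path)   # metadata operation
        for datanode in targets:                            # data transfer
            datanode.store(block_id, data[offset:offset + block_size])
```

Note that file contents never pass through the `NameNode` object, which mirrors the reason the Name Node does not become a data bottleneck.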


[Figure: an HDFS client holding a file. The Name Node returns the locations of the blocks for a file.]
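
A read can be sketched the same way: the client first asks for the block locations of a file, then fetches each block from a Data Node that holds it. The dictionaries below stand in for the Name Node's metadata and the Data Nodes' disks; the function names and data layout are assumptions for illustration, not the HDFS client API.

```python
def get_block_locations(namespace, locations, path):
    """Name Node side: return (block id, Data Node names) pairs for a file."""
    return [(block_id, locations[block_id]) for block_id in namespace[path]]


def read_file(namespace, locations, datanodes, path):
    """Client side: look up locations, then fetch each block from a replica.

    `datanodes` maps a Data Node name to its {block id: bytes} store; a node
    that is missing from the mapping is treated as unreachable.
    """
    data = bytearray()
    for block_id, holders in get_block_locations(namespace, locations, path):
        for name in holders:
            store = datanodes.get(name)
            if store is not None and block_id in store:
                data += store[block_id]   # use the first replica that answers
                break
        else:
            raise IOError(f"no live replica for {block_id}")
    return bytes(data)


if __name__ == "__main__":
    namespace = {"/a": ["blk_0", "blk_1"]}
    locations = {"blk_0": ["dn1", "dn2"], "blk_1": ["dn2", "dn3"]}
    # dn1 is down, so blk_0 is served by its second replica on dn2.
    datanodes = {
        "dn2": {"blk_0": b"abcd", "blk_1": b"ef"},
        "dn3": {"blk_1": b"ef"},
    }
    print(read_file(namespace, locations, datanodes, "/a"))
```

Falling back to another replica when one holder is unreachable is what lets a read survive a Data Node failure.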

• The file system namespace
• Replica management
• Replica selection
• Safe mode
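
Two of the items above lend themselves to a short sketch. Replica selection prefers the replica closest to the reader, and safe mode ends once enough blocks have been reported as safely replicated. The distance values (0 for the same node, 2 for the same rack, 4 otherwise), the 0.999 threshold, and the function names are assumptions chosen for illustration.

```python
def distance(reader, replica):
    """Assumed network distance between two (rack, host) locations."""
    if reader == replica:
        return 0          # same node
    if reader[0] == replica[0]:
        return 2          # same rack, different host
    return 4              # different rack


def select_replica(reader, replicas):
    """Pick the replica closest to the reader; ties keep the earliest entry."""
    return min(replicas, key=lambda replica: distance(reader, replica))


def can_leave_safe_mode(safe_blocks, total_blocks, threshold=0.999):
    """True once the fraction of safely replicated blocks reaches the threshold."""
    if total_blocks == 0:
        return True       # an empty namespace has nothing to wait for
    return safe_blocks / total_blocks >= threshold


if __name__ == "__main__":
    reader = ("rack1", "host1")
    replicas = [("rack2", "host5"), ("rack1", "host2"), ("rack2", "host6")]
    print(select_replica(reader, replicas))
    print(can_leave_safe_mode(998, 1000))
```

Reading from a nearby replica reduces cross-rack traffic, which is the point of making selection topology-aware.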

• Persistence of file system metadata
• Robustness
• Space reclamation:
◦ File deletes and undeletes
◦ Decreasing the replication factor
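
Space reclamation can be sketched as a trash area with a retention period: a deleted file is first moved aside, can be restored while it is still there, and is only reclaimed once the retention period has passed. Lowering the replication factor similarly marks the surplus replicas as removable. The class, the method names, and the retention handling are hypothetical simplifications.

```python
class Trash:
    """Toy namespace with delete, undelete, and delayed space reclamation."""

    def __init__(self, retention_s):
        self.retention_s = retention_s
        self.files = {}   # live namespace: path -> content placeholder
        self.trash = {}   # path -> (content, deletion time)

    def delete(self, path, now):
        """Move a file to the trash instead of freeing its blocks at once."""
        self.trash[path] = (self.files.pop(path), now)

    def undelete(self, path):
        """Restore a file that is still in the trash."""
        content, _deleted_at = self.trash.pop(path)
        self.files[path] = content

    def expunge(self, now):
        """Permanently drop entries older than the retention period."""
        expired = [
            path for path, (_content, deleted_at) in self.trash.items()
            if now - deleted_at >= self.retention_s
        ]
        for path in expired:
            del self.trash[path]
        return sorted(expired)


def excess_replicas(holders, new_factor):
    """Replicas that may be removed after the replication factor is lowered."""
    return holders[new_factor:]
```

For example, deleting `/a` at time 0 with a retention of 60 seconds leaves it restorable until `expunge` is called at time 60 or later.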

• Name Node recovery
• Data Node recovery
• Metadata disk failure
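
Name Node recovery rests on the persisted metadata: the namespace is rebuilt by loading the last checkpoint (the FsImage) and replaying the edit log on top of it. The sketch below shows that idea with plain dictionaries and tuples; the operation names and the record layout are invented for illustration and do not match the real on-disk formats.

```python
def apply_edit(namespace, edit):
    """Apply one logged operation to the in-memory namespace."""
    op, path, *args = edit
    if op == "create":
        namespace[path] = []
    elif op == "add_block":
        namespace[path].append(args[0])
    elif op == "delete":
        namespace.pop(path, None)
    else:
        raise ValueError(f"unknown edit operation: {op}")


def recover_namespace(checkpoint, edit_log):
    """Rebuild the namespace from a checkpoint plus the edits logged after it."""
    # Copy the checkpoint so that replaying edits never mutates it.
    namespace = {path: list(blocks) for path, blocks in checkpoint.items()}
    for edit in edit_log:
        apply_edit(namespace, edit)
    return namespace


if __name__ == "__main__":
    checkpoint = {"/a": ["blk_0"]}
    edit_log = [
        ("create", "/b"),
        ("add_block", "/b", "blk_1"),
        ("delete", "/a"),
    ]
    print(recover_namespace(checkpoint, edit_log))
```

Merging the log into a fresh checkpoint from time to time keeps the replay short, which is the job the Checkpoint Node and Backup Node take on.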

[Figure: five Data Nodes.]

• Scalability of the Name Node
• Automation of Name Node recovery
