2. HDFS Design Concepts
• The Hadoop Distributed File System (HDFS) is
a fault-tolerant file system designed to run on
commodity hardware.
• The primary objective of HDFS is to store data
reliably even in the presence of failures,
including NameNode failures, DataNode
failures, and network partitions.
4. The main components of HDFS are as
described below:
• NameNode and DataNodes:
• NameNode
• The NameNode executes file system
namespace operations like opening, closing,
and renaming files and directories. It also
determines the mapping of blocks to
DataNodes.
5. • DataNode
• There are a number of DataNodes, usually one
per node in the cluster, which manage storage
attached to the nodes that they run on.
• HDFS exposes a file system namespace and
allows user data to be stored in files.
Internally, a file is split into one or more blocks
and these blocks are stored in a set of
DataNodes.
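The split into blocks described above can be sketched as follows. This is a toy Python illustration of the arithmetic only, not HDFS's actual (Java) implementation; the file sizes are example values:

```python
BLOCK_SIZE = 128 * 1024 * 1024  # default HDFS block size (128 MB)

def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE):
    """Return the sizes of the blocks a file of file_size bytes occupies.

    All blocks are full-size except possibly the last one.
    """
    blocks = []
    remaining = file_size
    while remaining > 0:
        blocks.append(min(block_size, remaining))
        remaining -= block_size
    return blocks

MB = 1024 * 1024
# A 300 MB file becomes two full 128 MB blocks plus one 44 MB block.
print(split_into_blocks(300 * MB))
```

Each of the resulting blocks is then stored on (and replicated across) a set of DataNodes.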
6. The File System Namespace:
• HDFS supports a traditional hierarchical file
organization.
• A user or an application can create directories
and store files inside these directories.
• One can create and remove files, move a file from
one directory to another, or rename a file.
• The NameNode maintains the file system
namespace. Any change to the file system
namespace or its properties is recorded by the
NameNode.
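The namespace operations listed above (create, rename, remove) can be pictured with a toy in-memory model. This is purely illustrative; the real NameNode keeps a far richer tree structure and persists changes to an edit log:

```python
class NameNodeNamespace:
    """Toy in-memory namespace: a flat map from path to entry kind."""

    def __init__(self):
        self.entries = {"/": "dir"}  # root directory always exists

    def create(self, path: str, kind: str = "file"):
        # Record a new file or directory in the namespace.
        self.entries[path] = kind

    def rename(self, old: str, new: str):
        # A rename just re-keys the metadata; no data blocks move.
        self.entries[new] = self.entries.pop(old)

ns = NameNodeNamespace()
ns.create("/user", "dir")
ns.create("/user/data.txt")
ns.rename("/user/data.txt", "/user/data.old")
print(sorted(ns.entries))
```

Note that renaming only changes metadata on the NameNode; the blocks on the DataNodes are untouched.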
7. HDFS works well in the following scenarios:
• 1. Very large files – Files that are hundreds of
megabytes, gigabytes, or terabytes in size.
• 2. Streaming data access – HDFS is built around
the idea that the most efficient data processing
pattern is a write-once, read-many-times pattern.
• 3. Commodity hardware – Hadoop doesn’t
require expensive, highly reliable hardware. It’s
designed to run on clusters of commodity
hardware.
8. HDFS is not suitable for the following cases:
1. Low-latency data access – Applications that
require low-latency access to data, in the tens of
milliseconds range, will not work well with HDFS.
2. Lots of small files – Because the namenode holds
filesystem metadata in memory, the limit to the
number of files in a filesystem is governed by the
amount of memory on the namenode.
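The small-files limit above can be made concrete with a rough back-of-the-envelope calculation. A commonly cited estimate (an assumption here, not an exact figure) is that each namespace object — file, directory, or block — costs on the order of 150 bytes of namenode heap:

```python
BYTES_PER_OBJECT = 150  # commonly cited rough estimate; actual usage varies

def max_files(heap_bytes: int, objects_per_file: int = 2) -> int:
    # Each small file costs roughly one file entry plus one block entry,
    # so ~2 metadata objects per file under this assumption.
    return heap_bytes // (BYTES_PER_OBJECT * objects_per_file)

# With a 1 GB namenode heap, only a few million small files fit.
print(max_files(1024 ** 3))
```

This is why millions of tiny files exhaust the namenode long before the cluster's disks fill up.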
9. Blocks
• HDFS has the concept of a block, which is 128 MB by
default.
• Unlike a filesystem for a single disk, a file in HDFS that is
smaller than a single block does not occupy a full block’s
worth of underlying storage. (For example, a 1 MB file
stored with a block size of 128 MB uses 1 MB of disk space,
not 128 MB.)
• HDFS blocks are large compared to disk blocks, and the
reason is to minimize the cost of seeks. If the block is large
enough, the time it takes to transfer the data from the disk
can be significantly longer than the time to seek to the start
of the block. Thus, transferring a large file made of multiple
blocks operates at the disk transfer rate.
10. Data Replication:
• HDFS is designed to reliably store very large files across
machines in a large cluster. It stores each file as a
sequence of blocks; all blocks in a file except the last
block are the same size. The blocks of a file are
replicated for fault tolerance. The block size and
replication factor are configurable per file.
• An application can specify the number of replicas of a
file. The replication factor can be specified at file
creation time and can be changed later. Files in HDFS
are write-once and have strictly one writer at any time.
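The replication of each block across DataNodes can be sketched as follows. This toy version uses simple round-robin placement; the real HDFS placement policy is rack-aware (e.g., spreading replicas across racks), and the node names here are made up:

```python
import itertools

def place_replicas(blocks, datanodes, replication=3):
    """Toy placement: assign each block `replication` nodes, round-robin.

    Assumes replication <= len(datanodes), so the nodes chosen for
    any one block are distinct.
    """
    placement = {}
    ring = itertools.cycle(datanodes)
    for block in blocks:
        placement[block] = [next(ring) for _ in range(replication)]
    return placement

nodes = ["dn1", "dn2", "dn3", "dn4"]
print(place_replicas(["blk_1", "blk_2"], nodes))
```

With the default replication factor of 3, losing any single DataNode still leaves two copies of every block it held.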
11. • The NameNode makes all decisions regarding
replication of blocks. It periodically receives a
Heartbeat and a Blockreport from each of the
DataNodes in the cluster. Receipt of a
Heartbeat implies that the DataNode is
functioning properly.
• A Blockreport contains a list of all blocks on a
DataNode.
12. • A few architectural changes are needed to support
an active-standby pair of namenodes (high availability):
• 1. The namenodes must use highly available shared
storage to share the edit log. When a standby
namenode comes up, it reads up to the end of the
shared edit log to synchronize its state with the active
namenode, and then continues to read new entries as
they are written by the active namenode.
• 2. Datanodes must send block reports to both
namenodes because the block mappings are stored in a
namenode’s memory, and not on disk.
13. • 3. Clients must be configured to handle
namenode failover, using a mechanism that is
transparent to users.
• 4. The secondary namenode’s role is
subsumed by the standby, which takes
periodic checkpoints of the active namenode’s
namespace.
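The standby's catch-up from the shared edit log (point 1 above) can be sketched as a log replay. This is a toy model with a made-up two-operation log format; the real edit log records many more operation types and the standby tails it continuously:

```python
class StandbyNameNode:
    """Toy standby that replays a shared edit log to mirror the active's state."""

    def __init__(self):
        self.namespace = set()  # paths that currently exist
        self.applied = 0        # index of the next log entry to apply

    def catch_up(self, shared_edit_log):
        # Replay only the entries written since the last catch-up.
        for op, path in shared_edit_log[self.applied:]:
            if op == "create":
                self.namespace.add(path)
            elif op == "delete":
                self.namespace.discard(path)
        self.applied = len(shared_edit_log)

log = [("create", "/a"), ("create", "/b"), ("delete", "/a")]
standby = StandbyNameNode()
standby.catch_up(log)
print(standby.namespace)  # {'/b'}
```

Because the standby has already replayed the log (and receives block reports directly from the DataNodes), it can take over quickly when the active namenode fails.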