HDFS – Hadoop Distributed File System
Agenda
1. HDFS – Hadoop Distributed File System
 HDFS
HDFS Design and Goal
HDFS components: Namenode, Datanode, Secondary
Namenode.
HDFS blocks and replication.
Anatomy of a File Read/Write in HDFS
 Designed to reliably store very large files across machines in a
large cluster
 Data Model
 Data is organized into files and directories
 In the storage layer, files are divided into uniform-sized blocks (64MB,
128MB, 256MB) and distributed across cluster nodes
 Blocks are replicated to handle hardware failure
 File system keeps checksums of data for corruption detection
and recovery
 HDFS exposes block placement so that computation can be
moved to the data
HDFS
HDFS is a file system designed for storing very large files with
streaming data access patterns, running on clusters of commodity
hardware.
 HDFS is a file system rather than just a storage layer.
HDFS follows many POSIX file system conventions, though it is not fully POSIX-compliant:
- File, directory and sub-directory structure
- Permissions (rwx)
- Access classes (owner, group, other) and the concept of a super user
 Hadoop provides many interfaces to its filesystems, and it
generally uses the URI scheme to pick the correct
filesystem instance to communicate with.
 Fault tolerant, scalable, distributed storage system
HDFS continues…
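As a rough illustration of the URI-scheme lookup mentioned above, the sketch below maps a URI's scheme to a filesystem name. The registry dictionary and the function name are hypothetical; Hadoop resolves schemes through its own configuration, and the class names are used here only as labels.

```python
from urllib.parse import urlparse

# Hypothetical registry for illustration only; Hadoop resolves schemes
# through its own configuration rather than a table like this.
FILESYSTEMS = {
    "hdfs": "DistributedFileSystem",
    "file": "LocalFileSystem",
}


def pick_filesystem(uri, default_scheme="hdfs"):
    """Return the filesystem label for a URI, using a default when no scheme is given."""
    scheme = urlparse(uri).scheme or default_scheme
    if scheme not in FILESYSTEMS:
        raise ValueError("no filesystem registered for scheme: " + scheme)
    return FILESYSTEMS[scheme]
```

A path without a scheme, such as /test, falls back to the assumed default scheme.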
 Very Large Distributed File System
10K nodes, 100 million files, and 10 PB of data are all feasible.
 Computation moved to data
Data locations exposed so that computations can move to where
data resides
 Streaming Data access
 Assumes Commodity Hardware
Files are replicated to handle hardware failure
Detects failures and recovers from them
 Optimized for Batch Processing
Provides very high aggregate bandwidth
 High throughput of data access: streaming access to data
 Large files: a typical file is gigabytes to terabytes in size; support for tens of
millions of files.
 Simple coherency: write-once-read-many access model
 Highly fault-tolerant: runs on commodity hardware, which can fail frequently
HDFS Design Goals
Master-Slave architecture
 HDFS Master “Namenode”
Manages the file system namespace
Controls read/write access to files
Manages block replication
Checkpoints namespace and journals namespace changes for
reliability
 HDFS Workers “Datanodes”
 Serve read/write requests from clients
 Perform replication tasks upon instruction by Namenode.
 Report blocks and system state.
 HDFS Namespace Backup “Secondary Namenode”
HDFS Components
HDFS components interaction
HDFS Components relation
 Single Namespace for entire cluster
 Files are broken up into a sequence of blocks
– For a file all blocks except the last are of the same size.
– Each block replicated on multiple DataNodes
 Data Coherency
– Write-once-read-many access model
– Clients can only append to files; random in-place changes are not supported
 Intelligent Client
– Client can find location of blocks
– Client accesses data directly from DataNode [Q: How is this possible?]
– User data never flows through the NameNode
Distributed file system
Distributed file system continue…
HDFS Blocks
HDFS has a large block size
 Default 64MB (128MB in Hadoop 2.x and later)
 Typical 128MB, 256MB, 512MB…
Ordinary file system blocks are a few kilobytes.
Unlike a file system for a single disk, a file in HDFS that is
smaller than a single block does not occupy a full block. If a
file is 10MB, it needs only 10MB of space on the local drive,
not the space of a full block.
A file is stored in blocks on various nodes in the Hadoop cluster.
HDFS provides a complete abstraction of this to the client.
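The block arithmetic above can be sketched as follows. This is a minimal illustration, not HDFS code: the function name is made up, and the 64MB default is simply the figure quoted on the slide.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=64 * MB):
    """Return the sizes of the blocks a file of file_size bytes would occupy."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block holds only the leftover bytes, not a full block.
        sizes.append(remainder)
    return sizes
```

For example, a 200MB file with 64MB blocks yields three full blocks and one 8MB block, while a 10MB file yields a single 10MB block.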
HDFS block placement
HDFS creates several replicas of each data block.
Every data block is replicated to multiple nodes across the
cluster.
A block report lists all the blocks on a Datanode.
HDFS Blocks continues...
 Default is 3 replicas, but settable
 Blocks are placed (writes are pipelined):
(Rack awareness is covered in a later slide)
– First replica on the same node as the writer
– Second replica on a node in a different rack
– Third replica on another node in that same remote rack
 Clients read from closest replica.
 If the replication for a block drops below target, it is
automatically re-replicated.
 Pluggable policy for placing block replicas.
Block Placements
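Why are blocks in HDFS so large?
Minimize the cost of seeks
 Make the transfer time of a block much larger than the seek time, so that
large files are read at close to the disk transfer rate
Designed so that blocks can be processed independently.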
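A small worked calculation may make the seek-cost argument concrete. The figures used in the usage note (10 ms seek time, 100 MB/s transfer rate, 1% overhead) are illustrative assumptions, not measurements from this deck.

```python
def block_size_for_seek_overhead(seek_ms, transfer_mb_per_s, overhead):
    """Block size in MB at which the seek time is `overhead` (a fraction)
    of the time spent transferring the block."""
    if seek_ms <= 0 or transfer_mb_per_s <= 0 or not 0 < overhead < 1:
        raise ValueError("inputs must be positive and overhead between 0 and 1")
    # Transfer time needed so that seek_time / transfer_time == overhead.
    transfer_seconds = (seek_ms / 1000.0) / overhead
    return transfer_seconds * transfer_mb_per_s
```

With a 10 ms seek, a 100 MB/s disk and a 1% overhead target, the block needs to be roughly 100MB, which is the same order of magnitude as the 64MB and 128MB block sizes mentioned earlier.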
Benefit of Block abstraction
A file can be larger than any single disk in the network.
Simplifies the storage subsystem.
Provides fault tolerance and availability.
Independent processing, failover and distributed
computation. [Q: How is it independent?]
 Data blocks are checked with CRC32
 File Creation
Client computes a checksum per 512 bytes
DataNode stores the checksums
 File access
Client retrieves the data and checksums from the DataNode
If validation fails, the client reports the corrupt replica and tries other replicas
 Corrupted blocks are reported and tolerated.
 Each block replica on a DataNode is represented by two files in
the local native filesystem. The first file contains the data itself
and the second file records the block's metadata including
checksums for the data and the generation stamp.
Data Correctness
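The per-chunk checksum scheme can be sketched with Python's standard zlib.crc32. This only illustrates the idea of one CRC32 per 512 bytes; the function names are hypothetical and the on-disk metadata format of a real DataNode is not modelled.

```python
import zlib

BYTES_PER_CHECKSUM = 512  # chunk size quoted on the slide


def chunk_checksums(data, chunk=BYTES_PER_CHECKSUM):
    """Compute one CRC32 value per chunk of the data."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def first_corrupt_chunk(data, expected, chunk=BYTES_PER_CHECKSUM):
    """Return the index of the first chunk whose checksum differs, or None."""
    actual = chunk_checksums(data, chunk)
    for index, (got, want) in enumerate(zip(actual, expected)):
        if got != want:
            return index
    if len(actual) != len(expected):
        # Data is shorter or longer than the stored checksums cover.
        return min(len(actual), len(expected))
    return None
```

A reader that gets a non-None result would report the replica as corrupt and try another one, as described above.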
 Data integrity is maintained at the block level. [Q: Why at the block level and not
the file level?]
 The client reads the data along with its checksums, computes the
checksum of every block, and verifies that the corresponding
checksums match. If they do not match, the client can retrieve the
block from another replica. The corrupt replica is deleted and a new
replica is created.
 Checksums are verified on each read. What if a block is not accessed for a long time?
Corruption could go unnoticed, so blocks are also checked periodically.
Data Integrity
 An HDFS cluster consists of a single Namenode, a
master server that manages the file system namespace
and regulates access to files by clients. The HDFS file system
namespace is stored and maintained by the Namenode. The
Namenode keeps this metadata in two binary files in
its storage directory:
 edits,
 fsimage
 The Namenode maintains a ‘namespace ID’, which is persistently
stored on all nodes of the cluster. The namespace ID is
assigned to the filesystem instance when it is formatted.
Name Node
 The HDFS namespace is a hierarchy of files and directories.
Files and directories are represented on the NameNode by
inodes. Inodes record attributes like permissions, modification
and access times, namespace and disk space quotas.
 Meta-data in Memory
– The entire metadata is in main memory for Fast access
– No demand paging and I/O wait for meta-data
 Metadata content
– Hierarchical file system with directories and files
– List of Blocks for each file
– File attributes, e.g. access time, replication factor
 For improved durability, redundant copies of the checkpoint and journal
are typically stored on multiple independent local volumes and on
remote NFS servers.
NameNode Metadata
 The inode structure and the list of blocks that define the
metadata are stored in the file: fsimage
 The ‘fsimage’ file is a persistent checkpoint of the filesystem
metadata. It is loaded at Namenode start-up.
 The NameNode records changes to HDFS in a write-ahead
log called the ‘transaction log’ (journal), kept in its local native
filesystem in the file: edits
 Each transaction is recorded in the transaction log, and the journal
file is flushed and synced before the acknowledgment is sent
to the client.
 The locations of block replicas are not part of the persistent
checkpoint.
 The checkpoint file is never changed in place by the NameNode; it is
replaced in its entirety when a new checkpoint is created.
(Checkpointing and the Secondary Namenode are covered in later topics)
NameNode Metadata continues…
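The relationship between the checkpoint (fsimage) and the journal (edits) can be sketched as below. This is a toy in-memory model under stated assumptions: the class, the two operation names and the block IDs are hypothetical, and nothing is written to disk.

```python
class NameNodeMetadata:
    """Toy model: namespace = last checkpoint (fsimage) + replayed journal (edits)."""

    def __init__(self, fsimage=None, edits=None):
        self.fsimage = {path: list(blocks) for path, blocks in (fsimage or {}).items()}
        self.edits = list(edits or [])
        # At start-up the namespace is rebuilt from the image plus the journal.
        self.namespace = self._replay()

    @staticmethod
    def _apply_op(namespace, op, path, blocks):
        if op == "create":
            namespace[path] = list(blocks)
        elif op == "delete":
            namespace.pop(path, None)
        else:
            raise ValueError("unknown operation: " + op)

    def _replay(self):
        namespace = {path: list(blocks) for path, blocks in self.fsimage.items()}
        for op, path, blocks in self.edits:
            self._apply_op(namespace, op, path, blocks)
        return namespace

    def apply(self, op, path, blocks=()):
        # Write-ahead: journal the change before updating the in-memory state.
        self.edits.append((op, path, tuple(blocks)))
        self._apply_op(self.namespace, op, path, blocks)

    def checkpoint(self):
        # Build a whole new image from image + journal, then start an empty journal.
        self.fsimage = self._replay()
        self.edits = []
```

Replaying the journal over the last image after a simulated restart yields the same namespace as the running instance, which is the property the write-ahead log is meant to provide.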
[what, how and where ]
 The NameNode maintains the namespace tree
 Maintains the mapping of datanodes to lists of blocks
 Receives heartbeats and monitors datanode health.
 Re-replicates missing blocks.
 Records file system changes.
 Authorization and authentication.
Name node functions
The DataNode is a slave daemon that performs the grunt work of the
distributed filesystem: reading and writing HDFS blocks to
actual files on the local filesystem.
The ‘slave’ stores data in files in its local file system.
The Datanode has no knowledge of HDFS files.
It stores each block of HDFS data in a separate file.
Clients access the blocks directly from datanodes after
communicating with the namenode.
Blocks are stored as files in the underlying OS. The Datanode does
not create all files in the same directory; it uses a heuristic to determine the optimal
number of files per directory and creates subdirectories
appropriately.
Data Node
[Read, Write, Report ]
 During startup each DataNode connects to the NameNode and
performs a handshake. The purpose of the handshake is to
verify the namespace ID and the software version of the
DataNode. If either does not match that of the NameNode, the
DataNode automatically shuts down.
 Serves read, write requests, performs block creation, deletion,
and replication upon instruction from Namenode
 Periodically sends heartbeats and block reports to the Namenode
 A DataNode identifies block replicas in its possession to the
NameNode by sending a block report.
 A block report contains the block ID, the generation stamp and the
length for each block replica the server hosts
Data Node functions
 During normal operation DataNodes send heartbeats to the NameNode to
confirm that the DataNode is operating and the block replicas it hosts are
available.
 Heartbeats from a DataNode also carry information about total
storage capacity, fraction of storage in use, and the number of data
transfers currently in progress.
 These statistics are used for the NameNode's block allocation and load
balancing decision.
 The NameNode does not directly send requests to DataNodes. It uses replies
to heartbeats to send instructions to the DataNodes. The instructions include
commands to replicate blocks to other nodes, remove local block replicas,
re-register and send an immediate block report, and shut down the node.
 To maintain overall system integrity, it is critical to keep heartbeats
frequent even on big clusters. The NameNode can process thousands of
heartbeats per second without affecting other NameNode operations.
Data node heartbeats
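The idea that the NameNode never calls DataNodes directly, but piggybacks instructions on heartbeat replies, can be sketched as follows. The class name, the command strings and the timeout value are assumptions made for illustration.

```python
from collections import defaultdict


class HeartbeatMonitor:
    """Toy NameNode side of the heartbeat exchange."""

    def __init__(self, timeout_s=600.0):
        self.timeout_s = timeout_s          # assumed liveness timeout
        self.last_seen = {}                 # datanode -> time of last heartbeat
        self.pending = defaultdict(list)    # datanode -> queued instructions

    def queue_command(self, datanode, command):
        # Instructions wait here until the datanode's next heartbeat arrives.
        self.pending[datanode].append(command)

    def heartbeat(self, datanode, now):
        """Record the heartbeat and return queued instructions as the reply."""
        self.last_seen[datanode] = now
        return self.pending.pop(datanode, [])

    def dead_nodes(self, now):
        """Datanodes whose last heartbeat is older than the timeout."""
        return sorted(dn for dn, seen in self.last_seen.items()
                      if now - seen > self.timeout_s)
```

Because commands travel only in heartbeat replies, a DataNode that stops sending heartbeats receives no further instructions and eventually shows up in dead_nodes.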
 The Secondary NameNode (SNN) is an assistant daemon
for monitoring and backing up the state of the
cluster's HDFS metadata.
 The SNN communicates with the NameNode to take snapshots of the HDFS metadata at
intervals defined by the cluster configuration.
 The NameNode is a single point of failure for a Hadoop
cluster, and the SNN snapshots help minimize the
downtime and loss of data. Nevertheless, a NameNode
failure requires human intervention to reconfigure the
cluster to use the SNN as the primary NameNode.
Secondary Name Node
Secondary Namenode interaction
with Namenode
Checkpointing and recovery are covered in later topics.
The SNN periodically merges the namespace image with the edit log to
prevent the edit log from becoming too large.
Anatomy of a File Read in HDFS
One important aspect of this design is that the client contacts datanodes
directly to retrieve data and is guided by the namenode to the best datanode for
each block. There is a direct connection between client and datanode.
On failure: move to the next 'closest' node with the block.
1. The client connects to the NameNode with the file name.
2. The namenode performs various checks to make sure the file
exists, the client has the right permissions, etc.
3. For each block, the namenode returns the addresses of the datanodes that have a
copy of that block (locality is considered).
4. The datanodes for each block are sorted by proximity to the client; we’ll assume the
replication level is three, so there are three candidate datanodes per
block.
5. The client connects to the first (closest) datanode for the first
block in the file and reads it. It then finds the best datanode for the
next block, and so on until the whole file is read.
6. The client verifies checksums for the data transferred to it
from each datanode.
7. During reading, if the client encounters an error while
communicating with a datanode, or a block is corrupted, it will
try the next closest one for that block.
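The read path with fallback to the next closest replica can be sketched as below. The function and parameter names are hypothetical; fetch stands in for the client-to-datanode transfer and is assumed to raise IOError on a communication error or a failed checksum.

```python
def read_file(block_locations, fetch):
    """Read a file block by block.

    block_locations: list of (block_id, [datanodes, closest first]) pairs,
                     as a namenode might return them.
    fetch:           callable(datanode, block_id) -> bytes, raising IOError
                     when the datanode is unreachable or the block is corrupt.
    """
    data = bytearray()
    for block_id, datanodes in block_locations:
        for datanode in datanodes:
            try:
                data += fetch(datanode, block_id)
                break  # this block is done, move on to the next one
            except IOError:
                continue  # try the next closest replica
        else:
            raise IOError("no readable replica for block " + str(block_id))
    return bytes(data)
```

Note that the namenode supplies only locations here; the block bytes come straight from the datanodes, which matches the point that user data never flows through the NameNode.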
Anatomy of a File Read in HDFS continue…
Anatomy of a File Read
Anatomy of a File Write in HDFS
1. The client connects to the NameNode with the file name.
2. The namenode performs various checks to make sure the file doesn’t
already exist, the client has the right permissions, etc.
3. The NameNode places an entry for the file in its metadata and returns the
block name and list of DataNodes to the client.
4. The list of datanodes forms a pipeline; we’ll assume the replication
level is three, so there are three nodes in the pipeline.
5. The client connects to the first DataNode and starts sending data packets. As
data is received by the first DataNode, it stores each packet and forwards it to the
second DataNode. Similarly, the second datanode
stores the packet and forwards it to the third datanode in the
pipeline.
6. The client keeps sent packets in an ack queue. A packet is removed from the ack queue only when it has been
acknowledged by all the datanodes in the pipeline.
7. If a datanode fails while data is being written to it, it is removed from the pipeline
and the write continues on the remaining datanodes; the partial block on the
failed datanode will be deleted if the failed datanode recovers later on.
8. The client reports to the NameNode when the block is written.
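The pipelined write and the ack queue can be sketched as below. This is a simplified, synchronous model: real transfers are asynchronous and handle failures, and the class and function names are hypothetical.

```python
from collections import deque


class DataNode:
    """Toy datanode that stores packets and forwards them down the pipeline."""

    def __init__(self, name):
        self.name = name
        self.packets = []

    def receive(self, packet, downstream):
        # Store locally, then forward to the next datanode in the pipeline.
        self.packets.append(packet)
        if downstream:
            return downstream[0].receive(packet, downstream[1:])
        return True  # last node: the ack travels back up the call chain


def write_block(packets, pipeline):
    """Send packets through the pipeline; return packets still awaiting an ack."""
    ack_queue = deque()
    for packet in packets:
        ack_queue.append(packet)  # held until every datanode has acknowledged
        if pipeline[0].receive(packet, pipeline[1:]):
            ack_queue.popleft()
    return list(ack_queue)
```

With three datanodes in the pipeline, every packet ends up stored on all three, and the ack queue is empty once all acknowledgments have come back.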
Anatomy of a File Write in HDFS continue…
Replication and Rack-awareness
 Replication in Hadoop is at the block level.
 The default replication factor is 3 and is configurable.
 Blocks are replicated for fault tolerance.
 A file’s replication factor is configurable per file and can be changed
dynamically.
 Rack-aware replica placement. Goal: improve reliability, availability and
network bandwidth utilization
 A cluster has many racks; communication between racks goes through switches.
 Network bandwidth between machines on the same rack is greater than
between machines in different racks.
 Namenode determines the rack id for each DataNode.
Replication and Rack-awareness continued…
Replicas are placed: one on the writer's node in the local rack, one on a
node in a different (remote) rack and one on a different node in
that same remote rack.
One third of the replicas are on one node, two thirds are on one rack, and
the other third is distributed evenly across the remaining racks.
Replica selection for READ operation: HDFS tries to
minimize the bandwidth consumption and latency.
Selection of blocks to process in a MapReduce job takes
advantage of rack-awareness.
Rack-awareness is NOT automatic, and needs to be
configured. By default, all nodes are assumed to be in the
same rack.
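A rack-aware placement for a replication factor of three can be sketched as below. It assumes the common default order (writer's node, then a node on a remote rack, then a second node on that remote rack) and a topology with at least two racks and at least two nodes on the chosen remote rack; the function name and the topology dictionary are hypothetical.

```python
import random


def place_replicas(writer, topology, rng=random):
    """Pick three datanodes for a block.

    writer:   name of the datanode the client is writing from.
    topology: dict mapping datanode name -> rack id (configured, not detected).
    rng:      object with a choice() method, e.g. random.Random(seed).
    """
    first = writer
    # Candidates for the second replica: any node on a different rack.
    remote = [n for n, rack in topology.items() if rack != topology[first]]
    second = rng.choice(remote)
    # Third replica: a different node on the same remote rack as the second.
    same_remote_rack = [n for n in remote
                        if topology[n] == topology[second] and n != second]
    third = rng.choice(same_remote_rack)
    return [first, second, third]
```

If every node is mapped to the same rack, which is the default when rack awareness is not configured, this sketch has no remote candidates and fails; a real policy falls back to choosing among available nodes.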
Block Re-replication
The necessity for re-replication may arise due to:
 A Datanode may become unavailable,
 A replica may become corrupted,
 A hard disk on a Datanode may fail, or
 The replication factor on the block may be increased.
Block under-replication and over-replication are detected by the
Namenode.
The Balancer application rebalances blocks to even out datanode
utilization.
(The balancer is covered in a later topic)
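Detecting under- and over-replicated blocks amounts to comparing the number of live replicas with the target. The sketch below assumes a simple mapping from block ID to the set of datanodes currently reporting it; the function name and block IDs are hypothetical.

```python
def classify_blocks(block_replicas, target=3):
    """Return (under_replicated, over_replicated) block ids.

    block_replicas: dict mapping block id -> set of datanodes holding a live replica.
    target:         desired replication factor.
    """
    under = sorted(b for b, nodes in block_replicas.items() if len(nodes) < target)
    over = sorted(b for b, nodes in block_replicas.items() if len(nodes) > target)
    return under, over
```

Blocks in the first list would be scheduled for re-replication and blocks in the second would have surplus replicas removed.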
HDFS is a poor fit for
Low-latency data access
Lots of small files
Transactional access and updates
Multiple writers, arbitrary file modifications
Coherency Model
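Data being written is not guaranteed to be visible to readers while it is being copied
Use sync() to force written data to become visible
Write once, read many
Apply this model in applications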
Command Line
Similar to *nix
 hadoop fs -ls /
 hadoop fs -mkdir /test
 hadoop fs -rmr /test
 hadoop fs -cp /1 /2
 hadoop fs -copyFromLocal /3 hdfs://localhost/
Namenode-specific:
 hadoop namenode -format
 start-all.sh
Command Line
Sorting: Standard method to test cluster
 TeraGen: Generate dummy data
 TeraSort: Sort
 TeraValidate: Validate sort result
Command Line:
 hadoop jar /usr/share/hadoop/hadoop-examples-1.0.3.jar terasort hdfs://ubuntu/10GdataUnsorted /10GDataSorted41
References
Hadoop: The Definitive Guide, Third Edition by Tom White.
http://hadoop.apache.org/
http://www.cloudera.com/
https://developer.yahoo.com/hadoop/tutorial/
Hadoop HDFS Architecture and Design

Slip-and-fall Injuries: Top Workers' Comp ClaimsSlip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp Claims
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPsWebinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
 
How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?How can I successfully sell my pi coins in Philippines?
How can I successfully sell my pi coins in Philippines?
 
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictSupply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
Pre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptxPre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptx
 

Hadoop HDFS Architecture and Design

  • 1. HDFS – Hadoop Distributed File System
  • 2. Agenda: HDFS – Hadoop Distributed File System. HDFS design and goals. HDFS components: Namenode, Datanode, Secondary Namenode. HDFS blocks and replication. Anatomy of a file read/write in HDFS.
  • 3. HDFS: Designed to reliably store very large files across machines in a large cluster. Data model: data is organized into files and directories. In the storage layer, files are divided into uniform-sized blocks (64 MB, 128 MB, 256 MB) and distributed across cluster nodes. Blocks are replicated to handle hardware failure. The file system keeps checksums of data for corruption detection and recovery. HDFS exposes block placement so that computation can be migrated to the data. HDFS is a file system designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware.
  • 4. HDFS (continued): HDFS is a file system rather than just a storage layer. HDFS follows many POSIX file system conventions, although it is not fully POSIX-compliant: file, directory and sub-directory structure; permissions (rwx); access classes (owner, group, other) and the concept of a superuser. Hadoop provides many interfaces to its filesystems, and it generally uses the URI scheme to pick the correct filesystem instance to communicate with. Fault-tolerant, scalable, distributed storage system.
  • 5. HDFS Design Goals: Very large distributed file system: 10K nodes, 100 million files and 10 PB are all feasible. Computation moved to data: data locations are exposed so that computations can move to where the data resides. Streaming data access. Assumes commodity hardware: files are replicated to handle hardware failure, and HDFS detects failures and recovers from them. Optimized for batch processing: provides very high aggregate bandwidth. High throughput of data access: streaming access to data. Large files: a typical file is gigabytes to terabytes in size, with support for tens of millions of files. Simple coherency: write-once-read-many access model. Highly fault-tolerant: runs on commodity hardware, which can fail frequently.
  • 6. HDFS Components: Master-slave architecture. HDFS master, the “Namenode”: manages the file system namespace; controls read/write access to files; manages block replication; checkpoints the namespace and journals namespace changes for reliability. HDFS workers, the “Datanodes”: serve read/write requests from clients; perform replication tasks upon instruction by the Namenode; report blocks and system state. HDFS namespace backup: the “Secondary Namenode”.
  • 9. Distributed file system: Single namespace for the entire cluster. Files are broken up into a sequence of blocks: for a given file, all blocks except the last are the same size, and each block is replicated on multiple DataNodes. Data coherency: write-once-read-many access model; a client can only append to files, and random in-place changes are not allowed. Intelligent client: the client can find the location of blocks and accesses data directly from the DataNode [Q: How is this possible?]. User data never flows through the NameNode.
  • 10. Distributed file system (continued)
  • 11. HDFS Blocks: HDFS has a large block size. Default: 64 MB. Typical: 128 MB, 256 MB, 512 MB… Blocks in a normal filesystem are a few kilobytes. Unlike in a file system for a single disk, a file in HDFS that is smaller than a single block does not occupy a full block: for example, a 10 MB file needs only 10 MB of space on the local drive, not the space of a full block. A file is stored in blocks on various nodes in the Hadoop cluster. HDFS provides a complete abstraction of this to the client.
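The block arithmetic on this slide can be sketched in a few lines. This is only an illustration, not HDFS code: the function name `split_into_blocks` is made up here, and the 64 MB default is simply the figure quoted on the slide.

```python
MB = 1024 * 1024

def split_into_blocks(file_size, block_size=64 * MB):
    """Return the size in bytes of each block needed for a file of file_size bytes.

    All blocks except possibly the last are block_size bytes; the last block
    holds only the remainder, so a small file does not use a full block.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes

# A 200 MB file with 64 MB blocks: three full blocks and one 8 MB block.
print([size // MB for size in split_into_blocks(200 * MB)])  # [64, 64, 64, 8]
# A 10 MB file occupies a single 10 MB block.
print([size // MB for size in split_into_blocks(10 * MB)])   # [10]
```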
  • 12. HDFS Blocks (continued): HDFS block placement. HDFS creates several replicas of each data block. Every data block is replicated to multiple nodes across the cluster. A BlockReport lists all the blocks on a Datanode.
  • 13. Block Placements: The default is 3 replicas, but this is configurable. Blocks are placed as follows (writes are pipelined; rack awareness is covered in a later slide): the first replica on the same node as the writer, the second on a node in a different rack, and the third on that other rack as well. Clients read from the closest replica. If the replication for a block drops below the target, it is automatically re-replicated. The policy for placing block replicas is pluggable.
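One reading of the placement order on this slide can be sketched as follows. This is an assumption-laden illustration: `place_replicas` and the `topology` dictionary are hypothetical, the third replica is assumed to go on a different node of the same remote rack as the second, and nodes are picked in sorted order for reproducibility where a real cluster would choose among candidates.

```python
def place_replicas(writer_node, topology):
    """Pick three nodes for a block's replicas.

    topology maps a rack name to the list of node names in that rack.
    Assumed policy: replica 1 on the writer's node, replica 2 on a node in
    a different rack, replica 3 on another node of that same remote rack.
    """
    local_rack = next(rack for rack, nodes in topology.items() if writer_node in nodes)
    # Deterministic choice: first remote rack (by name) with at least two nodes.
    remote_rack = next(
        rack for rack, nodes in sorted(topology.items())
        if rack != local_rack and len(nodes) >= 2
    )
    second, third = sorted(topology[remote_rack])[:2]
    return [writer_node, second, third]

topology = {"rack1": ["n1", "n2"], "rack2": ["n3", "n4"]}
print(place_replicas("n1", topology))  # ['n1', 'n3', 'n4']
```

With this layout the block survives the loss of a whole rack, while only one copy has to cross the inter-rack switch during the write.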
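  • 14. Why are blocks in HDFS so large? To minimize the cost of seeks: make the time to transfer a block much larger than the seek time, so that data is read at close to the disk transfer rate. Blocks are designed to be processed independently.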
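A small calculation makes the seek-cost argument concrete. The numbers below (10 ms seek time, 100 MB/s transfer rate, seek time limited to 1% of transfer time) are illustrative assumptions, not measurements from the slides, and `min_block_size` is a name made up for this sketch.

```python
def min_block_size(seek_time_s, transfer_rate_bytes_per_s, seek_fraction):
    """Smallest block size (bytes) for which seeking costs at most
    seek_fraction of the time spent transferring the block.

    transfer_time = block_size / transfer_rate, and we require
    seek_time <= seek_fraction * transfer_time.
    """
    return seek_time_s * transfer_rate_bytes_per_s / seek_fraction

# Assumed figures: 10 ms seek, 100 MB/s disk, seek time at most 1% of transfer time.
size = min_block_size(0.010, 100e6, 0.01)
print(round(size / 1e6), "MB")  # 100 MB
```

Under these assumptions the block has to be on the order of 100 MB, which is why HDFS blocks are tens to hundreds of megabytes rather than kilobytes.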
  • 15. Benefits of the block abstraction: A file can be larger than any single disk in the network. It simplifies the storage subsystem. It provides fault tolerance and availability. It enables independent processing, failover and distributed computation. [Q: How is it independent?]
  • 16. Data Correctness: Data blocks are checked with CRC32. File creation: the client computes a checksum per 512 bytes, and the DataNode stores the checksums. File access: the client retrieves the data and checksums from the DataNode; if validation fails, the client reports it and tries other replicas. Corrupted blocks are reported and tolerated. Each block replica on a DataNode is represented by two files in the local native filesystem. The first file contains the data itself and the second file records the block's metadata, including checksums for the data and the generation stamp.
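The per-512-byte checksum scheme can be sketched with Python's standard `zlib.crc32`. This is an illustration of the idea only: the helper names are invented here, and the exact CRC variant and on-disk layout used by a given HDFS version may differ.

```python
import zlib

BYTES_PER_CHECKSUM = 512  # chunk size quoted on the slide

def chunk_checksums(data, chunk_size=BYTES_PER_CHECKSUM):
    """Compute one CRC32 value per chunk_size bytes of data."""
    return [
        zlib.crc32(data[offset:offset + chunk_size])
        for offset in range(0, len(data), chunk_size)
    ]

def first_corrupt_chunk(data, expected, chunk_size=BYTES_PER_CHECKSUM):
    """Return the index of the first chunk whose checksum does not match,
    or None if every chunk validates."""
    actual = chunk_checksums(data, chunk_size)
    if len(actual) != len(expected):
        return min(len(actual), len(expected))
    for index, (got, want) in enumerate(zip(actual, expected)):
        if got != want:
            return index
    return None

# Writer side: compute checksums for 1280 bytes (three chunks).
original = bytes(range(256)) * 5
stored_checksums = chunk_checksums(original)

# Reader side: one flipped byte at offset 600 lands in chunk 1.
damaged = bytearray(original)
damaged[600] ^= 0xFF
print(first_corrupt_chunk(original, stored_checksums))        # None
print(first_corrupt_chunk(bytes(damaged), stored_checksums))  # 1
```

Checksumming small chunks rather than whole blocks lets a reader pinpoint the damaged region and lets validation proceed as data streams in.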
  • 17. Data Integrity: Data integrity is maintained at the block level. [Q: Why at the block level and not at the file level?] The client copies data along with its checksums; when reading, the client computes the checksum of every block and verifies that the corresponding checksums match. If they do not match, the client can retrieve the block from another replica. The corrupt block is deleted and a new replica is created. Checksums are verified after each operation. What if a block is not accessed for a long time? Corruption could go unnoticed, so blocks are also checked periodically.
  • 18. Name Node: An HDFS cluster consists of a single Namenode, a master server that manages the file system namespace and regulates access to files by clients. The HDFS file system namespace is stored and maintained by the Namenode. The Namenode maintains metadata in two binary files in its storage directory: edits and fsimage. The Namenode also maintains a ‘namespace ID’, which is persistently stored on all nodes of the cluster. The namespace ID is assigned to the filesystem instance when it is formatted.
  • 19. NameNode Metadata: The HDFS namespace is a hierarchy of files and directories. Files and directories are represented on the NameNode by inodes. Inodes record attributes like permissions, modification and access times, and namespace and disk space quotas. Metadata in memory: the entire metadata is kept in main memory for fast access, with no demand paging and no I/O wait for metadata. Metadata content: hierarchical file system with directories and files; list of blocks for each file; file attributes, e.g. access time and replication factor. For improved durability, redundant copies of the checkpoint and journal are typically stored on multiple independent local volumes and on remote NFS servers.
  • 20. NameNode Metadata (continued): The inode structure and the list of blocks that define the metadata are stored in the file fsimage. The ‘fsimage’ file is a persistent checkpoint of the filesystem metadata. It is loaded at Namenode start-up. The NameNode records changes to HDFS in a write-ahead log called the ‘Transaction Log’, kept in its local native filesystem in the file edits. Each transaction is recorded in the Transaction Log, and the journal file is flushed and synced before the acknowledgment is sent to the client. The locations of block replicas are not part of the persistent checkpoint. The checkpoint file is never changed by the NameNode. (Checkpoints and the Secondary Namenode are covered in later topics.)
  • 21. Name node functions [what, how and where]: Maintains the namespace tree. Maintains the mapping of datanodes to lists of blocks. Receives heartbeats and monitors datanode health. Replicates missing blocks. Records file system changes. Handles authorization and authentication.
  • 22. Data Node: The DataNode is a slave daemon that performs the grunt work of the distributed filesystem: reading and writing HDFS blocks to actual files on the local filesystem. The ‘slave’ stores data in files in its local file system. A Datanode has no knowledge of the HDFS filesystem as a whole. It stores each block of HDFS data in a separate file. Clients access the blocks directly from datanodes after communicating with the namenode. Blocks are stored as files of the underlying OS. The Datanode does not create all files in the same directory; it uses a heuristic for the optimal number of files per directory and creates subdirectories as needed.
  • 23. Data Node functions [read, write, report]: During startup each DataNode connects to the NameNode and performs a handshake. The purpose of the handshake is to verify the namespace ID and the software version of the DataNode. If either does not match that of the NameNode, the DataNode automatically shuts down. The DataNode serves read and write requests, and performs block creation, deletion and replication upon instruction from the Namenode. It periodically sends heartbeats and block reports to the Namenode. A DataNode identifies the block replicas in its possession to the NameNode by sending a block report. A block report contains the block ID, the generation stamp and the length of each block replica the server hosts.
  • 24. Data node heartbeats: During normal operation DataNodes send heartbeats to the NameNode to confirm that the DataNode is operating and the block replicas it hosts are available. Heartbeats from a DataNode also carry information about total storage capacity, fraction of storage in use, the number of data transfers currently in progress, and so on. These statistics are used for the NameNode's block allocation and load balancing decisions. The NameNode does not directly send requests to DataNodes. It uses replies to heartbeats to send instructions to the DataNodes. The instructions include commands to replicate blocks to other nodes, remove local block replicas, re-register and send an immediate block report, and shut down the node. To maintain overall system integrity it is critical to keep heartbeats frequent even on big clusters. The NameNode can process thousands of heartbeats per second without affecting other NameNode operations.
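The "instructions ride on heartbeat replies" idea can be sketched as a tiny in-memory model. Everything here is hypothetical (the class `NameNodeSketch`, its method names and the command strings); it only illustrates that the NameNode queues commands and hands them over when the DataNode next reports in.

```python
from collections import deque

class NameNodeSketch:
    """Toy model: commands for a DataNode wait until its next heartbeat."""

    def __init__(self):
        self.pending = {}  # datanode id -> deque of queued commands
        self.stats = {}    # datanode id -> last reported statistics

    def schedule(self, datanode_id, command):
        """Queue a command; it is NOT pushed to the DataNode directly."""
        self.pending.setdefault(datanode_id, deque()).append(command)

    def heartbeat(self, datanode_id, capacity, used, active_transfers):
        """Record the reported statistics and reply with queued commands."""
        self.stats[datanode_id] = {
            "capacity": capacity,
            "used_fraction": used / capacity,
            "active_transfers": active_transfers,
        }
        return list(self.pending.pop(datanode_id, ()))

namenode = NameNodeSketch()
namenode.schedule("dn1", "replicate blk_7 to dn2")
print(namenode.heartbeat("dn1", capacity=100, used=25, active_transfers=0))
# ['replicate blk_7 to dn2']
print(namenode.heartbeat("dn1", capacity=100, used=25, active_transfers=0))
# []
```

Because the DataNode always initiates the exchange, the NameNode never has to open connections to thousands of workers.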
  • 25. Secondary Name Node: The Secondary NameNode (SNN) is an assistant daemon for monitoring and storing (backing up) the state of the cluster's HDFS. The SNN communicates with the NameNode to take snapshots of the HDFS metadata at intervals defined by the cluster configuration. The NameNode is a single point of failure for a Hadoop cluster, and the SNN snapshots help minimize downtime and loss of data. Nevertheless, a NameNode failure requires human intervention to reconfigure the cluster to use the SNN as the primary NameNode.
  • 26. Secondary Namenode interaction with Namenode: The SNN periodically merges the namespace image with the edit log to prevent the edit log from becoming too large. (Checkpointing and recovery are covered in later topics.)
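The merge of namespace image and edit log can be sketched with plain dictionaries. This is a simplified model under stated assumptions: the namespace is a dict from path to block list, the three edit operations are invented for the example, and real fsimage and edits files are binary formats that this sketch does not attempt to reproduce.

```python
def apply_edit(namespace, edit):
    """Apply one logged change to an in-memory namespace (path -> block list)."""
    operation, path, *args = edit
    if operation == "create":
        namespace[path] = []
    elif operation == "add_block":
        namespace[path].append(args[0])
    elif operation == "delete":
        del namespace[path]
    else:
        raise ValueError(f"unknown operation: {operation}")

def checkpoint(fsimage, edits):
    """Merge the edit log into a copy of the image.

    Returns the new image and an empty edit log; the old image is left
    untouched, mirroring the rule that a checkpoint file is never modified.
    """
    new_image = {path: list(blocks) for path, blocks in fsimage.items()}
    for edit in edits:
        apply_edit(new_image, edit)
    return new_image, []

old_image = {"/logs/a": ["blk_1"]}
edit_log = [("create", "/logs/b"), ("add_block", "/logs/b", "blk_2"), ("delete", "/logs/a")]
new_image, new_log = checkpoint(old_image, edit_log)
print(new_image)  # {'/logs/b': ['blk_2']}
print(new_log)    # []
```

Replaying a short log on top of a recent image is what keeps NameNode start-up time bounded.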
  • 27. Anatomy of a File Read in HDFS: One important aspect of this design is that the client contacts datanodes directly to retrieve data and is guided by the namenode to the best datanode for each block. There is a direct connection between client and datanode. On failure, the client moves to the next 'closest' node holding the block.
  • 28. Anatomy of a File Read in HDFS (continued): 1. The client connects to the NameNode with the file name. 2. The namenode performs various checks to make sure the file exists, the client has the right permissions, etc. 3. The namenode returns the addresses of the datanodes that have a copy of each block (locality is considered). 4. For each block, the returned datanodes are ordered by proximity to the client; assuming a replication level of three, there are three candidate datanodes per block. 5. The client connects to the first (closest) datanode for the first block in the file and reads it, then finds the best datanode for the next block, and so on until the whole file is read. 6. The client verifies checksums for the data transferred to it from each datanode. 7. During reading, if the client encounters an error while communicating with a datanode, or finds a corrupted block, it tries the next closest datanode for that block.
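The fallback behaviour in the last steps of the read path can be sketched as a loop over replicas ordered closest first. The names below (`read_block`, the `fetch` callables, the datanode labels) are hypothetical, and for brevity the sketch uses one CRC32 per block rather than one per 512 bytes.

```python
import zlib

def read_block(replicas, expected_crc):
    """Read one block, trying replicas in order of proximity.

    replicas is a list of (datanode_name, fetch) pairs sorted closest first;
    fetch() returns the block bytes or raises OSError if the node is
    unreachable. A replica whose checksum does not match is skipped.
    """
    for datanode_name, fetch in replicas:
        try:
            data = fetch()
        except OSError:
            continue  # communication error: try the next closest replica
        if zlib.crc32(data) == expected_crc:
            return datanode_name, data
        # checksum mismatch: treat as corrupt and move on
    raise IOError("no valid replica available for this block")

def unreachable():
    raise OSError("connection refused")

good = b"block contents"
replicas = [
    ("dn1", unreachable),               # closest node is down
    ("dn2", lambda: b"block c0ntents"), # next node holds a corrupt copy
    ("dn3", lambda: good),              # farthest node has a valid copy
]
print(read_block(replicas, zlib.crc32(good)))  # ('dn3', b'block contents')
```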
  • 29. Anatomy of a File Read
  • 30. Anatomy of a File Write in HDFS
  • 31. Anatomy of a File Write in HDFS (continued): 1. The client connects to the NameNode with the file name. 2. The namenode performs various checks to make sure the file doesn’t already exist, the client has the right permissions, etc. 3. The NameNode places an entry for the file in its metadata and returns the block name and a list of DataNodes to the client. 4. The list of datanodes forms a pipeline; we’ll assume the replication level is three, so there are three nodes in the pipeline. 5. The client connects to the first DataNode and starts sending data. As data is received by the first DataNode, it connects to the second and starts forwarding the data; similarly, the second datanode stores each packet and forwards it to the third datanode in the pipeline. 6. A packet is removed from the ack queue only when it has been acknowledged by all the datanodes in the pipeline. 7. If a datanode fails while data is being written to it, the partial block on the failed datanode will be deleted if that datanode recovers later on. 8. The client reports to the NameNode when the block is written.
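The ack-queue rule in step 6 can be sketched with a toy pipeline. The dictionaries standing in for datanodes and the function `write_packets` are made up for this example; the sketch also leaves out what a real client does next on failure (rebuilding the pipeline and resending unacknowledged packets).

```python
from collections import deque

def write_packets(packets, pipeline):
    """Send packets through a pipeline of datanodes and return the
    sequence numbers still waiting in the ack queue.

    Each datanode is a dict with "alive" and "stored" keys. A node stores
    the packet and forwards it to the next node; a dead node breaks the
    chain. A packet leaves the ack queue only if every node acknowledged it.
    """
    ack_queue = deque()
    for sequence_number, packet in enumerate(packets):
        ack_queue.append(sequence_number)
        acks = 0
        for node in pipeline:
            if not node["alive"]:
                break  # forwarding stops at a failed datanode
            node["stored"].append(packet)
            acks += 1
        if acks == len(pipeline):
            ack_queue.remove(sequence_number)
    return list(ack_queue)

def new_pipeline():
    return [{"name": name, "alive": True, "stored": []} for name in ("dn1", "dn2", "dn3")]

healthy = new_pipeline()
print(write_packets([b"p0", b"p1"], healthy))  # []

broken = new_pipeline()
broken[2]["alive"] = False
print(write_packets([b"p0", b"p1"], broken))   # [0, 1]
```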
  • 32.
  • 33. Replication and rack awareness: Replication in Hadoop is at the block level. The default replication factor is 3 and is configurable. Blocks are replicated for fault tolerance. A file’s replication factor is configurable per file and can be changed dynamically. Rack-aware replica placement: the goal is to improve reliability, availability and network bandwidth utilization. There are many racks, and communication between racks goes through switches. Network bandwidth between machines on the same rack is greater than between machines on different racks.
  • 34. Replication and rack awareness (continued): The Namenode determines the rack ID for each DataNode.
  • 35. Replication and rack awareness (continued): Replicas are placed: one on a node in the local rack, one on a different node in the local rack and one on a node in a different rack. One third of the replicas are on one node, two thirds are on one rack, and the remaining third are distributed evenly across the remaining racks. Replica selection for READ operations: HDFS tries to minimize bandwidth consumption and latency. Selection of blocks to process in a MapReduce job takes advantage of rack awareness. Rack awareness is NOT automatic and needs to be configured. By default, all nodes are assumed to be in the same rack.
  • 36. Block Re-replication: The need for re-replication may arise because a Datanode becomes unavailable, a replica becomes corrupted, a hard disk on a Datanode fails, or the replication factor of the block is increased. Block under-replication and over-replication are detected by the Namenode. The Balancer application rebalances blocks to even out datanode utilization. (The balancer is covered in a later topic.)
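How under- and over-replication might be detected can be sketched from a map of block locations and a set of live datanodes. This is an illustrative model only: `classify_blocks` and its data structures are hypothetical and do not reflect the NameNode's internal bookkeeping.

```python
def classify_blocks(block_locations, live_nodes, target=3):
    """Compare each block's live replica count with the target.

    Returns two dicts: blocks that need more replicas (block -> number
    missing) and blocks with too many (block -> number of extras).
    Replicas on nodes that are not in live_nodes do not count.
    """
    under_replicated = {}
    over_replicated = {}
    for block, nodes in block_locations.items():
        live_count = sum(1 for node in nodes if node in live_nodes)
        if live_count < target:
            under_replicated[block] = target - live_count
        elif live_count > target:
            over_replicated[block] = live_count - target
    return under_replicated, over_replicated

locations = {
    "blk_1": ["dn1", "dn2", "dn3"],
    "blk_2": ["dn1", "dn4"],               # dn4 is down in this example
    "blk_3": ["dn1", "dn2", "dn3", "dn5"],
}
live = {"dn1", "dn2", "dn3", "dn5"}
print(classify_blocks(locations, live))
# ({'blk_2': 2}, {'blk_3': 1})
```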
  • 37. HDFS is a poor fit for: low-latency data access; lots of small files; transactional access and updates; multiple writers and arbitrary file modifications.
  • 38. Coherency Model: Data being written is not guaranteed to be visible to readers while it is being copied; use sync() to force visibility. Write once, read many. Take this into account in applications.
  • 39. Command Line: Similar to *nix:
      hadoop fs -ls /
      hadoop fs -mkdir /test
      hadoop fs -rmr /test
      hadoop fs -cp /1 /2
      hadoop fs -copyFromLocal /3 hdfs://localhost/
    Namenode-specific:
      hadoop namenode -format
      start-all.sh
  • 40. Command Line: Sorting is the standard method to test a cluster. TeraGen: generate dummy data. TeraSort: sort. TeraValidate: validate the sort result. Command line:
      hadoop jar /usr/share/hadoop/hadoop-examples-1.0.3.jar terasort hdfs://ubuntu/10GdataUnsorted /10GDataSorted41
  • 41. References Hadoop: The Definitive Guide, Third Edition by Tom White. http://hadoop.apache.org/ http://www.cloudera.com/ https://developer.yahoo.com/hadoop/tutorial/