HADOOP DISTRIBUTED FILE SYSTEM
PRESENTED BY:
Koushik Mondal
B.Tech
Information Technology
3rd Year, 6th Semester
Roll no- 16900215021
Registration no- 151690110147
Index
1. Hadoop
2. Hadoop Components
3. Distributed File System
4. Main Components of HDFS
5. HDFS Architecture
6. Anatomy of a File Read in HDFS
7. Anatomy of a File Write in HDFS
8. Basic Commands in HDFS
9. Conclusion
10. References
WHAT IS HADOOP?
• Hadoop is a framework for storing and processing Big Data in a distributed environment, across clusters of computers, using simple programming models.
• It is open-source software, so it is freely available and can be configured to suit our requirements.
HADOOP COMPONENTS
• Hadoop Distributed File System (HDFS)
  ◦ Used to store Big Data
• MapReduce
  ◦ Used to process Big Data
DISTRIBUTED FILE SYSTEM
• A Distributed File System (DFS) is a file system that allows multiple hosts to access shared files over a computer network.
MAIN COMPONENTS OF HDFS
NameNode:
• Master of the system
• Maintains the file system metadata and manages the blocks stored on the DataNodes
DataNode:
• Slave (worker) daemons deployed on each machine, providing the actual storage
• Responsible for serving read and write requests from clients
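The division of labour between the two components can be sketched as a toy in-memory model. This is only an illustration, not Hadoop's implementation: the class names, method names, tiny block size, and round-robin block assignment are all assumptions made for the example. The point it shows is that the NameNode holds only metadata (which blocks make up a file and where they live), while the DataNodes hold the bytes.

```python
from dataclasses import dataclass, field

# Tiny block size for illustration only; real HDFS blocks are far larger.
BLOCK_SIZE = 4


@dataclass
class DataNode:
    """Hypothetical worker that stores block contents."""
    name: str
    blocks: dict = field(default_factory=dict)  # block_id -> bytes

    def write_block(self, block_id, data):
        self.blocks[block_id] = data

    def read_block(self, block_id):
        return self.blocks[block_id]


@dataclass
class NameNode:
    """Hypothetical master that stores metadata only, never file contents."""
    metadata: dict = field(default_factory=dict)  # path -> [(block_id, datanode_name)]

    def add_block(self, path, block_id, datanode_name):
        self.metadata.setdefault(path, []).append((block_id, datanode_name))

    def get_block_locations(self, path):
        return list(self.metadata[path])


def write_file(namenode, datanodes, path, data):
    """Split data into blocks, store each on a DataNode, record it on the NameNode."""
    for index, offset in enumerate(range(0, len(data), BLOCK_SIZE)):
        block_id = f"{path}#blk{index}"
        node = datanodes[index % len(datanodes)]  # round-robin: an assumption
        node.write_block(block_id, data[offset:offset + BLOCK_SIZE])
        namenode.add_block(path, block_id, node.name)


def read_file(namenode, datanodes, path):
    """Ask the NameNode where the blocks are, then fetch them from the DataNodes."""
    by_name = {node.name: node for node in datanodes}
    locations = namenode.get_block_locations(path)
    return b"".join(by_name[name].read_block(block_id) for block_id, name in locations)


namenode = NameNode()
datanodes = [DataNode("dn1"), DataNode("dn2")]
write_file(namenode, datanodes, "/home/foo/data", b"hello hdfs")
print(namenode.get_block_locations("/home/foo/data"))
print(read_file(namenode, datanodes, "/home/foo/data"))
```

The sketch leaves out replication, failures, and networking; it only separates "who knows where the blocks are" from "who stores the blocks".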
HDFS ARCHITECTURE
[Figure: HDFS Architecture. Labels: NameNode holding metadata (Name, Replicas, …: /home/foo/data, …); Clients performing metadata operations, reads, and writes; DataNodes holding blocks in Rack 1 and Rack 2; block replication between DataNodes.]
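The figure shows blocks replicated on DataNodes in more than one rack. A minimal sketch of that idea follows, under a simplified placement rule: one replica in the writer's rack and the rest in a single other rack. The function name, the rack and node names, and the rule itself are assumptions for illustration; the actual HDFS placement policy is configurable and more involved. The default of three replicas mirrors the usual HDFS replication factor.

```python
def place_replicas(nodes_by_rack, writer_rack, replication=3):
    """Choose DataNodes for one block under a simplified rack-aware rule.

    nodes_by_rack: dict mapping a rack name to a list of DataNode names.
    writer_rack:   rack of the client or node writing the block.
    """
    local_nodes = nodes_by_rack.get(writer_rack, [])
    remote_racks = [rack for rack in sorted(nodes_by_rack) if rack != writer_rack]
    if not local_nodes:
        raise ValueError("no DataNode available in the writer's rack")

    # First replica stays close to the writer.
    chosen = [local_nodes[0]]

    # Remaining replicas go to another rack, so losing one rack does not lose the block.
    if replication > 1:
        if not remote_racks:
            raise ValueError("a second rack is needed for off-rack replicas")
        chosen.extend(nodes_by_rack[remote_racks[0]][:replication - 1])

    if len(chosen) < replication:
        raise ValueError("not enough DataNodes for the requested replication")
    return chosen


racks = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
print(place_replicas(racks, writer_rack="rack1"))
```

Spreading replicas over two racks trades some write bandwidth between racks for the ability to survive the loss of a whole rack.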
ANATOMY OF A FILE READ IN HDFS
[Figure: Anatomy of a file read in HDFS]
ANATOMY OF A FILE WRITE IN HDFS
[Figure: Anatomy of a file write in HDFS]
BASIC COMMANDS IN HDFS
• To start the Hadoop background processes
• To create a directory in HDFS
• To display the list of files in HDFS
• To copy a file within HDFS
• To move a file within HDFS
• To load a file from the local file system (LFS) into HDFS
• To copy a file from HDFS to the LFS
• To check the health of a directory
• To check the cluster balance
• To count directories and files
• To show the contents of a file
• To delete a file in HDFS
• To delete a directory in HDFS
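The tasks listed above map onto commands along the following lines. This is a sketch, assuming the `hdfs` command-line tool and `start-dfs.sh` script from Hadoop 2.x or later; the slides may have used the older `hadoop fs` or `hadoop dfs` forms. All paths and file names are hypothetical. The script only builds and prints the command lines; it does not run them, so no cluster is needed to try it.

```python
import shlex

# Hypothetical paths, used only for illustration.
HDFS_DIR = "/user/demo"
HDFS_FILE = "/user/demo/data.txt"
HDFS_COPY = "/user/demo/backup.txt"
LOCAL_FILE = "data.txt"

# Task description -> argument vector for the corresponding command.
COMMANDS = {
    "start HDFS daemons": ["start-dfs.sh"],
    "create a directory": ["hdfs", "dfs", "-mkdir", HDFS_DIR],
    "list files": ["hdfs", "dfs", "-ls", HDFS_DIR],
    "copy a file within HDFS": ["hdfs", "dfs", "-cp", HDFS_FILE, HDFS_COPY],
    "move a file within HDFS": ["hdfs", "dfs", "-mv", HDFS_COPY, HDFS_DIR + "/moved.txt"],
    "load a file from LFS to HDFS": ["hdfs", "dfs", "-put", LOCAL_FILE, HDFS_DIR],
    "copy a file from HDFS to LFS": ["hdfs", "dfs", "-get", HDFS_FILE, LOCAL_FILE],
    "check health": ["hdfs", "fsck", HDFS_DIR],
    # Note: this command rebalances blocks across DataNodes rather than only reporting.
    "balance the cluster": ["hdfs", "balancer"],
    "count directories and files": ["hdfs", "dfs", "-count", HDFS_DIR],
    "show file contents": ["hdfs", "dfs", "-cat", HDFS_FILE],
    "delete a file": ["hdfs", "dfs", "-rm", HDFS_FILE],
    "delete a directory": ["hdfs", "dfs", "-rm", "-r", HDFS_DIR],
}


def command_line(task):
    """Return the shell-quoted command line for a task."""
    return shlex.join(COMMANDS[task])


for task in COMMANDS:
    print(f"{task}: {command_line(task)}")
```

To run one of these against a real cluster, the argument vector could be passed to `subprocess.run(COMMANDS[task], check=True)` on a machine where the Hadoop binaries are on the `PATH`.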
CONCLUSION
• Hadoop has been a very effective solution for companies dealing with data at petabyte scale.
• It has solved many industry problems related to large-scale data management and distributed systems.
• Because it is open source, it has been widely adopted by companies.
REFERENCES
• BOOK:
  ◦ Hadoop: The Definitive Guide, 4th Edition, by Tom White
• LINKS:
  ◦ http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
  ◦ https://developer.yahoo.com/hadoop/tutorial/module2.html
  ◦ https://www.edureka.co/blog/apache-hadoop-hdfs-architecture/
