Hadoop HDFS

Distributed File System

Published in: Technology

  1. BE Seminar Topic: Hadoop HDFS. Presented by: Sudipta Ghosh (2010-3016), University Institute of Technology, Burdwan University
  2. What is Big Data?
  3. Why DFS?
  4. What is Hadoop? Apache Hadoop is a framework that allows for the distributed processing of large data sets across clusters of commodity computers using a simple programming model.
  5. Companies using Hadoop: Yahoo, Google, Facebook, Amazon, AOL, IBM, LinkedIn, and many more listed at http://wiki.apache.org/hadoop/PoweredBy
  6. What is HDFS?
     • HDFS stands for Hadoop Distributed File System.
     • HDFS is a file system designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware.
  7. Why HDFS?
     • Highly fault-tolerant
     • High throughput
     • Suitable for applications with large data sets
     • Streaming access to file system data
     • Can be built out of commodity hardware
  8. Main Components of Hadoop:
     Name Node:
       • Master of the system
       • Maintains and manages the blocks that are stored on the Data Nodes
     Data Nodes:
       • Slaves deployed on each machine, providing the actual storage
       • Responsible for serving read and write requests from clients
  9. HDFS Architecture
  10. Job Tracker
  11. Job Tracker (Contd.)
  12. Running Jobs on Hadoop
  13. Areas Where Hadoop Is Not a Good Fit Today:
     • Low-latency data access
     • Lots of small files
     • Multiple writers, arbitrary file modifications
  14. References
     1. http://hadoop.apache.org/core/docs
     2. https://www.youtube.com/watch?v=A02SRdyoshM
  15. Thank You
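
The Name Node / Data Node split described on slide 8 can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not Hadoop's implementation: the class name ToyNameNode, the round-robin replica placement, the 128 MiB block size, and the replication factor of 3 are all chosen for illustration (the slides give none of these values, and real HDFS uses a rack-aware placement policy).

```python
from dataclasses import dataclass, field
from itertools import count

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size (128 MiB), not from the slides
REPLICATION = 3                 # assumed replication factor, not from the slides


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for the Name Node: it holds metadata only, never file data."""
    data_nodes: list
    block_map: dict = field(default_factory=dict)  # block id -> data node names
    files: dict = field(default_factory=dict)      # path -> list of block ids
    _ids: count = field(default_factory=count)     # source of fresh block ids

    def add_file(self, path, size_bytes):
        """Split a file into blocks and assign each block to REPLICATION data nodes."""
        n_blocks = -(-size_bytes // BLOCK_SIZE)  # ceiling division
        block_ids = []
        for _ in range(n_blocks):
            block_id = next(self._ids)
            # Round-robin placement: a simplification for this sketch.
            start = block_id % len(self.data_nodes)
            replicas = [self.data_nodes[(start + i) % len(self.data_nodes)]
                        for i in range(REPLICATION)]
            self.block_map[block_id] = replicas
            block_ids.append(block_id)
        self.files[path] = block_ids
        return block_ids

    def locate(self, path):
        """Return, for each block of the file, the data nodes a client could read from."""
        return [self.block_map[b] for b in self.files[path]]


if __name__ == "__main__":
    nn = ToyNameNode(data_nodes=["dn1", "dn2", "dn3", "dn4"])
    nn.add_file("/logs/big.log", 300 * 1024 * 1024)  # one large file
    for i in range(3):
        nn.add_file(f"/tmp/small-{i}", 1024)         # three tiny files
    print(nn.locate("/logs/big.log"))
    print(len(nn.block_map), "block entries tracked by the name node")
```

In this sketch, one 300 MiB file and three 1 KiB files each cost the name node three block entries. That hints at why slide 13 lists "lots of small files" as a poor fit: the master's metadata grows with the number of files and blocks, not with the amount of data stored.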
