Apache Hadoop HDFS

A short presentation to describe Apache HDFS

Published in: Technology, Education

  1. Apache Hadoop HDFS
     ● What is it?
     ● What is it for?
     ● Architecture
     ● Resilience
     ● Administration
     ● Data access
     ● Future changes?
  2. HDFS – What is it?
     ● HDFS = Hadoop Distributed File System
     ● It is a distributed file system
     ● Runs on low-cost hardware
     ● It is open source
     ● Written in Java
     ● Fault tolerant
     ● Designed for very large data sets
     ● Tuned for high throughput
  3. HDFS – What is it for?
     ● Designed for batch processing
     ● Streaming access to data
     ● Large data sizes, i.e. terabytes
     ● Highly reliable using data replication
     ● Supports very large node clusters
     ● Supports large files
     ● Supports file counts in the millions
  4. HDFS – Architecture
  5. HDFS – Architecture
     ● Has a master / slave architecture
     ● A master NameNode
       – Controls file system operations
       – Maps data blocks to DataNodes
       – Logs all changes
     ● Slave DataNodes
       – Store file blocks
       – Store replicated data
  6. HDFS – Resilience
     ● Data is replicated across DataNodes
     ● Nodes may fail but data is still available
     ● DataNodes indicate their state via a heartbeat report
     ● Single point of failure in the master NameNode
     ● Data integrity via checksums
  7. HDFS – Administration
     ● Access via Java API
     ● FS shell command language
     ● HTTP browser
     ● C wrapper for Java API
     ● Space reclamation
       – Via control of replication factor
       – Deleted files sent to trash folder
       – Trash folder cleaned after a configurable time
  8. HDFS – Future changes
     Things they might consider for HDFS
     ● File append
     ● User quotas
     ● File links
     ● Standby nodes
  9. Other Areas
     ● Want to know about?
       – Big Data
       – Nutch
       – Solr
     ● See my other presentations
  10. Contact Us
     ● Feel free to contact us at
       – www.semtech-solutions.co.nz
       – info@semtech-solutions.co.nz
     ● We offer IT project consultancy
     ● We are happy to hear about your problems
     ● You pay only for the hours you need to solve your problems
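
The Architecture and Resilience slides describe a NameNode that maps data blocks to DataNodes, replication of each block across several DataNodes, and checksums for data integrity. The Python sketch below is a toy in-memory model of those three ideas, not the HDFS implementation or its API. The class names mirror the slides, but every method, the 8-byte block size, the round-robin placement and the CRC32 checksum are assumptions chosen to keep the example small. It also assumes at least as many DataNodes as replicas, and it routes reads through the NameNode for brevity, whereas in real HDFS the NameNode only supplies block locations.

```python
import zlib

BLOCK_SIZE = 8      # bytes; tiny on purpose so a short string spans several blocks
REPLICATION = 3     # number of copies kept of each block (assumed value)


class DataNode:
    """Toy DataNode: stores blocks in memory together with a checksum."""

    def __init__(self, name):
        self.name = name
        self.alive = True       # stands in for "is still sending heartbeats"
        self.blocks = {}        # block_id -> (data, checksum)

    def store(self, block_id, data):
        self.blocks[block_id] = (data, zlib.crc32(data))

    def read(self, block_id):
        data, checksum = self.blocks[block_id]
        if zlib.crc32(data) != checksum:
            raise IOError(f"checksum mismatch for {block_id} on {self.name}")
        return data


class NameNode:
    """Toy NameNode: splits files into blocks and records where replicas live."""

    def __init__(self, datanodes):
        self.datanodes = datanodes
        self.block_map = {}     # path -> list of (block_id, [DataNode, ...])

    def write(self, path, payload):
        entries = []
        n = len(self.datanodes)
        for i, offset in enumerate(range(0, len(payload), BLOCK_SIZE)):
            block_id = f"{path}#{i}"
            chunk = payload[offset:offset + BLOCK_SIZE]
            # Simple round-robin placement (an assumption of this sketch).
            targets = [self.datanodes[(i + r) % n] for r in range(REPLICATION)]
            for node in targets:
                node.store(block_id, chunk)
            entries.append((block_id, targets))
        self.block_map[path] = entries

    def read(self, path):
        out = bytearray()
        for block_id, replicas in self.block_map[path]:
            for node in replicas:
                if not node.alive:
                    continue            # skip failed nodes
                try:
                    out += node.read(block_id)
                    break               # first healthy replica wins
                except IOError:
                    continue            # corrupt copy: try the next replica
            else:
                raise IOError(f"no healthy replica for {block_id}")
        return bytes(out)
```

With four DataNodes and three replicas per block, writing a file, marking any two nodes as not alive and reading the file back still returns the original bytes, which is the "nodes may fail but data is still available" point from the Resilience slide. A read fails only when every replica of some block is either on a failed node or fails its checksum.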
