HDFS NameNode HA in CDH4


  1. NameNode HA in CDH4 (2012/07/06)
  2. Background
     • Prior to Hadoop 2.0.0, the NameNode was a single point of failure (SPOF) in an HDFS cluster.
  3. Approach and Terminology
     • Initial goal is Active-Standby
     • Terminology
       – Active NN: actively serves read/write operations from clients
       – Standby NN: waits, and becomes active when the Active dies or is unhealthy
       – Hot Standby: the Standby holds almost all of the Active's state and can take over immediately
  4. High-level Architecture
  5. Hardware resources
     • NameNode machines
       – Should have equivalent hardware to each other
     • Shared storage
       – Both NameNodes have read/write access to it
       – Only a single shared directory is supported
         • A high-quality dedicated NAS appliance is recommended
     • A Secondary NameNode is not necessary
  6. Automatic Failover
     • Introduces two new components
       – ZooKeeper
       – ZKFailoverController (abbreviated as ZKFC)
         • It is a ZooKeeper client
         • Each machine that runs a NameNode also runs a ZKFC
         • Responsible for:
           – Health monitoring
           – ZooKeeper session management
           – ZooKeeper-based election
  7. Automatic Failover (diagram; labels only): three ZooKeeper (ZK) nodes; two ZKFCs, each with a ZK session and a heartbeat link to its NN; an Active NN and a Hot Standby NN with a shared dir on NFS; three DataNodes (DN) sending block reports
  8. Appendix
     • High Availability Framework for HDFS NN
       – HDFS-1623
     • HDFS portion of ZK-based FailoverController
       – HDFS-2185
  9. Questions?
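
The three ZKFC responsibilities listed on the Automatic Failover slide (health monitoring, ZooKeeper session management, ZooKeeper-based election) can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the actual ZKFC implementation: `ToyLockService`, `NameNode`, and `ToyFailoverController` are hypothetical names, the lock service is an in-memory stand-in for a ZooKeeper ephemeral lock, and fencing of the previous Active is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


class ToyLockService:
    """Hypothetical in-memory stand-in for ZooKeeper: one lock tied to a session."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None

    def try_acquire(self, session_id: str) -> bool:
        # The first session to ask gets the lock; re-asking by the holder succeeds.
        if self.holder is None:
            self.holder = session_id
        return self.holder == session_id

    def expire_session(self, session_id: str) -> None:
        # Like an ephemeral node, the lock vanishes when its owner's session ends.
        if self.holder == session_id:
            self.holder = None


@dataclass
class NameNode:
    name: str
    healthy: bool = True
    state: str = "standby"


class ToyFailoverController:
    """Hypothetical controller, one per NameNode machine."""

    def __init__(self, nn: NameNode, locks: ToyLockService) -> None:
        self.nn = nn
        self.locks = locks
        self.session_id = f"session-{nn.name}"

    def tick(self) -> None:
        # Health monitoring: an unhealthy NameNode gives up its session.
        if not self.nn.healthy:
            self.locks.expire_session(self.session_id)
            self.nn.state = "standby"
            return
        # Election: whoever holds the lock is Active, everyone else is Standby.
        won = self.locks.try_acquire(self.session_id)
        self.nn.state = "active" if won else "standby"


locks = ToyLockService()
nn1, nn2 = NameNode("nn1"), NameNode("nn2")
controllers = [ToyFailoverController(nn1, locks), ToyFailoverController(nn2, locks)]

for c in controllers:          # first round: nn1 wins the election
    c.tick()
states_before = (nn1.state, nn2.state)

nn1.healthy = False            # simulate a failure of the Active
for c in controllers:          # second round: nn2 takes over
    c.tick()
states_after = (nn1.state, nn2.state)
```

In this sketch the Standby becomes Active only after the failed node's session releases the lock, which mirrors the election idea on the slide; a real deployment would rely on ZooKeeper itself for session expiry and ordering.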
