HDFS NameNode HA in CDH4

Transcript

  • 1. NameNode HA in CDH4 (2012/07/06)
  • 2. Background
      • Prior to Hadoop 2.0.0, the NameNode was a single point of failure (SPOF) in an HDFS cluster.
  • 3. Approach and Terminology
      • The initial goal is an Active-Standby configuration
      • Terminology
          – Active NN: actively serves read/write operations from clients
          – Standby NN: waits, and becomes active when the Active dies or is unhealthy
          – Hot Standby: the Standby holds almost all of the Active's state and can take over immediately
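To make the hot-standby idea concrete, here is a minimal toy model in Python. It is only a sketch: `SharedEdits`, `ToyNameNode`, and their methods are hypothetical names invented for illustration and are not part of any HDFS API. It assumes the Standby keeps its state close to the Active's by replaying operations from a shared log, so that only a short tail remains to replay at takeover.

```python
from dataclasses import dataclass, field


@dataclass
class SharedEdits:
    """Hypothetical stand-in for shared storage: an append-only list of operations."""
    log: list = field(default_factory=list)


@dataclass
class ToyNameNode:
    """Toy NameNode; not an HDFS class."""
    name: str
    shared: SharedEdits
    state: str = "standby"              # "active", "standby", or "dead"
    namespace: set = field(default_factory=set)
    applied: int = 0                    # number of shared edits already replayed

    def mkdir(self, path: str) -> None:
        """Client write: only the Active may serve it."""
        if self.state != "active":
            raise RuntimeError(f"{self.name} is not active and cannot serve writes")
        self.shared.log.append(("mkdir", path))
        self.catch_up()

    def catch_up(self) -> None:
        """Replay edits not yet applied (a hot standby does this continuously)."""
        for op, path in self.shared.log[self.applied:]:
            if op == "mkdir":
                self.namespace.add(path)
        self.applied = len(self.shared.log)

    def transition_to_active(self) -> None:
        self.catch_up()                 # only the tail of the log is left to replay
        self.state = "active"


shared = SharedEdits()
nn1 = ToyNameNode("nn1", shared, state="active")
nn2 = ToyNameNode("nn2", shared)

nn1.mkdir("/a")
nn2.catch_up()                          # Standby tails the log while nn1 is healthy
nn1.mkdir("/b")

nn1.state = "dead"                      # simulate failure of the Active
nn2.transition_to_active()
nn2.mkdir("/c")
print(sorted(nn2.namespace))            # ['/a', '/b', '/c']
```

In this sketch `nn2` had already applied `/a` before the failure, so takeover only needed to replay `/b`; a cold standby would have had to rebuild everything first.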
  • 4. High-level Architecture
  • 5. Hardware resources
      • NameNode machines
          – Should have hardware equivalent to each other
      • Shared storage
          – Both NameNodes need read/write access to it
          – Only a single shared directory is supported
              • A high-quality dedicated NAS appliance is recommended
      • A Secondary NameNode is not necessary
  • 6. Automatic Failover
      • Introduces two new components
          – ZooKeeper
          – ZKFailoverController (abbreviated as ZKFC)
              • It is a ZooKeeper client
              • Each machine that runs a NameNode also runs a ZKFC
              • Responsible for:
                  – Health monitoring
                  – ZooKeeper session management
                  – ZooKeeper-based election
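The three ZKFC responsibilities listed above can be sketched as a small in-memory simulation. This is an illustrative toy, not the real ZKFC or the ZooKeeper client API: `ToyZooKeeper`, `ToyZKFC`, and `tick` are hypothetical names, and the election is assumed to work as a single ephemeral lock that is released when the holder's session ends.

```python
class ToyZooKeeper:
    """In-memory stand-in for a ZooKeeper ensemble holding one ephemeral lock."""

    def __init__(self):
        self.lock_owner = None

    def try_acquire(self, session_id: str) -> bool:
        """Take the lock if it is free; report whether this session holds it."""
        if self.lock_owner is None:
            self.lock_owner = session_id
        return self.lock_owner == session_id

    def expire_session(self, session_id: str) -> None:
        """An ephemeral lock disappears when its owner's session ends."""
        if self.lock_owner == session_id:
            self.lock_owner = None


class ToyZKFC:
    """Toy failover controller running next to one NameNode."""

    def __init__(self, nn_name: str, zk: ToyZooKeeper):
        self.nn_name = nn_name
        self.zk = zk
        self.nn_healthy = True          # result of the local health check
        self.nn_state = "standby"

    def tick(self) -> None:
        """One round: health monitoring, session management, election."""
        if not self.nn_healthy:
            # Unhealthy NameNode: give up the session so the peer can win.
            self.zk.expire_session(self.nn_name)
            self.nn_state = "standby"
            return
        # Healthy NameNode: take part in the election for the lock.
        self.nn_state = "active" if self.zk.try_acquire(self.nn_name) else "standby"


zk = ToyZooKeeper()
a, b = ToyZKFC("nn1", zk), ToyZKFC("nn2", zk)

for zkfc in (a, b):
    zkfc.tick()
print(a.nn_state, b.nn_state)           # active standby

a.nn_healthy = False                    # nn1 fails its health check
for zkfc in (a, b):
    zkfc.tick()
print(a.nn_state, b.nn_state)           # standby active
```

The point of the design is that neither controller talks to the other directly; each one only watches its own NameNode and the shared lock.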
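  • 7. Automatic Failover (diagram: three ZooKeeper servers (ZK); a ZKFC on each NameNode host holding a ZooKeeper session and heartbeating its local NN; an Active NN and a Hot Standby NN sharing a directory on NFS; DataNodes (DN) sending block reports to the NNs)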
  • 8. Appendix
      • High Availability Framework for HDFS NN – HDFS-1623
      • HDFS portion of ZK-based FailoverController – HDFS-2185
  • 9. Questions?