Hadoop HDFS NameNode HA
Published in: Technology



  • 1. Anty Rao, April 10, 2011
  • 2. Outline: architecture of HDFS; available NN HA options
  • 3. HDFS architecture: the NN is a SPOF, so some kind of HA is needed for the NN.
  • 4. NN HA: currently two main HA options are available: AvatarNode (Facebook) and BackupNode (Yahoo!) (available?)
  • 5. AvatarNode
  • 6. AvatarNode (AN)
      • Active-Standby pair: coordinated via ZooKeeper; failover in a few seconds; wrapper over NameNode
      • Active AvatarNode: writes the transaction log to the NFS filer
      • Standby AvatarNode: reads/consumes transactions from the NFS filer; processes all messages from DataNodes; keeps the latest metadata in memory
      • Diagram: the client retrieves block locations from the Primary or Standby; the Active AvatarNode (NameNode) writes transactions to the NFS filer and the Standby AvatarNode (NameNode) reads them; DataNodes send block location messages to both AvatarNodes
  • 7. Four steps to failover
      • Step 1: Wipe the ZooKeeper entry. Clients will know that the failover is in progress. (0 seconds)
      • Step 2: Stop the primary NameNode. The last bits of data are flushed to the transaction log and the node dies. (Seconds)
      • Step 3: Switch the Standby to Primary. It consumes the rest of the transaction log and leaves SafeMode, ready to serve traffic. (Seconds)
      • Step 4: Update the entry in ZooKeeper. All clients waiting for the failover pick up the new connection. (0 seconds)
      • Afterwards: start the first node in Standby mode. (This takes a while, but the cluster is up and running.)
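The four failover steps on slide 7 can be sketched as a small in-memory simulation. This is only an illustrative sketch under stated assumptions, not AvatarNode's actual code: `Registry` stands in for the ZooKeeper entry, `Node` for an AvatarNode, and a plain Python list for the transaction log on shared storage. All class, function, and key names here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for an AvatarNode (not the real class)."""
    name: str
    role: str                    # "primary", "standby", or "stopped"
    applied: int = 0             # shared-log entries already applied in memory
    pending: List[str] = field(default_factory=list)  # edits not yet flushed


class Registry:
    """Hypothetical stand-in for the ZooKeeper entry that clients look up."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._entries[key] = value

    def wipe(self, key: str) -> None:
        self._entries.pop(key, None)

    def get(self, key: str) -> Optional[str]:
        # None means "no primary published": a failover is in progress.
        return self._entries.get(key)


def failover(registry: Registry, shared_log: List[str],
             primary: Node, standby: Node, key: str = "namenode") -> None:
    # Step 1: wipe the entry so clients know a failover is in progress.
    registry.wipe(key)
    # Step 2: stop the primary; its last edits are flushed to the shared log.
    shared_log.extend(primary.pending)
    primary.pending.clear()
    primary.role = "stopped"
    # Step 3: the standby consumes the rest of the log and becomes primary.
    standby.applied = len(shared_log)
    standby.role = "primary"
    # Step 4: publish the new primary; waiting clients pick it up.
    registry.set(key, standby.name)
```

A client in this sketch would poll `registry.get("namenode")` and wait while it returns `None`. Restarting the old primary as the new standby (the "afterwards" step) is left out.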
  • 8. AvatarNode @Facebook (diagram from Facebook). Contrib @ Hadoop 0.20 (HDFS-976)
  • 9. Conclusions Complete Hot Standby  NFS for storage of fsimage and editlogs. (no data loss)  Standby node Consumes transactions from editlogs on NFS continuously. (namespace hot standby)  DataNodes send message to both primary and standby node. (block reports hot standby) Fast Switchover  Less than a minute Make sense!
  • 10. BackupNode
  • 11. BackupNode (BN)
      • The NN synchronously streams the transaction log to the BackupNode
      • The BackupNode applies the log to its in-memory and on-disk image
      • The BN always commits to disk before returning success to the NN
      • If the BN restarts, it has to catch up with the NN
      • Available in HDFS 0.20.1 release
      • Diagram: the client retrieves block locations from the NN; the NN (NameNode) synchronously streams transaction logs to the BN (BackupNode); DataNodes send block location messages to the NN
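The synchronous streaming on slide 11 can be illustrated with a toy model. This is a hedged sketch, not the HDFS implementation: `NameNodeSketch` and `BackupNodeSketch` are hypothetical classes, a Python list stands in for the on-disk log, and a set of paths stands in for the namespace image. The point it shows is the ordering: the NN reports success to the client only after the BN has applied the edit and committed it.

```python
from typing import List, Set


class BackupNodeSketch:
    """Hypothetical BackupNode: keeps an in-memory image and a 'disk' log."""

    def __init__(self) -> None:
        self.namespace: Set[str] = set()   # in-memory image
        self.disk_log: List[str] = []      # stands in for the on-disk log

    def receive(self, edit: str) -> bool:
        self.namespace.add(edit)           # apply the edit in memory
        self.disk_log.append(edit)         # commit to "disk"
        return True                        # only then report success to the NN


class NameNodeSketch:
    """Hypothetical NameNode that streams every edit to one BackupNode."""

    def __init__(self, backup: BackupNodeSketch) -> None:
        self.backup = backup
        self.namespace: Set[str] = set()
        self.edit_log: List[str] = []

    def create(self, path: str) -> bool:
        self.edit_log.append(path)
        # Synchronous stream: block until the BackupNode acknowledges.
        if not self.backup.receive(path):
            raise RuntimeError("BackupNode did not acknowledge the edit")
        self.namespace.add(path)
        return True                        # success returned to the client
```

After any sequence of `create` calls in this sketch, the two namespaces are identical, which is what makes the namespace a hot standby. Block locations are not modelled, since the NN does not forward block reports to the BN.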
  • 12. Limitations of BackupNode (BN)
      • Maximum of one BackupNode per NN: supports only two-machine failure
      • The NN doesn’t forward block reports to the BackupNode
      • Time to restart from a 12 GB image, 70M files + 100M blocks:
          • 3-5 minutes to read the image from disk
          • 20 minutes to process block reports
          • The BN will still take 25+ minutes to fail over!
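The restart-time figures on slide 12 add up as follows. This is a small worked example using only the numbers quoted on the slide; the function and constant names are hypothetical.

```python
from typing import Tuple

# Figures quoted on the slide (12 GB image, 70M files + 100M blocks).
IMAGE_READ_MINUTES: Tuple[int, int] = (3, 5)   # read the image from disk
BLOCK_REPORT_MINUTES: int = 20                 # process block reports


def restart_minutes(image_read: Tuple[int, int],
                    block_reports: int) -> Tuple[int, int]:
    """Return the (low, high) total in minutes for the two phases."""
    low, high = image_read
    return (low + block_reports, high + block_reports)


low, high = restart_minutes(IMAGE_READ_MINUTES, BLOCK_REPORT_MINUTES)
print(f"{low}-{high} minutes")  # prints "23-25 minutes"
```

The sum of the two phases is 23 to 25 minutes, in line with the slide's "25+ minutes". Block report processing dominates, and that is the part the BackupNode cannot skip because it never receives block reports.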
  • 13. Conclusions Incomplete Hot Standby / Semi-Hot Standby  Namespace: hot standby  Block reports: cold standby Still-Slow Switchover
  • 14. Other HA solutions: DRBD + Linux HA; configuration/metadata backup