HDFS HA using Journal Nodes

Published in: Technology, Business

  1. HDFS HA using Journal Nodes
  2. Introducing Journal Nodes: Manual Failover (7/26/2013, Copyright 2013 Trend Micro Inc.)
  3. Architecture
     [Diagram: JN 1, JN 2, JN 3, NN Active, NN Standby and four DataNodes (DN); label: Block locations map]
  4. JournalNodes’ job
     • When the Active NameNode performs any namespace modification, it durably logs a record of the modification to a majority of the JNs.
     • The Standby reads the edits from the JNs and applies them to its own namespace.
     [Diagram: NN Active, NN Standby, JN 1, JN 2, JN 3; labels: Edits, Safe Mode]
  5. JournalNodes’ storage
     • Each JournalNode stores its edits under a path on its local disk, specified in its configuration (dfs.journalnode.edits.dir).
     • With N JournalNodes, the system can tolerate at most (N - 1) / 2 failures.
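The (N - 1) / 2 bound above is integer arithmetic on the JournalNode count. A minimal Python sketch, assuming hypothetical function names chosen only for illustration (they are not part of Hadoop):

```python
def tolerated_failures(journal_nodes: int) -> int:
    """Largest number of JournalNodes that may fail while a majority survives."""
    if journal_nodes < 1:
        raise ValueError("need at least one JournalNode")
    return (journal_nodes - 1) // 2


def quorum_size(journal_nodes: int) -> int:
    """Smallest number of JournalNodes that forms a majority."""
    return journal_nodes // 2 + 1


# Print node count, tolerated failures and quorum size for a few cluster sizes.
for n in (3, 4, 5):
    print(n, tolerated_failures(n), quorum_size(n))
```

Three JournalNodes tolerate one failure, and a fourth does not raise that number, which is why an odd count is the usual choice.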
  6. JournalNodes’ fencing
     • JournalNodes will only allow a single NameNode to be a writer at a time.
     • There is therefore no potential for corrupting the file system metadata in a split-brain scenario.
     [Diagram: NN Active, NN Standby, JN 1, JN 2, JN 3; labels: WRITE, READ]
  7. JournalNodes’ fencing
     • Whenever a NameNode becomes active, it first generates an epoch number.
     • The first active NameNode after the namespace is initialized starts with epoch number 1.
     • Any failover or restart increments the epoch number.
  8. JournalNodes’ fencing
     • When a new NameNode becomes active, it has an epoch number higher than any previous NameNode.
     • It calls the JournalNodes to increment their promised epochs.
     • Fencing:
       – JNs receive a newer epoch → a majority of JNs update their promised epochs → accept
       – JNs receive an older epoch → reject
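The epoch rules on the fencing slides can be sketched as a toy model. This is an illustration under stated assumptions, not the QJM implementation: the class `JournalNode` and the helpers `become_active` and `write_edit` are hypothetical names, and each node tracks only a promised epoch and a list of accepted edits.

```python
class JournalNode:
    """Toy JournalNode (hypothetical): a promised epoch plus accepted edits."""

    def __init__(self):
        self.promised_epoch = 0
        self.edits = []

    def new_epoch(self, epoch):
        """Promise the newer epoch; refuse a proposal that is not newer."""
        if epoch <= self.promised_epoch:
            return False
        self.promised_epoch = epoch
        return True

    def journal(self, epoch, edit):
        """Reject an edit from a writer whose epoch is older than the promise."""
        if epoch < self.promised_epoch:
            return False
        self.edits.append(edit)
        return True


def become_active(jns, epoch):
    """A NameNode becomes the writer only if a majority promise its epoch."""
    acks = sum(jn.new_epoch(epoch) for jn in jns)
    return acks > len(jns) // 2


def write_edit(jns, epoch, edit):
    """An edit is durable only if a majority of JournalNodes accept it."""
    acks = sum(jn.journal(epoch, edit) for jn in jns)
    return acks > len(jns) // 2


jns = [JournalNode() for _ in range(3)]
print(become_active(jns, 1))           # first active NameNode uses epoch 1
print(write_edit(jns, 1, "mkdir /a"))
print(become_active(jns, 2))           # failover: the new active uses epoch 2
print(write_edit(jns, 1, "mkdir /b"))  # the old active still uses epoch 1
```

The last call returns False: every JournalNode has promised epoch 2, so the previous writer can no longer add edits.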
  9. But…
     • The previous Active NameNode may still serve read requests to clients, and those reads may be out of date, until it attempts a write to the JournalNodes and is rejected.
     • You can specify a fencing method to prevent this.
  10. Fencing Method
  11. Fencing Method
      • sshfence: SSH to the Active NameNode and kill the process.
  12. Fencing Method
      • shell: run a shell command to fence the Active NameNode.
      • The script's environment contains the Hadoop configuration properties, with the '_' character replacing any '.' in each key, e.g. dfs_namenode_rpc-address.
  13. Fencing Method
      • Additional environment variables
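The '.'-to-'_' substitution described for the shell fencing method is a plain string replacement on each configuration key. A small Python sketch of that naming rule follows; the function `fence_env` and the host name are hypothetical, and the sketch only models the renaming, not how Hadoop launches the script.

```python
def fence_env(conf):
    """Rename configuration keys the way the slide describes: '.' becomes '_'."""
    return {key.replace(".", "_"): value for key, value in conf.items()}


# Hypothetical configuration entry; the address is made up for illustration.
conf = {"dfs.namenode.rpc-address": "nn1.example.com:8020"}
print(fence_env(conf))
```

Only the dots change, so the hyphen in `rpc-address` survives, matching the slide's example `dfs_namenode_rpc-address`.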
  14. Automatic Failover
  15. [Diagram: JN 1, JN 2, JN 3]
  16. ZKFailoverController
      • Health monitoring
        – The ZKFC pings its local NameNode periodically with a health-check command (healthy/unhealthy).
      • ZooKeeper session management
        – When the local NameNode is healthy, the ZKFC holds a session open in ZooKeeper.
        – If the local NameNode is active, the ZKFC also holds a special "lock" znode.
        – If the session expires, the lock znode is automatically deleted.
  17. ZKFailoverController
      • ZooKeeper-based election
        – If the local NameNode is healthy and no other node currently holds the lock znode, the ZKFC tries to acquire the lock itself.
        – If it succeeds, it has "won the election".
      • Failover
        – The previous active is fenced.
        – The local NameNode transitions to the active state.
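The health check, session and lock-znode behaviour of the ZKFC can be sketched as a toy simulation. Everything here is an assumption made for illustration: `ToyZooKeeper` and `ToyZKFC` are hypothetical classes, no real ZooKeeper client is involved, and fencing of the previous active is left out.

```python
class ToyZooKeeper:
    """Hypothetical stand-in for ZooKeeper: one ephemeral lock znode."""

    def __init__(self):
        self.lock_owner = None

    def try_acquire(self, session):
        """Take the lock if it is free; report whether `session` now owns it."""
        if self.lock_owner is None:
            self.lock_owner = session
        return self.lock_owner == session

    def expire(self, session):
        """Session expiry deletes the lock znode held by that session."""
        if self.lock_owner == session:
            self.lock_owner = None


class ToyZKFC:
    """Hypothetical failover controller for one local NameNode."""

    def __init__(self, name, zk):
        self.name = name
        self.zk = zk
        self.healthy = True
        self.state = "standby"

    def tick(self):
        """One round: health check, then session handling and election."""
        if not self.healthy:
            self.zk.expire(self.name)  # unhealthy: the session is given up
            self.state = "unhealthy"
        elif self.zk.try_acquire(self.name):
            self.state = "active"      # won the election (fencing omitted)
        else:
            self.state = "standby"


zk = ToyZooKeeper()
nn1, nn2 = ToyZKFC("nn1", zk), ToyZKFC("nn2", zk)
nn1.tick(); nn2.tick()
print(nn1.state, nn2.state)  # nn1 holds the lock, nn2 waits
nn1.healthy = False
nn1.tick(); nn2.tick()
print(nn1.state, nn2.state)  # the lock is released and nn2 takes over
```

The second round shows the failover path: once the unhealthy node's session is gone, the lock znode disappears and the other controller wins the election.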
  18. [Diagram: JN 1, JN 2, JN 3 and NN Active, with steps numbered 1 to 7]
  19. Client Side
  20. Client Failover
      • The client connects to the Active NameNode via a proxy.
      • When the Active NameNode is down, the client receives an exception, then retries and sends the RPC to the other NameNode (implemented by ConfiguredFailoverProxyProvider).
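The retry behaviour on the client failover slide can be sketched as a proxy that moves to the next NameNode when a call fails. This is a simplified illustration and not the code of ConfiguredFailoverProxyProvider; `ToyNameNode`, `FailoverProxy` and `StandbyError` are hypothetical names.

```python
class StandbyError(Exception):
    """Raised by a toy NameNode that cannot serve the call."""


class ToyNameNode:
    """Hypothetical NameNode that serves calls only while active."""

    def __init__(self, name, active):
        self.name = name
        self.active = active

    def list_dir(self, path):
        if not self.active:
            raise StandbyError(self.name)
        return f"{self.name} listing {path}"


class FailoverProxy:
    """Sends each call to the current NameNode; on failure, tries the next."""

    def __init__(self, namenodes):
        self.namenodes = namenodes
        self.current = 0

    def list_dir(self, path):
        # Try every configured NameNode at most once per call.
        for _ in range(len(self.namenodes)):
            try:
                return self.namenodes[self.current].list_dir(path)
            except StandbyError:
                self.current = (self.current + 1) % len(self.namenodes)
        raise RuntimeError("no active NameNode reachable")


nn1, nn2 = ToyNameNode("nn1", True), ToyNameNode("nn2", False)
proxy = FailoverProxy([nn1, nn2])
print(proxy.list_dir("/"))          # served by nn1
nn1.active, nn2.active = False, True
print(proxy.list_dir("/"))          # nn1 fails, the call is retried on nn2
```

The proxy remembers the NameNode that last answered, so later calls go straight to it instead of retrying the failed one first.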
  21. Steps to Apply HDFS HA
  22. Converting a non-HA-enabled cluster to be HA-enabled
      • If setting up a fresh HDFS cluster, run hdfs namenode -format.
      • Copy the contents of your NameNode metadata directories to the other NameNode by running hdfs namenode -bootstrapStandby on it (./format-failover-namenode.sh).
      • Run hdfs namenode -initializeSharedEdits to initialize the edit log in the JournalNodes.
      • Start both NameNodes.
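The steps above can be written down as an ordered checklist. The Python sketch below only builds that list and runs nothing; the function `ha_setup_steps` is hypothetical, the order mirrors the slide, and which steps apply to a fresh cluster versus a conversion is this sketch's reading of the slide rather than something it states outright.

```python
def ha_setup_steps(fresh_cluster):
    """Return (where, command) pairs in the order the slide lists them."""
    steps = []
    if fresh_cluster:
        # Assumption: formatting applies only to a brand-new cluster.
        steps.append(("first NameNode", "hdfs namenode -format"))
    steps.append(("other NameNode", "hdfs namenode -bootstrapStandby"))
    if not fresh_cluster:
        # Assumption: shared edits are initialized only when converting.
        steps.append(("existing NameNode", "hdfs namenode -initializeSharedEdits"))
    return steps


for where, command in ha_setup_steps(fresh_cluster=False):
    print(f"{where}: {command}")
```

Starting both NameNodes is left out because the slide gives no command for it.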
  23. Reference
      • http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deployment
      • http://blog.cloudera.com/blog/2012/10/quorum-based-journaling-in-cdh4-1/
      • https://issues.apache.org/jira/secure/attachment/12547598/qjournal-design.pdf
