
Inside HDFS Append

How append, the new operation in the Hadoop Distributed File System (HDFS), works: the internals of its processing, and the additional states it introduces beyond those of the write operation.

Published in: Software

  1. Inside HDFS APPEND. Yue Chen, http://linkedin.com/in/yuechen2, http://dataera.wordpress.com
  2. HDFS Background. HDFS: Hadoop Distributed File System. Good for: large files, streaming data access. Bad for: lots of small files, random access.
  3. HDFS Architecture
  4. HDFS Write
  5. Background: Before append existed, a file was immutable once it was closed, which makes database-style operations expensive.
  6. Solution: append.
  7. Key for Designing Append: How is consistency guaranteed when something goes wrong?
  8. Answer: use more states.
  9. States. Finalized: the replica is complete; all writing is done.
  10. States. RBW (ReplicaBeingWritten): the replica is in a write pipeline and is visible to readers.
  11. States. RUR (ReplicaUnderRecovery): the lease has expired and the replica is under recovery.
  12. States. RWR (ReplicaWaitingToBeRecovered): when a DataNode (DN) goes down and restarts, all of its RBW replicas become RWR.
  13. States. Temporary: the replica is being transferred between DNs.
  14. Lease. What is a lease? A write lock for file modification that prevents concurrent writes to the same file. Reading a file requires no lease.
  15. Lease Expiration. Soft limit: no renewal for 1 minute; after that, other clients may compete for the lease. Hard limit: no renewal for 60 minutes; after that, the NameNode recovers the lease itself, so there is no competition for it.
  16. State. A block on the NameNode (NN) has 4 states: complete, under_construction, under_recovery, committed. A replica on a DataNode (DN) has 5 states: Finalized; RBW (ReplicaBeingWritten: in a write pipeline, visible to readers); RUR (ReplicaUnderRecovery: the lease has expired); RWR (ReplicaWaitingToBeRecovered: when a DN goes down and restarts, all of its RBW replicas become RWR); Temporary (being transferred between DNs).
  17. Overview (Hadoop 1.0.0)
  18. Overall Procedure: From the client's perspective, an append first calls append on DistributedFileSystem, which returns a stream object, FSDataOutputStream out. To append data to the file, the client calls out.write, and when it is finished it calls out.close.
  19. write/append. 1) Normal close: DFSOutputStream.close() -> FSNamesystem.completeFile() -> commitOrCompleteLastBlock(). The file's state in the NN (NameNode) is INode, not INodeUnderConstruction. 2) Abnormal close: the state remains INodeUnderConstruction and the lease (write lock) on the file is not released. This case requires lease recovery and block recovery.
  20. Lease Recovery: When a file is not closed normally, the 3 replicas of its last block may be in different states, differing in size and in generation stamp (the version of the block). The recovery procedure checks whether the previous lease holder has renewed the lease and whether the lease has exceeded the softLimit (its time limit); if the limit is exceeded, it calls internalReleaseLease().
  21. Block Recovery: The NN sends the recovery command to a DN in its reply to that DN's heartbeat. Recovery finds the best state among all replicas and brings the remaining replicas to that state. State ranking: Finalized > RBW > RWR > RUR > Temporary. When recovery finishes, the pending operation (append, write, etc.) continues.
  22. Reference: http://yanbohappy.sinaapp.com/?p=175, http://blog.csdn.net/chenpingbupt/article/details/7972589, http://hdfs-hadoop.blogspot.com/, http://blog.csdn.net/nexus/article/details/7321150
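
The soft and hard limits on slide 15 amount to classifying a lease by the time since its last renewal. A minimal sketch in plain Java, assuming the 1-minute and 60-minute values from the slide: `LeaseStatus` and `LeaseLimits` are hypothetical names for illustration, not Hadoop classes, and treating a lease that sits exactly on a limit as not yet expired is also an assumption.

```java
import java.time.Duration;

// Illustrative model only; these are not Hadoop classes.
enum LeaseStatus { ACTIVE, SOFT_EXPIRED, HARD_EXPIRED }

final class LeaseLimits {
    // Limits as stated on slide 15.
    static final Duration SOFT_LIMIT = Duration.ofMinutes(1);
    static final Duration HARD_LIMIT = Duration.ofMinutes(60);

    // Classify a lease by how long ago its holder last renewed it.
    // Assumption: a lease exactly at a limit has not yet crossed it.
    static LeaseStatus classify(Duration sinceLastRenewal) {
        if (sinceLastRenewal.compareTo(HARD_LIMIT) > 0) {
            return LeaseStatus.HARD_EXPIRED; // past the hard limit
        }
        if (sinceLastRenewal.compareTo(SOFT_LIMIT) > 0) {
            return LeaseStatus.SOFT_EXPIRED; // other clients may compete
        }
        return LeaseStatus.ACTIVE; // holder keeps exclusive write access
    }

    public static void main(String[] args) {
        System.out.println(classify(Duration.ofSeconds(30)));
        System.out.println(classify(Duration.ofMinutes(5)));
        System.out.println(classify(Duration.ofMinutes(61)));
    }
}
```

Keeping the two limits as named constants makes the two-stage expiry explicit: a lease first becomes contestable, and only much later is treated as abandoned.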
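
The difference between a normal and an abnormal close (slides 14 and 19) can be illustrated with a toy in-memory namespace. This is only a sketch under simplifying assumptions: `ToyNamespace` and its methods are hypothetical, it ignores lease expiry and recovery entirely, and it is not how the NameNode is implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of file state and leases; hypothetical, not NameNode code.
final class ToyNamespace {
    enum FileState { CLOSED, UNDER_CONSTRUCTION }

    private final Map<String, FileState> states = new HashMap<>();
    private final Map<String, String> leaseHolders = new HashMap<>();

    void create(String path) {
        states.put(path, FileState.CLOSED);
    }

    // Opening for append takes the lease and reopens the file for writing.
    void append(String path, String client) {
        if (!states.containsKey(path)) {
            throw new IllegalArgumentException("no such file: " + path);
        }
        String holder = leaseHolders.get(path);
        if (holder != null) {
            // The lease is a write lock: only one writer at a time.
            throw new IllegalStateException(path + " is leased by " + holder);
        }
        leaseHolders.put(path, client);
        states.put(path, FileState.UNDER_CONSTRUCTION);
    }

    // Normal close: the file is completed and the lease is released.
    void close(String path, String client) {
        if (!client.equals(leaseHolders.get(path))) {
            throw new IllegalStateException(client + " does not hold the lease on " + path);
        }
        leaseHolders.remove(path);
        states.put(path, FileState.CLOSED);
    }

    FileState stateOf(String path) { return states.get(path); }

    String leaseHolderOf(String path) { return leaseHolders.get(path); }
}
```

If a client calls `append` and then disappears without calling `close`, the file stays `UNDER_CONSTRUCTION` and the lease stays held, which is the situation lease recovery exists to resolve.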
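
Slide 21's rule, find the best replica state and recover the rest to it, can be sketched as picking the best-ranked state from a list. The enum below simply encodes the ranking given on the slide; `ReplicaRank` and `RecoveryTarget` are hypothetical names, and the sketch leaves out everything else a real recovery would reconcile, such as replica length and generation stamp.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Declared best-first, so natural enum order matches slide 21's ranking:
// Finalized > RBW > RWR > RUR > Temporary.
enum ReplicaRank { FINALIZED, RBW, RWR, RUR, TEMPORARY }

final class RecoveryTarget {
    // Return the best-ranked state among a block's replicas.
    // Throws NoSuchElementException if the list is empty.
    static ReplicaRank best(List<ReplicaRank> replicas) {
        // Enums compare by declaration order, so min() is the best state.
        return Collections.min(replicas);
    }

    public static void main(String[] args) {
        List<ReplicaRank> replicas =
                Arrays.asList(ReplicaRank.RWR, ReplicaRank.RBW, ReplicaRank.RUR);
        System.out.println("recover all replicas to: " + best(replicas));
    }
}
```

Encoding the ranking in the declaration order keeps the comparison in one place; changing the ranking means reordering the enum constants rather than editing comparison logic.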
