6. HDFS Architecture: Computation close to the data
[Figure: a Hadoop cluster. The input data is stored as Block 1, Block 2, and Block 3; each block label appears three times across the cluster. A MAP task runs against each block, and a Reduce step combines the map outputs into Results.]
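The idea in the figure can be sketched in a few lines of Python. This is an illustrative toy, not Hadoop's scheduler: the function names (`schedule_maps`, `map_task`, `reduce_task`, `run_job`), the word-count job, and the least-loaded tie-breaking rule are all assumptions made for the example. The only point it carries over from the slide is that each map task is placed on a node that already holds a replica of its block, and a reduce step merges the map outputs.

```python
from collections import Counter


def schedule_maps(replicas):
    """Assign each block's map task to a node that already stores the block.

    replicas: dict of block id -> list of node names holding a replica.
    Among the replica holders, the least-loaded node is chosen (an assumed
    policy for this sketch).
    """
    load = Counter()
    assignment = {}
    for block_id, nodes in replicas.items():
        node = min(nodes, key=lambda n: load[n])
        assignment[block_id] = node
        load[node] += 1
    return assignment


def map_task(block_text):
    """Toy map step: count the words in one block."""
    return Counter(block_text.split())


def reduce_task(partials):
    """Toy reduce step: merge the per-block counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(blocks, replicas):
    """Run one map per block on a data-local node, then reduce."""
    assignment = schedule_maps(replicas)
    partials = [map_task(blocks[block_id]) for block_id in assignment]
    return assignment, reduce_task(partials)
```

Every entry in the returned assignment is a node drawn from that block's replica list, so no block has to cross the network before its map task runs.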
13. Scaling the Name Service: Options
[Chart, not to scale. Axes: # names (100M, 200M, 1B, 2B, 10B, 20B) and # clients (1x, 4x, 20x, 50x, 100x). Options plotted: All NS in memory; Partial NS (cache) in memory; Archives; Partial NS in memory, with namespace volumes; Multiple namespace volumes; Distributed NNs; Separate Bmaps from NN. Annotations: good isolation properties; block reports for billions of blocks require rethinking the block layer.]
14. Opportunity: Vertical & Horizontal scaling
  Vertical scaling
    - More RAM, efficiency in memory usage
    - First-class archives (tar/zip-like)
    - Partial namespace in main memory
  Horizontal: Federation
    [Figure label: Namenode]
  Horizontal scaling/federation benefits
    - Scale
    - Isolation, stability, availability
    - Flexibility
    - Other Namenode implementations or non-HDFS namespaces
17. Note: HDFS has 2 layers today; we are generalizing/extending it.
[Figure: two layers. Namespace layer: NS1 ... NS k ... Foreign NS n. Block storage layer beneath it.]
18. 1st Phase: Block-pool (B-Pool) management inside the Namenode
[Figure: Namenodes NN-1 ... NN-k ... NN-n serve namespaces NS1 ... NS k ... Foreign NS n. Datanode 1, Datanode 2 ... Datanode m hold the block pools (Pools 1 ... Pools k ... Pools n). A Balancer is also shown.]
Future: move block management into separate nodes.
19. Future: Move block management out
[Figure: namespaces NS1 ... NS k ... Foreign NS n; a Block Manager; block pools (Pools 1 ... Pools k ... Pools n) on Datanode 1, Datanode 2 ... Datanode m; a Balancer; a client.]
The block manager is easier to scale horizontally than the name server.
Client steps shown in the figure: 1. Open  2. getBlockLocations  3. ReadBlock
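The three client steps on this slide can be sketched as a toy read path. The slide does not say which component serves each call, so the mapping below is an assumption: Open goes to the namespace server, getBlockLocations goes to the separate block manager, and ReadBlock goes to a datanode. All class and method names are hypothetical stand-ins, not the HDFS client API.

```python
class NamespaceServer:
    """Holds only the namespace: path -> ordered list of block keys."""

    def __init__(self, files):
        self.files = files  # path -> list of (pool_id, block_id)

    def open(self, path):
        return list(self.files[path])


class BlockManager:
    """Holds only block locations: block key -> datanode names."""

    def __init__(self, locations):
        self.locations = locations  # (pool_id, block_id) -> list of names

    def get_block_locations(self, block):
        return list(self.locations[block])


class Datanode:
    """Stores block contents keyed by (pool_id, block_id)."""

    def __init__(self, name, blocks):
        self.name = name
        self.blocks = blocks  # (pool_id, block_id) -> bytes

    def read_block(self, block):
        return self.blocks[block]


def read_file(path, namespace_server, block_manager, datanodes):
    """Step 1: Open, step 2: getBlockLocations, step 3: ReadBlock."""
    data = b""
    for block in namespace_server.open(path):                    # step 1
        for name in block_manager.get_block_locations(block):    # step 2
            datanode = datanodes.get(name)
            if datanode is not None and block in datanode.blocks:
                data += datanode.read_block(block)                # step 3
                break
        else:
            raise IOError(f"no reachable replica for block {block}")
    return data
```

In this sketch the namespace server never touches block locations, which is the separation the slide argues makes the block layer easier to scale out.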
20. What is an HDFS Cluster
  Current HDFS cluster
    - 1 namespace
    - A set of blocks
    - Implemented as: 1 Namenode, a set of DNs
  New HDFS cluster
    - N namespaces
    - A set of block-pools; each block-pool is a set of blocks
    - Phase 1: 1 BP per NS, which implies N block-pools
    - Implemented as: N Namenodes, a set of DNs
    - Each DN stores the blocks for each block-pool
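The "new cluster" model on this slide can be written down as a small data structure. This is a minimal sketch under stated assumptions: the class names, the `BP-<n>` pool ids, and the `build_cluster` helper are invented for illustration and are not HDFS identifiers. It encodes only what the slide says: N namespaces, one block pool per namespace in Phase 1, and datanodes that store blocks for every pool.

```python
from dataclasses import dataclass, field


@dataclass
class Namenode:
    namespace: str
    pool_id: str  # Phase 1: exactly one block pool per namespace


@dataclass
class Datanode:
    name: str
    # pool id -> set of block ids this datanode stores for that pool
    blocks_by_pool: dict = field(default_factory=dict)

    def store(self, pool_id, block_id):
        self.blocks_by_pool.setdefault(pool_id, set()).add(block_id)


@dataclass
class Cluster:
    namenodes: list
    datanodes: list

    def block_pools(self):
        """One pool per namenode, so N namespaces imply N pools."""
        return [nn.pool_id for nn in self.namenodes]


def build_cluster(namespaces, datanode_names):
    """Create one namenode (and one pool) per namespace, plus shared DNs."""
    namenodes = [
        Namenode(ns, f"BP-{i + 1}") for i, ns in enumerate(namespaces)
    ]
    datanodes = [Datanode(name) for name in datanode_names]
    return Cluster(namenodes, datanodes)
```

Because blocks are keyed by pool, the same block id can exist in two pools on one datanode without colliding, which is what lets several namespaces share a single set of datanodes.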
21. Managing Namespaces
  - HDFS namespaces as a first-class entity
  - Many, many namespaces: one per user or per project
  - Why? Because it can’t fit in a server? No.
    - Pieces of data are often autonomous
      - Log data from different dates
      - Photos/videos loaded by a user
      - A user’s mail, or his home directory
  - The key is sharing the data
    - A global namespace is one way to do that, but even there we talk of several large “global” namespaces
    - A client-side mount table is another way to share
      - Shared mount-table => “global” shared view
      - Personalized mount-table => per-application view
  - Share the data that matter by mounting it
  - Plan 9, Spring OS: had personalized namespaces
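A client-side mount table of the kind described here can be sketched as a longest-prefix lookup. This is an assumed, minimal design, not the HDFS implementation: the `MountTable` class, its `resolve` method, and the namespace names used in the usage lines are all hypothetical. A shared table gives every client the same "global" view, while a personalized table gives one application its own view.

```python
class MountTable:
    """Maps client-visible paths onto (namespace, path-in-namespace)."""

    def __init__(self, mounts):
        # mounts: mount point -> (namespace name, target path in namespace)
        self.mounts = {self._norm(k): v for k, v in mounts.items()}

    @staticmethod
    def _norm(path):
        """Collapse repeated or trailing slashes; always start with '/'."""
        return "/" + "/".join(part for part in path.split("/") if part)

    def resolve(self, path):
        """Return (namespace, path) using the longest matching mount point."""
        path = self._norm(path)
        best = None
        for mount_point in self.mounts:
            prefix = mount_point.rstrip("/") + "/"
            if path == mount_point or path.startswith(prefix):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise KeyError(f"no mount point covers {path}")
        namespace, target = self.mounts[best]
        remainder = path[len(best):].lstrip("/")
        target = self._norm(target)
        if remainder:
            target = target.rstrip("/") + "/" + remainder
        return namespace, target


# Shared table: every client sees the same view.
shared = MountTable({"/home": ("ns-home", "/"), "/data": ("ns-data", "/")})
# Personalized table: one application mounts only what it needs.
personal = MountTable({"/data": ("ns-project-x", "/datasets")})
```

With these two tables, the same client path `/data/logs` resolves to different namespaces, which is how a personalized mount table yields a per-application view.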
22. HDFS Federation Across Clusters
[Figure: Cluster 1 and Cluster 2 each have an application mount-table rooted at /, with the entries home, tmp, data, and project.]