HDFS Federation. Sanjay Radia, Hadoop Architect, Yahoo! Inc.
Outline: HDFS quick overview; scaling HDFS with federation; Hadoop components.
HDFS (diagram): the Namenode holds the namespace state (hierarchical namespace: file name → block IDs) and the block map (block ID → block locations). Namespace metadata and the journal go to a Backup Namenode. Datanodes (block ID → data) send heartbeats and block reports to the Namenode. IO and storage scale horizontally.
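The two Namenode maps named on this slide can be pictured with a minimal sketch. The class and method names (`Namenode`, `add_file`, `block_report`, `locate`) are illustrative assumptions, not HDFS APIs; only the two mappings (file name to block IDs, block ID to block locations) come from the slide.

```python
from collections import defaultdict


class Namenode:
    """Toy model of the two in-memory maps on the slide (hypothetical names)."""

    def __init__(self):
        # Hierarchical namespace: file name -> ordered list of block IDs.
        self.namespace = {}
        # Block map: block ID -> set of datanodes holding a replica.
        self.block_map = defaultdict(set)

    def add_file(self, path, block_ids):
        self.namespace[path] = list(block_ids)

    def block_report(self, datanode, block_ids):
        # A datanode reports the blocks it stores; the block map is filled from reports.
        for block_id in block_ids:
            self.block_map[block_id].add(datanode)

    def locate(self, path):
        # Resolve a file to (block ID, sorted replica locations) pairs.
        return [(b, sorted(self.block_map[b])) for b in self.namespace[path]]


nn = Namenode()
nn.add_file("/logs/day1", ["b1", "b2"])
nn.block_report("dn1", ["b1", "b2"])
nn.block_report("dn2", ["b1"])
locations = nn.locate("/logs/day1")
```

Keeping the two maps separate is what later lets the block layer be split away from the namespace layer.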
HDFS client reads and writes (diagram): to write, a client (1) issues create to the Namenode and (2) writes the block to a Datanode, which passes the write on to further Datanodes. To read, a client (1) issues open to the Namenode and (2) reads from a Datanode. Checksums are end-to-end.
HDFS Architecture: computation close to the data (diagram): a Hadoop cluster holds the input as Block 1, Block 2 and Block 3; a MAP task runs against each block where it is stored, and a Reduce step combines the map outputs into the results.
Quiz: What Is the Common Attribute?
HDFS actively maintains data reliability (diagram): Datanodes periodically check block checksums. When a block replica is bad or lost, (1) the Namenode tells a Datanode to replicate the block, (2) that Datanode copies it to another Datanode, and (3) the receiving Datanode reports blockReceived to the Namenode.
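The repair loop on this slide can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `plan_replication`, the target of 3 replicas, and picking the alphabetically first source and destinations are all hypothetical choices made for the example.

```python
def plan_replication(block_map, live_datanodes, target=3):
    """Return (block, source, destination) copy orders for under-replicated blocks.

    block_map: dict of block ID -> set of datanodes believed to hold a good replica.
    live_datanodes: set of datanodes that are still sending heartbeats.
    """
    orders = []
    for block_id in sorted(block_map):
        holders = block_map[block_id] & live_datanodes   # ignore replicas on dead nodes
        candidates = sorted(live_datanodes - holders)    # live nodes without a replica
        missing = target - len(holders)
        if not holders or missing <= 0:
            continue  # nothing to copy from, or already at the target
        source = sorted(holders)[0]
        for destination in candidates[:missing]:
            orders.append((block_id, source, destination))
    return orders


block_map = {"b1": {"dn1", "dn2", "dn3"}, "b2": {"dn1", "dn4"}}
live = {"dn1", "dn2", "dn3", "dn5"}   # dn4 has stopped heartbeating
orders = plan_replication(block_map, live)
```

Here block b2 has lost the replica on dn4, so two copy orders are planned from dn1; b1 is already at the target and is skipped.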
Hadoop at Yahoo!: 1M+ monthly Hadoop jobs.
Scaling Hadoop
Early gains: the namespace is held entirely in RAM, which allows simpler locking. Memory usage improved in 0.16, along with JVM heap configuration (Suresh Srinivas). Growth in the number of files and in storage is limited by how much RAM can be added to the Namenode: 50G heap = 200M "fs objects" = 100M names + 100M blocks.
At 4K nodes, the Job Tracker carries out both job lifecycle management and scheduling. Yahoo's response: next generation of Map-Reduce, a complete overhaul of the job tracker/task tracker.
(Slide dated 6 May 2010.)
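As a rough worked example of the heap figure on this slide, the per-object cost implied by "50G heap = 200M fs objects" can be computed directly. Reading "50G" as 50 × 2^30 bytes, and assuming one block per name when extrapolating, are assumptions made for the example; the slide states neither.

```python
heap_bytes = 50 * 2**30          # assumption: "50G" read as 50 GiB
fs_objects = 200_000_000         # 100M names + 100M blocks, from the slide

bytes_per_object = heap_bytes / fs_objects   # roughly 268 bytes per fs object

# Extrapolating at the same ratio: heap needed for 1 billion names,
# assuming one block per name as in the slide's 100M + 100M split.
objects_for_1b_names = 2 * 1_000_000_000
heap_gib_for_1b_names = objects_for_1b_names * bytes_per_object / 2**30
```

At this ratio a billion names would need a heap of about 500 GiB, which illustrates why adding RAM alone stops scaling.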
Scaling the Name Service: Options (chart, not to scale). Axes: # names (100M, 200M, 1B, 2B, 10B, 20B) against # clients (1x, 4x, 20x, 50x, 100x). Options plotted: all NS in memory; archives; partial NS (cache) in memory; multiple namespace volumes; partial NS in memory with namespace volumes; distributed NNs; separate Bmaps from NN. Annotations: good isolation properties; block reports for billions of blocks require rethinking the block layer.
Opportunity: vertical and horizontal scaling. Vertical scaling: more RAM, efficiency in memory usage, first-class archives (tar/zip-like), partial namespace in main memory. Horizontal scaling: Namenode federation. Benefits of horizontal scaling/federation: scale; isolation, stability, availability; flexibility; other Namenode implementations or non-HDFS namespaces.
Block (Object) Storage Subsystem (diagram): a namespace layer (NS 1 ... NS k ... foreign NS n) sits above a block storage layer in which Datanodes 1 ... m each hold block pools 1 ... k ... n, with a Balancer working across the block pools.
Namespaces (HDFS, others) use one or more block pools.
Note: HDFS has 2 layers today; we are generalizing/extending it.
1st phase: block-pool management inside the Namenode (diagram): Namenodes NN-1 ... NN-k ... NN-n serve namespaces NS 1 ... NS k ... foreign NS n; Datanodes 1 ... m each hold pools 1 ... k ... n, with a Balancer across the block pools. Future: move block management into separate nodes.
Future: move block management out (diagram): a Block Manager sits between the namespaces (NS 1 ... NS k ... foreign NS n) and Datanodes 1 ... m with pools 1 ... k ... n, plus a Balancer across the block pools. Easier to scale horizontally than the name server. Client steps: 1. open; 2. getBlockLocations (Block Manager); 3. ReadBlock.
What is an HDFS cluster? Current HDFS cluster: 1 namespace and a set of blocks, implemented as 1 Namenode and a set of DNs. New HDFS cluster: N namespaces and a set of block pools, where each block pool is a set of blocks. Phase 1: 1 BP per NS, which implies N block pools; implemented as N Namenodes and a set of DNs, where each DN stores the blocks for each block pool.
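The statement that each DN stores the blocks for each block pool can be sketched as a small data structure. The `Datanode` class, the pool IDs `BP-1`/`BP-2`, and the idea that the same block ID may appear in two pools are illustrative assumptions for this sketch, not details taken from the slides.

```python
from collections import defaultdict


class Datanode:
    """A datanode stores blocks for every block pool, kept apart by pool ID."""

    def __init__(self, name):
        self.name = name
        self.pools = defaultdict(set)   # block pool ID -> block IDs stored here

    def store(self, pool_id, block_id):
        self.pools[pool_id].add(block_id)

    def block_report_for(self, pool_id):
        # Each namenode is only told about the blocks of its own pool.
        return sorted(self.pools[pool_id])


# Phase 1: one block pool per namespace, so N namenodes imply N block pools.
namespace_to_pool = {"NS1": "BP-1", "NS2": "BP-2"}

dn = Datanode("dn1")
dn.store("BP-1", "b1")
dn.store("BP-1", "b2")
dn.store("BP-2", "b1")   # assumed: block IDs only need to be unique within a pool

report_ns1 = dn.block_report_for(namespace_to_pool["NS1"])
report_ns2 = dn.block_report_for(namespace_to_pool["NS2"])
```

Because reports are per pool, one namespace's blocks never appear in another Namenode's view, which is where the isolation benefit comes from.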
Managing Namespaces. HDFS namespaces as a first-class entity: many, many namespaces, one per user or per project. Why? Because it can't fit in a server? No: pieces of data are often autonomous (log data from different dates; photos/videos loaded by a user; a user's mail or home directory). The key is sharing the data. A global namespace is one way to do that, but even there we talk of several large "global" namespaces. A client-side mount table is another way to share: a shared mount table gives a "global" shared view; a personalized mount table gives a per-application view. Share the data that matters by mounting it. Plan 9 and Spring OS had personalized namespaces.
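A client-side mount table can be illustrated with a short sketch. The `resolve` function, the longest-prefix matching rule, and the namespace names are assumptions chosen for the example; the slides only say that a mount table maps parts of the client's view onto namespaces.

```python
def resolve(mount_table, path):
    """Map a client-visible path to (namespace, path inside that namespace).

    mount_table: dict of mount point -> namespace name; the longest matching
    mount point wins (an assumed rule for this sketch).
    """
    best = None
    for mount_point in mount_table:
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        raise KeyError(f"no mount point covers {path}")
    remainder = path[len(best.rstrip("/")):] or "/"
    return mount_table[best], remainder


# A shared table gives every client the same "global" view.
shared = {"/home": "ns-home", "/data": "ns-data", "/data/project": "ns-project"}
# A personalized table gives one application its own view.
personal = {"/home": "ns-alice", "/data": "ns-data"}

shared_hit = resolve(shared, "/data/project/x.log")
personal_hit = resolve(personal, "/home/mail")
```

The same path can land in different namespaces depending on which table the client loads, which is the difference between the shared and the personalized view.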

HDFS Federation: Scaling HDFS Through NameNode Federation

  • 22. HDFS Federation Across Clusters (diagram): the application mount table in Cluster 1 and the application mount table in Cluster 2 each have a root / with entries home, tmp, data and project, which point into Cluster 1 and Cluster 2.
  • 25. Selected based on isolation and capacity
  • 26. A namespace can be moved between nameservers (diagram: Nameserver ... Nameserver on top of shared persistent storage for namespace metadata, e.g. BookKeeper)
  • 28. Q & A

Editor's Notes

  1. The data nodes do not have RAID, just JBOD.
  2. Replication is rack-aware.
  3. 50K blocks (50MB block size); 48GB heap ≈ 180M objects = 90M files + 90M blocks = 14PB (includes overhead of 3 replicas).
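
The 14PB figure in note 3 can be checked with a short calculation. The inputs are the note's own numbers; the use of decimal units (1 PB = 10^9 MB) is an assumption made for the example.

```python
blocks = 90_000_000          # from the note: 90M blocks
block_size_mb = 50           # from the note: 50MB block size
replicas = 3                 # from the note: 3 replicas

raw_mb = blocks * block_size_mb * replicas
raw_pb = raw_mb / 1_000_000_000   # assumption: decimal units, 1 PB = 10**9 MB
```

This gives 13.5 PB of raw storage, which is consistent with the note's rounded 14PB.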