SlideShare a Scribd company logo
1 of 16
HDFS Federation


Suresh Srinivas
@suresh_m_s




                  Page 1
Agenda

• HDFS Background

• Current Limitations

• Federation Architecture

• Federation Details

• Next Steps
HDFS Architecture

                                                        Two main layers
                                                        •   Namespace
Block Storage Namespace




                                                            ›   Consists of dirs, files and blocks
                            Namenode
                                        NS                  ›   Supports create, delete, modify and list files or
                                                                dirs operations
                                Block Management


                                                        •   Block Storage
                           Datanode     …    Datanode       ›   Block Management
                                      Storage                   • Datanode cluster membership
                                                                • Supports create/delete/modify/get block location
                                                                  operations
                                                                • Manages replication and replica placement
                                                            ›   Storage - provides read and write access to
                                                                blocks



                                                                   3
HDFS Architecture…

                                                        Implemented as
Block Storage Namespace




                           Namenode                     • Single Namespace Volume
                                        NS                 › Namespace Volume = Namespace +
                                Block Management             Blocks
                                                        • Single namenode with a namespace
                           Datanode     …    Datanode      › Entire namespace is in memory
                                      Storage              › Provides Block Management
                                                        • Datanodes store block replicas
                                                           › Block files stored on local file system




                                                                4
Limitation - Scalability

 Scalability
 •   Storage scales horizontally - namespace doesn’t
 •   Limited number of files, dirs and blocks
     ›   250 million files and blocks at 64GB Namenode heap size
           •   Still a very large cluster
           •   Facebook clusters are sized at ~70 PB storage



 Performance
 •   File system operations throughput limited by a single node
     ›   120K read ops/sec and 6000 write ops/sec
           •   Easily scalable to 20K write ops/sec by code improvements



                                              5
Limitation - Isolation

 Poor Isolation
 •   All the tenants share a single namespace
     ›   Separate volume for tenants is desirable

 •   Lacks separate namespace for different application
     categories or application requirements
     ›   Experimental apps can affect production apps

     ›   Example - HBase could use its own namespace




                                       6
Limitation – Tight Coupling

Namespace and Block Management are distinct services
• Tightly coupled due to co-location
• Scaling block management independent of namespace is simpler
• Simplifies Namespace and scaling it


Block Storage could be a generic service
• Namespace is one of the applications to use the service
• Other services can be built directly on Block Storage
   › HBase
   ›   Foreign namespaces



                                    7
Isolation is a problem for even small
                 clusters




                   8
HDFS Federation
         Namespace       NN-1                    NN-k                   NN-n

                                                                               Foreig
                                NS1                     NS k                   n NS n
                                           ...                    ...


                                  Pool 1             Pool k              Pool n
         Block Storage




                                                    Block Pools




                          Datanode 1               Datanode 2            Datanode m
                                ...                     ...                    ...
                                                 Common Storage


•   Multiple independent Namenodes and Namespace Volumes in a cluster
     ›   Namespace Volume = Namespace + Block Pool
•   Block Storage as generic storage service
     ›   Set of blocks for a Namespace Volume is called a Block Pool
     ›   DNs store blocks for all the Namespace Volumes – no partitioning
Key Ideas & Benefits

•   Distributed Namespace: Partitioned across namenodes
                                                                                        Alternate NN
     ›     Simple and Robust due to independent masters                                Implementation   HBase
                                                                           HDFS
         • Each master serves a namespace volume                         Namespace                      MR tmp

         • Preserves namenode stability – little namenode code change

     ›     Scalability – 6K nodes, 100K tasks, 200PB and 1 billion files

•   Block Pools enable generic storage service                                       Storage Service


     ›     Enables Namespace Volumes to be independent of each other
     ›     Fuels innovation and Rapid development
            • New implementations of file systems and Applications on top of block storage possible
            • New block pool categories – tmp storage, distributed cache, small object storage

•   In future, move Block Management out of namenode to separate set of nodes
     ›     Simplifies namespace/application implementation
            • Distributed namenode becomes significantly simpler
HDFS Federation Details
• Simple design
   › Little change to the Namenode, most changes in Datanode, Config and Tools
   › Core development in 4 months
   › Namespace and Block Management remain in Namenode
      • Block Management could be moved out of namenode in the future

• Little impact on existing deployments
   › Single namenode configuration runs as is

• Datanodes provide storage services for all the namenodes
   › Register with all the namenodes
   › Send periodic heartbeats and block reports to all the namenodes
   › Send block received/deleted for a block pool to corresponding namenode



                                           11
HDFS Federation Details…
• Cluster Web UI for better manageability
    › Provides cluster summary
    › Namenode list and summary of namenode status
    › Decommissioning status
• Tools
    › Decommissioning works with multiple namespace
    › Balancer works with multiple namespaces
       • Both Datanode storage or Block Pool storage can be balanced

• Namenode can be added/deleted in Federated cluster
    › No need to restart the cluster
• Single configuration for all the nodes in the cluster
Managing Namespaces
•   Federation has multiple namespaces – Don’t                                 Client-side
    you need a single global namespace?                                    /
                                                                               mount-table
     ›   Key is to share the data and the names used
         to access the data

•   A global namespace is one way to do that
                                                               data project home     tmp
•   Client-side mount table is another way to
    share.
     ›   Shared mount-table => “global” shared view
     ›   Personalized mount-table => per-application                                  NS4
         view
         • Share the data that matter by mounting it

•   Client-side implementation of mount tables
                                                         NS1       NS2         NS3
     ›   No single point of failure
     ›   No hotspot for root and top level directories
Next Steps

• Complete separation of namespace and block
  management layers
  › Block storage as generic service

• Partial namespace in memory for further scalability
• Move partial namespace from one namenode to
  another
  › Namespace operation - no data copy
Next Steps…
                          • Namenode as a container for namespaces
                             › Lots of small namespaces
                               •   Chosen per user/tenant/data feed

                               •   Mount tables for unified namespace
                                   •    Can be managed by a central volume server
                 …

     Namenodes               › Move namespace from one container to
                               another for balancing
Datanode   …   Datanode
                          • Combined with partial namespace
                             › Choose number of namenodes to match
                               •   Sum of (Namespace working set)

                               •   Sum of (Namespace throughput)




                                   15
Thank You

More information
1. HDFS-1052 – HDFS Scalability with multiple namenodes
2. An Introduction to HDFS Federation –
  https://hortonworks.com/an-introduction-to-hdfs-federation/

More Related Content

What's hot

Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
Chandler Huang
 
Boosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uringBoosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uring
ShapeBlue
 
Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2
Giuseppe Paterno'
 
ZFS
ZFSZFS
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
HostedbyConfluent
 

What's hot (20)

Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
Apache ZooKeeper
Apache ZooKeeperApache ZooKeeper
Apache ZooKeeper
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Boosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uringBoosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uring
 
Nosql databases
Nosql databasesNosql databases
Nosql databases
 
Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS Scheduler
 
[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...
[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...
[OpenInfra Days Korea 2018] Day 2 - CEPH 운영자를 위한 Object Storage Performance T...
 
Hadoop Security Architecture
Hadoop Security ArchitectureHadoop Security Architecture
Hadoop Security Architecture
 
Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2
 
Data Stores @ Netflix
Data Stores @ NetflixData Stores @ Netflix
Data Stores @ Netflix
 
ZFS
ZFSZFS
ZFS
 
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Kafka to the Maxka - (Kafka Performance Tuning)
Kafka to the Maxka - (Kafka Performance Tuning)Kafka to the Maxka - (Kafka Performance Tuning)
Kafka to the Maxka - (Kafka Performance Tuning)
 

Viewers also liked

Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability | Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Edureka!
 
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)
Adam Kawa
 

Viewers also liked (20)

Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability | Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
 
Apache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS FederationApache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS Federation
 
HDFS NameNode High Availability
HDFS NameNode High AvailabilityHDFS NameNode High Availability
HDFS NameNode High Availability
 
HDFS Namenode High Availability
HDFS Namenode High AvailabilityHDFS Namenode High Availability
HDFS Namenode High Availability
 
Hadoop Adminstration with Latest Release (2.0)
Hadoop Adminstration with Latest Release (2.0)Hadoop Adminstration with Latest Release (2.0)
Hadoop Adminstration with Latest Release (2.0)
 
Hadoop Playlist (Ignite talks at Strata + Hadoop World 2013)
Hadoop Playlist (Ignite talks at Strata + Hadoop World 2013)Hadoop Playlist (Ignite talks at Strata + Hadoop World 2013)
Hadoop Playlist (Ignite talks at Strata + Hadoop World 2013)
 
HDFS Tiered Storage
HDFS Tiered StorageHDFS Tiered Storage
HDFS Tiered Storage
 
Introduction To Elastic MapReduce at WHUG
Introduction To Elastic MapReduce at WHUGIntroduction To Elastic MapReduce at WHUG
Introduction To Elastic MapReduce at WHUG
 
Data Aggregation At Scale Using Apache Flume
Data Aggregation At Scale Using Apache FlumeData Aggregation At Scale Using Apache Flume
Data Aggregation At Scale Using Apache Flume
 
Introduction to Hadoop Ecosystem
Introduction to Hadoop Ecosystem Introduction to Hadoop Ecosystem
Introduction to Hadoop Ecosystem
 
HCatalog
HCatalogHCatalog
HCatalog
 
Quick Introduction to Apache Tez
Quick Introduction to Apache TezQuick Introduction to Apache Tez
Quick Introduction to Apache Tez
 
Hadoop MapReduce Streaming and Pipes
Hadoop MapReduce  Streaming and PipesHadoop MapReduce  Streaming and Pipes
Hadoop MapReduce Streaming and Pipes
 
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)
Hadoop Adventures At Spotify (Strata Conference + Hadoop World 2013)
 
Apache Hadoop In Theory And Practice
Apache Hadoop In Theory And PracticeApache Hadoop In Theory And Practice
Apache Hadoop In Theory And Practice
 
Simplified Data Management And Process Scheduling in Hadoop
Simplified Data Management And Process Scheduling in HadoopSimplified Data Management And Process Scheduling in Hadoop
Simplified Data Management And Process Scheduling in Hadoop
 
Advance Map reduce - Apache hadoop Bigdata training by Design Pathshala
Advance Map reduce - Apache hadoop Bigdata training by Design PathshalaAdvance Map reduce - Apache hadoop Bigdata training by Design Pathshala
Advance Map reduce - Apache hadoop Bigdata training by Design Pathshala
 
Building Continuously Curated Ingestion Pipelines
Building Continuously Curated Ingestion PipelinesBuilding Continuously Curated Ingestion Pipelines
Building Continuously Curated Ingestion Pipelines
 
Storage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Storage Systems for big data - HDFS, HBase, and intro to KV Store - RedisStorage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Storage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
 
Hadoop combiner and partitioner
Hadoop combiner and partitionerHadoop combiner and partitioner
Hadoop combiner and partitioner
 

Similar to HDFS Federation

Gluster open stack dev summit 042011
Gluster open stack dev summit 042011Gluster open stack dev summit 042011
Gluster open stack dev summit 042011
Open Stack
 
Tailoring NAS Proxies for Virtual Machines
Tailoring NAS Proxies for Virtual MachinesTailoring NAS Proxies for Virtual Machines
Tailoring NAS Proxies for Virtual Machines
The Linux Foundation
 
VDI storage and storage virtualization
VDI storage and storage virtualizationVDI storage and storage virtualization
VDI storage and storage virtualization
Sisimon Soman
 
Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay RadiaApache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
Yahoo Developer Network
 

Similar to HDFS Federation (20)

HDFS Futures: NameNode Federation for Improved Efficiency and Scalability
HDFS Futures: NameNode Federation for Improved Efficiency and ScalabilityHDFS Futures: NameNode Federation for Improved Efficiency and Scalability
HDFS Futures: NameNode Federation for Improved Efficiency and Scalability
 
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and DeploymentOct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
 
March 2011 HUG: HDFS Federation
March 2011 HUG: HDFS FederationMarch 2011 HUG: HDFS Federation
March 2011 HUG: HDFS Federation
 
Inexpensive storage
Inexpensive storageInexpensive storage
Inexpensive storage
 
Giraffa - November 2014
Giraffa - November 2014Giraffa - November 2014
Giraffa - November 2014
 
Gluster open stack dev summit 042011
Gluster open stack dev summit 042011Gluster open stack dev summit 042011
Gluster open stack dev summit 042011
 
Evolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage SubsystemEvolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage Subsystem
 
Tutorial Haddop 2.3
Tutorial Haddop 2.3Tutorial Haddop 2.3
Tutorial Haddop 2.3
 
Tailoring NAS Proxies for Virtual Machines
Tailoring NAS Proxies for Virtual MachinesTailoring NAS Proxies for Virtual Machines
Tailoring NAS Proxies for Virtual Machines
 
Data Analytics presentation.pptx
Data Analytics presentation.pptxData Analytics presentation.pptx
Data Analytics presentation.pptx
 
Index file
Index fileIndex file
Index file
 
Hadoop Distributed File System
Hadoop Distributed File SystemHadoop Distributed File System
Hadoop Distributed File System
 
VDI storage and storage virtualization
VDI storage and storage virtualizationVDI storage and storage virtualization
VDI storage and storage virtualization
 
Hdfs architecture
Hdfs architectureHdfs architecture
Hdfs architecture
 
Future of cloud storage
Future of cloud storageFuture of cloud storage
Future of cloud storage
 
HDFS Federation++
HDFS Federation++HDFS Federation++
HDFS Federation++
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
Facebook's HBase Backups - StampedeCon 2012
Facebook's HBase Backups - StampedeCon 2012Facebook's HBase Backups - StampedeCon 2012
Facebook's HBase Backups - StampedeCon 2012
 
Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay RadiaApache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
 
Gluster Webinar: Introduction to GlusterFS
Gluster Webinar: Introduction to GlusterFSGluster Webinar: Introduction to GlusterFS
Gluster Webinar: Introduction to GlusterFS
 

More from Hortonworks

More from Hortonworks (20)

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with Cloudbreak
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 

Recently uploaded

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Recently uploaded (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

HDFS Federation

  • 2. Agenda • HDFS Background • Current Limitations • Federation Architecture • Federation Details • Next Steps
  • 3. HDFS Architecture Two main layers • Namespace Block Storage Namespace › Consists of dirs, files and blocks Namenode NS › Supports create, delete, modify and list files or dirs operations Block Management • Block Storage Datanode … Datanode › Block Management Storage • Datanode cluster membership • Supports create/delete/modify/get block location operations • Manages replication and replica placement › Storage - provides read and write access to blocks 3
  • 4. HDFS Architecture… Implemented as Block Storage Namespace Namenode • Single Namespace Volume NS › Namespace Volume = Namespace + Block Management Blocks • Single namenode with a namespace Datanode … Datanode › Entire namespace is in memory Storage › Provides Block Management • Datanodes store block replicas › Block files stored on local file system 4
  • 5. Limitation - Scalability Scalability • Storage scales horizontally - namespace doesn’t • Limited number of files, dirs and blocks › 250 million files and blocks at 64GB Namenode heap size • Still a very large cluster • Facebook clusters are sized at ~70 PB storage Performance • File system operations throughput limited by a single node › 120K read ops/sec and 6000 write ops/sec • Easily scalable to 20K write ops/sec by code improvements 5
  • 6. Limitation - Isolation Poor Isolation • All the tenants share a single namespace › Separate volume for tenants is desirable • Lacks separate namespace for different application categories or application requirements › Experimental apps can affect production apps › Example - HBase could use its own namespace 6
  • 7. Limitation – Tight Coupling Namespace and Block Management are distinct services • Tightly coupled due to co-location • Scaling block management independent of namespace is simpler • Simplifies Namespace and scaling it Block Storage could be a generic service • Namespace is one of the applications to use the service • Other services can be built directly on Block Storage › HBase › Foreign namespaces 7
  • 8. Isolation is a problem for even small clusters 8
  • 9. HDFS Federation Namespace NN-1 NN-k NN-n Foreig NS1 NS k n NS n ... ... Pool 1 Pool k Pool n Block Storage Block Pools Datanode 1 Datanode 2 Datanode m ... ... ... Common Storage • Multiple independent Namenodes and Namespace Volumes in a cluster › Namespace Volume = Namespace + Block Pool • Block Storage as generic storage service › Set of blocks for a Namespace Volume is called a Block Pool › DNs store blocks for all the Namespace Volumes – no partitioning
  • 10. Key Ideas & Benefits • Distributed Namespace: Partitioned across namenodes Alternate NN › Simple and Robust due to independent masters Implementation HBase HDFS • Each master serves a namespace volume Namespace MR tmp • Preserves namenode stability – little namenode code change › Scalability – 6K nodes, 100K tasks, 200PB and 1 billion files • Block Pools enable generic storage service Storage Service › Enables Namespace Volumes to be independent of each other › Fuels innovation and Rapid development • New implementations of file systems and Applications on top of block storage possible • New block pool categories – tmp storage, distributed cache, small object storage • In future, move Block Management out of namenode to separate set of nodes › Simplifies namespace/application implementation • Distributed namenode becomes significantly simpler
  • 11. HDFS Federation Details • Simple design › Little change to the Namenode, most changes in Datanode, Config and Tools › Core development in 4 months › Namespace and Block Management remain in Namenode • Block Management could be moved out of namenode in the future • Little impact on existing deployments › Single namenode configuration runs as is • Datanodes provide storage services for all the namenodes › Register with all the namenodes › Send periodic heartbeats and block reports to all the namenodes › Send block received/deleted for a block pool to corresponding namenode 11
  • 12. HDFS Federation Details… • Cluster Web UI for better manageability › Provides cluster summary › Namenode list and summary of namenode status › Decommissioning status • Tools › Decommissioning works with multiple namespace › Balancer works with multiple namespaces • Both Datanode storage or Block Pool storage can be balanced • Namenode can be added/deleted in Federated cluster › No need to restart the cluster • Single configuration for all the nodes in the cluster
  • 13. Managing Namespaces • Federation has multiple namespaces – Don’t Client-side you need a single global namespace? / mount-table › Key is to share the data and the names used to access the data • A global namespace is one way to do that data project home tmp • Client-side mount table is another way to share. › Shared mount-table => “global” shared view › Personalized mount-table => per-application NS4 view • Share the data that matter by mounting it • Client-side implementation of mount tables NS1 NS2 NS3 › No single point of failure › No hotspot for root and top level directories
  • 14. Next Steps • Complete separation of namespace and block management layers › Block storage as generic service • Partial namespace in memory for further scalability • Move partial namespace from one namenode to another › Namespace operation - no data copy
  • 15. Next Steps… • Namenode as a container for namespaces › Lots of small namespaces • Chosen per user/tenant/data feed • Mount tables for unified namespace • Can be managed by a central volume server … Namenodes › Move namespace from one container to another for balancing Datanode … Datanode • Combined with partial namespace › Choose number of namenodes to match • Sum of (Namespace working set) • Sum of (Namespace throughput) 15
  • 16. Thank You More information 1. HDFS-1052 – HDFS Scalability with multiple namenodes 2. An Introduction to HDFS Federation – https://hortonworks.com/an-introduction-to-hdfs-federation/