1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Toward Better Multi-Tenancy Support from HDFS
Xiaoyu Yao
Email: xyao@hortonworks.com
About myself
⬢ Member of Technical Staff at Hortonworks since 2014
⬢ Apache Hadoop Committer and PMC member.
⬢ Currently working on HDFS.
⬢ This talk explains HDFS multi-tenancy support and the ongoing work toward better
resource management.
Agenda
⬢ Overview
⬢ Hadoop multi-tenancy features
⬢ HDFS resources and multi-tenancy offerings
⬢ HDFS resource management via resource coupon
⬢ Q&A
Overview
⬢ Centrally managed infrastructure
–Consolidate to simplify management and lower TCO
–Better utilization and efficiency
⬢ Requirements
–Resource Sharing
–Resource Isolation
–Resource Control
Multi-Tenancy Support from Hadoop
         Resource Sharing   Resource Isolation               Resource Management
HBASE    Y                  Namespace, Region Server Group   Quota
YARN     Y                  Queue, Node Label ...            Capacity Scheduler, ...
HDFS     Y                  Federation                       Quota, FairCallQueue, Backoff
HDFS Resources
⬢ Capacity
–Namespace
–Storage Space
–Storage Type
⬢ Operational Resources
–Namenode
•RPC
–Datanode
•Disk & Network
HDFS Resource Sharing/Isolation – Federation
HDFS Capacity Management – Quota
⬢ Quota
–Namespace
–Storage Space
–HDFS-7584 Quota by Storage Types
⬢ Limitations
–Static
–Per directory
–No per user/job control
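To make the "static, per directory" limitation concrete, here is a simplified Python model of a directory quota check. It assumes the storage space quota is charged per replica (file size times replication factor). The class name `DirectoryQuotaSketch` and its fields are hypothetical and do not mirror HDFS internals.

```python
from dataclasses import dataclass


@dataclass
class DirectoryQuotaSketch:
    """Hypothetical model of the quotas attached to one directory."""
    namespace_quota: int   # max number of names (files and directories)
    space_quota: int       # max bytes, counting every replica
    names_used: int = 0
    space_used: int = 0

    def can_create(self, file_bytes: int, replication: int = 3) -> bool:
        """Check both quotas for one new file; the caller's identity plays no part."""
        needed = file_bytes * replication
        return (self.names_used + 1 <= self.namespace_quota
                and self.space_used + needed <= self.space_quota)
```

Note that nothing in the check depends on which user or job is writing, which is the gap the later slides try to close.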
HDFS Operational Resource Management – Namenode RPC Isolation (1)
⬢ Internal RPC
–DN->NN block report, heartbeat, etc.
–ZKFC->NN liveness check
⬢ External RPC
–Client RPCs from HDFS clients such as MR jobs, Hive queries, and HBase
[Figure: Namenode RPC server pipeline: Client → Listener → Readers → Call Queue → Handlers → FSN]
HDFS Operational Resource Management – Namenode RPC Isolation (2)
⬢ Use case:
–HDFS access from normal jobs is impacted by offending jobs
–Internal RPCs are impacted by external RPCs
–One blocked RPC method can affect others
⬢ Protect HDFS internal RPCs:
–Dedicated service RPC server/port
•Isolate DN->NN block report, heartbeat, etc.
–Dedicated lifeline RPC server/port
•Protect ZKFC->NN liveness check
⬢ All external RPCs go to the default port (e.g., 8020)
HDFS Resource Management – Name Node RPC Call Queue
⬢ In a multi-tenant scenario, the call queue should act like a shock absorber that
accommodates different workloads, converting bursty arrivals into smooth,
steady departures.
⬢ Good call queue
–queue without call bloat
–catches and handles bursts with no more than a temporary increase of queue delay
–maximum server utilization
⬢ Bad call queue
–queue that exhibits call bloat
–queue fills up and stays full after bursts
–low utilization and high queue latency
HDFS Resource Management - Fair Call Queue
⬢ Before HADOOP-9640: LinkedBlockingQueue
–Single queue
–Clients block, then time out or fail, when the queue is full
⬢ HADOOP-9640 - Fair Call Queue
–Multiple priority levels, each with its own call queue and processing priority
–Each RPC is assigned a priority by the scheduler
–High-priority RPC calls are put into call queues with a higher probability of being served.
[Figure: Scheduler → Queue 0, Queue ..., Queue 2 → Multiplexer (WRR)]
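The weighted round-robin (WRR) idea can be sketched in a few lines of Python. This is a toy model and not Hadoop's implementation: the class name `FairCallQueueSketch`, the weights, and the per-queue capacity are assumptions chosen for illustration. Each priority level gets a FIFO sub-queue, and the multiplexer serves up to `weights[i]` calls from level `i` before moving on to the next level.

```python
from collections import deque


class FairCallQueueSketch:
    """Toy model: one FIFO sub-queue per priority level, drained by WRR."""

    def __init__(self, weights=(4, 2, 1), capacity_per_queue=100):
        self.weights = list(weights)          # reads per turn for each level (all > 0)
        self.capacity = capacity_per_queue
        self.queues = [deque() for _ in self.weights]
        self._level = 0                       # sub-queue currently being served
        self._credit = self.weights[0]        # reads left before moving on

    def put(self, call, priority):
        """Enqueue at the given level; return False if that sub-queue is full."""
        q = self.queues[priority]
        if len(q) >= self.capacity:
            return False
        q.append(call)
        return True

    def _advance(self):
        self._level = (self._level + 1) % len(self.queues)
        self._credit = self.weights[self._level]

    def take(self):
        """Return the next call in WRR order, or None if every sub-queue is empty."""
        for _ in range(len(self.queues) + 1):
            if self._credit == 0:
                self._advance()               # this level used up its turn
            q = self.queues[self._level]
            if q:
                self._credit -= 1
                return q.popleft()
            self._advance()                   # empty sub-queue: skip it
        return None
```

With weights `(2, 1)`, two calls from queue 0 are served for every call from queue 1, so low-priority calls are slowed down but never starved.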
HDFS Resource Management – Namenode RPC Throttling <1>
⬢ HADOOP-10597 Backoff when the call queue is full
–Send back a retriable exception
–Let the client back off exponentially and retry, instead of blocking, timing out, or
failing the call.
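The client side of this contract can be sketched as follows. This is a minimal Python illustration, not the Hadoop client's retry policy: `RetriableError`, `call_with_backoff`, and all delay parameters are hypothetical names and values.

```python
import random
import time


class RetriableError(Exception):
    """Stand-in for the retriable exception sent when the call queue is full."""


def call_with_backoff(rpc, max_retries=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep, jitter=random.random):
    """Invoke rpc(); on RetriableError wait exponentially longer, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return rpc()
        except RetriableError:
            if attempt == max_retries:
                raise                          # give up after the last retry
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Scale the delay into [50%, 100%] so clients do not retry in lockstep.
            sleep(delay * (0.5 + 0.5 * jitter()))
```

The `sleep` and `jitter` parameters are injectable only to make the sketch easy to exercise without real waiting.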
HDFS Resource Management – Namenode RPC Throttling <2>
⬢ HADOOP-12916 Backoff based on response time
–The basic idea: back off earlier to avoid call queue overload so that the namenode
can recover quickly.
–Low-priority calls get backed off if the response time of high-priority calls is over a
predefined threshold.
–More per-user/per-queue metrics added for troubleshooting.
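One way to read the rule above is sketched below in Python. It assumes that a call is backed off when its own priority level, or any higher-priority level, shows an average response time above that level's threshold. The function name `should_back_off` and the threshold values are illustrative assumptions.

```python
def should_back_off(priority, avg_response_ms, thresholds_ms):
    """Return True if a call at `priority` (0 = highest) should be rejected early.

    avg_response_ms[i] is the recent average response time of level i;
    thresholds_ms[i] is the configured limit for that level.
    """
    return any(avg_response_ms[level] > thresholds_ms[level]
               for level in range(priority + 1))
```

For example, with thresholds of 10, 20, and 30 ms, a slow level 1 (25 ms) backs off calls at levels 1 and 2 while level 0 calls are still admitted.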
HDFS Resource Management – Namenode RPC Throttling <3>
⬢ Abstract the scheduler interface from the call queue for pluggable RPC priority assignment
–DefaultRpcScheduler: all RPC calls get the same priority
–DecayRpcScheduler: carried over from the original FairCallQueue; priority is assigned based on
each user's previous call volume.
–Other experimental schedulers: configurable lists of high-priority users/groups for
low-latency jobs, medium-priority users/groups for normal jobs, and low-priority
users/groups for batch jobs.
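The decay-based assignment can be illustrated with a small Python model. It is a sketch under stated assumptions, not the DecayRpcScheduler code: the class name `DecaySchedulerSketch`, the decay factor, and the share thresholds are all hypothetical. Per-user call counts are periodically multiplied by a decay factor, and a user whose share of recent calls crosses more thresholds is given a lower priority (a higher number).

```python
from collections import defaultdict


class DecaySchedulerSketch:
    """Toy model: heavier recent callers are assigned lower priority."""

    def __init__(self, decay_factor=0.5, thresholds=(0.125, 0.25, 0.5)):
        self.decay_factor = decay_factor
        self.thresholds = thresholds          # share of calls that demotes a user
        self.counts = defaultdict(float)      # user -> decayed call count

    def record_call(self, user):
        self.counts[user] += 1.0

    def decay(self):
        """Run once per period so that old calls gradually stop counting."""
        for user in list(self.counts):
            self.counts[user] *= self.decay_factor

    def priority(self, user):
        """Return 0 for the highest priority, len(thresholds) for the lowest."""
        total = sum(self.counts.values())
        if total == 0:
            return 0
        share = self.counts.get(user, 0.0) / total
        return sum(1 for t in self.thresholds if share >= t)
```

Because priority depends on the share of recent traffic, a user who stops flooding the namenode regains a higher priority as other users' calls accumulate.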
HDFS resource management - QoS
⬢ Use case:
–Allow high performance QoS mechanism with minimum decoding effort on server side
⬢ HADOOP-9194 QoS support for Hadoop RPC
–One byte in the RPC header to facilitate a QoS mechanism
–E.g., differentiate OLTP/OLAP, batch/streaming against the same HDFS
⬢ Limitation
–No mechanism-level implementation yet
HDFS resource management with YARN
⬢ Use Case
–Priority inversion without centralized resource management (e.g., RPC calls from high-priority
YARN jobs may be put into a low-priority HDFS namenode call queue)
–Identify and manage “bad” callers effectively
⬢ Namenode – RPC handler
–FairCallQueue offers fair use of namenode RPC handlers
–No guarantee of differentiation
⬢ Datanode – I/O bandwidth
–No differentiation of writers/readers and bandwidth usage.
–Datanode allows static throttling of balancer I/O.
HDFS Namenode Resource Reservation
⬢ HADOOP-13128 proposes HDFS namenode resource reservation via resource coupons
–From throttling to management
–Similar to delegation token in many aspects
–Works for both Kerberos and non-Kerberos cluster
–Allows only privileged service user to request resource coupons from namenode.
–Coupon can be serialized/de-serialized for use within container.
–Coupon can be renewed for long running jobs or canceled after the intended job is finished.
HDFS Namenode Resource Coupon
⬢ Coupon Identifier
–Finer grain owner (MR job ID, Hive Query ID) to help identify and manage “good” and “bad”
callers
–Resource type (Namenode RPC or Datanode I/O bandwidth)
–Flexible management unit for different resources.
•Min/Max percentage (e.g. Namenode RPC)
•Absolute value (Datanode I/O bandwidth)
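Since the coupon is a proposal, the following Python sketch is purely hypothetical: the class `ResourceCouponSketch`, its field names, and the JSON encoding are assumptions made to illustrate the identifier fields listed above and the idea of serializing a coupon for use inside a container.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceCouponSketch:
    """Hypothetical coupon identifier; not an actual Hadoop class."""
    owner: str                             # fine-grained owner, e.g. MR job ID or Hive query ID
    resource_type: str                     # "NN_RPC" or "DN_IO_BANDWIDTH" (made-up labels)
    min_percent: Optional[float] = None    # percentage unit, e.g. namenode RPC
    max_percent: Optional[float] = None
    absolute_value: Optional[int] = None   # absolute unit, e.g. datanode bytes per second

    def serialize(self) -> str:
        """Encode the coupon so it can be shipped to a container."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def deserialize(cls, payload: str) -> "ResourceCouponSketch":
        """Rebuild the coupon from its serialized form."""
        return cls(**json.loads(payload))
```

A real design would also need signing and expiry, much as delegation tokens have, which this sketch leaves out.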
HDFS Namenode Resource Coupon Manager (RCM)
⬢ Grant/Renew/Cancel resource coupon
⬢ Monitor and report resource usage
⬢ Check and validate resource use requests
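The grant/renew/cancel lifecycle can be sketched in Python as below. Everything here is hypothetical, since the RCM is a proposal: the class `CouponManagerSketch`, the coupon ID format, and the fixed lifetime are illustrative assumptions. The sketch covers granting only to privileged service users, renewal for long-running jobs, cancellation, and validation. Usage monitoring is omitted.

```python
import time


class CouponManagerSketch:
    """Hypothetical manager tracking the expiry of granted coupons."""

    def __init__(self, privileged_users, lifetime_s=600.0, clock=time.time):
        self.privileged_users = set(privileged_users)
        self.lifetime_s = lifetime_s
        self.clock = clock
        self._expiry = {}                  # coupon id -> expiry timestamp
        self._next_id = 0

    def grant(self, requester, owner):
        """Issue a coupon for `owner`; only privileged service users may ask."""
        if requester not in self.privileged_users:
            raise PermissionError(f"{requester} may not request coupons")
        self._next_id += 1
        coupon_id = f"{owner}-{self._next_id}"
        self._expiry[coupon_id] = self.clock() + self.lifetime_s
        return coupon_id

    def renew(self, coupon_id):
        """Extend a still-valid coupon, e.g. for a long-running job."""
        if not self.is_valid(coupon_id):
            raise KeyError(coupon_id)
        self._expiry[coupon_id] = self.clock() + self.lifetime_s

    def cancel(self, coupon_id):
        """Drop the coupon once the intended job is finished."""
        self._expiry.pop(coupon_id, None)

    def is_valid(self, coupon_id):
        """Validate a resource use request that presents this coupon."""
        expiry = self._expiry.get(coupon_id)
        return expiry is not None and self.clock() < expiry
```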
HDFS Namenode Resource Pool
[Figure: the HDFS Namenode Resource Pool is divided into a Fairness Pool and a Managed Pool. Labels: applications supporting Resource Coupon (YARN/HBASE); legacy applications without Resource Coupon]
HDFS Namenode Resource Coupon Manager (RCM)
[Figure: components shown: Client, YARN Resource Manager, HDFS Namenode, RCM, HDFS Datanode, YARN Node Manager, YARN Container; one element is marked NEW]
HDFS Resource Management – Datanode
⬢ Use case:
–When a client writes to HDFS faster than the disk bandwidth of the DNs, it saturates the disk
bandwidth and puts the DNs into an unresponsive state.
–The client can back off only by aborting/recovering the pipeline, which causes failed writes and
unnecessary pipeline recovery.
⬢ Static I/O Throttling
–HDFS-7265 Support HDFS IO throttling
–HDFS-9796 Use a throttler for replica write in datanode
–HDFS-4412 Add throttler for datanode bandwidth
–HADOOP-10410 datanode QoS via ioprio_set on DataXceiver thread
⬢ Dynamic I/O Throttling
–HDFS-7270 Add congestion signaling capability to DataNode write pipeline (ECN)
⬢ Future work: I/O bandwidth reservation with resource coupon
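Static throttling of the kind listed above can be illustrated with a token-bucket sketch in Python. This is not the datanode's throttler: the class `BandwidthThrottlerSketch` and its one-second burst allowance are assumptions. The clock and sleep functions are injectable so the behaviour can be checked without real waiting.

```python
import time


class BandwidthThrottlerSketch:
    """Toy token bucket that caps throughput at a fixed bytes-per-second rate."""

    def __init__(self, bytes_per_second, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(bytes_per_second)
        self.clock = clock
        self.sleep = sleep
        self.tokens = self.rate            # allow up to one second of burst
        self.last = clock()

    def _refill(self):
        """Add tokens for the time elapsed since the last call, capped at one second's worth."""
        now = self.clock()
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def throttle(self, num_bytes):
        """Account for num_bytes and sleep if the budget is overdrawn.

        Returns the number of seconds waited.
        """
        self._refill()
        self.tokens -= num_bytes
        waited = 0.0
        if self.tokens < 0:
            waited = -self.tokens / self.rate   # time needed to repay the deficit
            self.sleep(waited)
        return waited
```

The limit is fixed at construction time, which is what makes this form of throttling static: it cannot react to congestion or give one tenant more bandwidth than another.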
Thank you!
Q&A
Toward Better Multi-Tenancy Support from HDFS

  • 1. 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Toward Better Multi- Tenancy Support from HDFS Xiaoyu Yao Email: xyao@hortonworks.com
  • 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved About myself ⬢ Member of Technical Staff at Hortonworks since 2014 ⬢ Apache Hadoop Committer and PMC member. ⬢ Currently working on HDFS. ⬢ This talk is to help better understanding of HDFS multi-tenancy support and ongoing work for better resource management.
  • 3. 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda ⬢ Overview ⬢ Hadoop multi-tenancy features ⬢ HDFS resources and multi-tenancy offerings ⬢ HDFS resource management via resource coupon ⬢ Q&A
  • 4. 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Overview ⬢ Centrally managed infrastructure –Consolidate to simplify management and lower TCO –Better utilization and efficiency ⬢ Requirement –Resource Sharing –Resource Isolation –Resource Control
  • 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Multi-Tenancy Support from Hadoop Resource Sharing Resource Isolation Resource Management HBASE Y Namespace, Region Server Group Quota YARN Y Queue, Node Label ... Capacity Scheduler, ... HDFS Y Federation Quota, FairCallQueue, Backoff
  • 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved HDFS Resources ⬢ Capacity –Namespace –Storage Space –Storage Type ⬢ Operational Resources –Namenode •RPC –Datanode •Disk & Network
  • 7. 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved HDFS Resource Sharing/Isolation – Federation
  • 8. 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved HDFS Capacity Management – Quota ⬢ Quota –Namespace –StorageSpace –HDFS-7584 Quota by Storage Types ⬢ Limitations –Static –Per directory –No per user/job control
  • 9. 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved HDFS Operational Resource Management – Namenode RPC Isolation (1) ⬢Internal RPC –DN->NN block report, heartbeat, etc. –ZKFC->NN liveness check ⬢External RPC –Client RPCs from HDFSClients such as MR jobs/Hive queries/HBase Client Listener Reader Reader Call Queue Handler Handler Handler FSN
  • 10. 10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved HDFS Operational Resource Management – Namenode RPC Isolation (2) ⬢Use case: –HFDS access from normal jobs impacted by offending jobs –Internal RPCs impacted by External RPCs –One blocked RPC method could affect others ⬢Protect HDFS internal RPCs: –Dedicated service RPC server/port •Isolate DN->NN block report, heartbeat, etc. –Dedicated lifeline RPC server/port •Protect ZKFC->NN liveness check ⬢All external RPCs go to the default port (e.g., 8020)
  • 11. 11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved HDFS Resource Management – Name Node RPC Call Queue ⬢ In multi-tenancy scenario, call queue should play an important role like a shock absorber to accommodate different workload, converting busty arrivals into smooth, steady departures. ⬢ Good call queue –queue without call bloat –catches and handles bursts with no more than a temporary increase of queue delay –maximum server utilization ⬢ Bad call queue –queue that exhibits call bloat –queue filled up and stay filled upon bursts –low utilization and high queue latency
  • 12. 12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved HDFS Resource Management - Fair Call Queue ⬢ Before HADOOP-9640 LinkedBlockingQueue –Single queue –Client blocked and timeout/fail when queue is full ⬢ HADOOP-9640 - Fair Call Queue –Multiple priority levels and call queues with different processing priority –Each RPC is assigned a priority by scheduler –High priority RPC calls are put into call queue with higher probability of being executed. Scheduler Queue 0 Queue ... Queue 2 Multiplexer (WRR)
  • 13. 13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved HDFS Resource Management – Namenode RPC Throttling <1> ⬢ HADOOP-10597 Backoff when the call queue is full –Send back a Retriable exception –Let the client do exponential wait and retry instead of blocking/timeout/failed the call.
  • 14. HDFS Resource Management – Namenode RPC Throttling <2>
    ⬢ HADOOP-12916 Backoff based on response time
      – The basic idea: back off earlier to avoid call queue overload so that the namenode can recover quickly
      – Low-priority calls get backed off if the response time of high-priority calls exceeds a predefined threshold
      – More per-user/per-queue metrics added for troubleshooting
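A rough sketch of the rule stated on this slide: a call is backed off when any higher-priority level has an average response time above its threshold. The class `ResponseTimeBackoff`, the running-average bookkeeping, and the threshold values are assumptions for illustration; the actual HADOOP-12916 logic may differ in detail.

```python
class ResponseTimeBackoff:
    """Tracks average response time per priority level (0 = highest)."""

    def __init__(self, thresholds_ms):
        # thresholds_ms[i]: hypothetical response-time limit for level i.
        self.thresholds_ms = list(thresholds_ms)
        self.total_ms = [0.0] * len(self.thresholds_ms)
        self.count = [0] * len(self.thresholds_ms)

    def record(self, priority, response_ms):
        self.total_ms[priority] += response_ms
        self.count[priority] += 1

    def average_ms(self, priority):
        n = self.count[priority]
        return self.total_ms[priority] / n if n else 0.0

    def should_backoff(self, priority):
        # Back off when any strictly higher-priority level is over its threshold.
        return any(self.average_ms(p) > self.thresholds_ms[p]
                   for p in range(priority))


tracker = ResponseTimeBackoff([10.0, 20.0])
tracker.record(0, 30.0)                  # high-priority calls are slow
low_backed_off = tracker.should_backoff(1)
high_backed_off = tracker.should_backoff(0)
```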
  • 15. HDFS Resource Management – Namenode RPC Throttling <3>
    ⬢ Abstract the scheduler interface from the call queue for pluggable RPC priority assignment
      – DefaultRpcScheduler: all RPC calls get the same priority
      – DecayRpcScheduler: taken from the original FairCallQueue; priority is assigned based on each user's previous call volume
      – Other experimental schedulers: a configurable list of high-priority users/groups for low-latency jobs, medium-priority users/groups for normal jobs, and low-priority users/groups for batch jobs
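To illustrate how priority can follow each user's previous call volume, here is a small sketch of a decaying call counter. `DecayScheduler`, its decay factor, and its share thresholds are hypothetical illustration values, not the DecayRpcScheduler code or its configuration.

```python
class DecayScheduler:
    """Assigns a priority level from each user's share of recent calls."""

    def __init__(self, decay_factor=0.5, thresholds=(0.125, 0.25, 0.5)):
        # thresholds split the call share into len(thresholds) + 1 levels;
        # level 0 is the highest priority. All values here are illustrative.
        self.decay_factor = decay_factor
        self.thresholds = thresholds
        self.counts = {}

    def decay(self):
        """Run periodically so that old calls count for less over time."""
        for user in self.counts:
            self.counts[user] *= self.decay_factor

    def priority_for(self, user):
        """Count this call and return the priority level for it."""
        self.counts[user] = self.counts.get(user, 0.0) + 1.0
        share = self.counts[user] / sum(self.counts.values())
        # Each threshold the share reaches pushes the user one level lower.
        return sum(1 for t in self.thresholds if share >= t)


sched = DecayScheduler()
for _ in range(9):
    sched.priority_for("heavy")
light_level = sched.priority_for("light")   # 1 of 10 calls
heavy_level = sched.priority_for("heavy")   # 10 of 11 calls
```

A heavy caller drifts toward the lowest priority, and `decay()` lets it recover once it slows down.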
  • 16. HDFS Resource Management – QoS
    ⬢ Use case:
      – Allow a high-performance QoS mechanism with minimum decoding effort on the server side
    ⬢ HADOOP-9194 QoS support for Hadoop RPC
      – One byte in the RPC header to facilitate the QoS mechanism
      – E.g., differentiate OLTP/OLAP and batch/streaming workloads against the same HDFS
    ⬢ Limitation
      – No mechanism-level implementation yet
  • 17. HDFS Resource Management with YARN
    ⬢ Use case
      – Priority inversion without centralized resource management (e.g., RPC calls from high-priority YARN jobs may be put into a low-priority HDFS namenode call queue)
      – Identify and manage "bad" callers effectively
    ⬢ Namenode – RPC handler
      – FairCallQueue offers fair use of namenode RPC handlers
      – No guarantee of differentiation
    ⬢ Datanode – I/O bandwidth
      – No differentiation of writers/readers and bandwidth usage
      – Datanode allows static throttling of balancer I/O
  • 18. HDFS Namenode Resource Reservation
    ⬢ HADOOP-13128 proposes HDFS namenode resource reservation via resource coupons
      – From throttling to management
      – Similar to delegation tokens in many aspects
      – Works for both Kerberos and non-Kerberos clusters
      – Allows only privileged service users to request resource coupons from the namenode
      – A coupon can be serialized/deserialized for use within a container
      – A coupon can be renewed for long-running jobs or canceled after the intended job is finished
  • 19. HDFS Namenode Resource Coupon
    ⬢ Coupon Identifier
      – Finer-grained owner (MR job ID, Hive query ID) to help identify and manage "good" and "bad" callers
      – Resource type (Namenode RPC or Datanode I/O bandwidth)
      – Flexible management unit for different resources
        • Min/max percentage (e.g., Namenode RPC)
        • Absolute value (Datanode I/O bandwidth)
  • 20. HDFS Namenode Resource Coupon Manager (RCM)
    ⬢ Grant/renew/cancel resource coupons
    ⬢ Monitor and report resource usage
    ⬢ Check and validate resource use requests
  • 21. HDFS Namenode Resource Pool
    [Diagram labels: HDFS Namenode Resource Pool; Fairness Pool; Managed Pool; Applications supporting Resource Coupon (YARN/HBASE); Legacy Applications without Resource Coupon]
  • 22. HDFS Namenode Resource Coupon Manager (RCM)
    [Diagram labels: NEW; Client; YARN Resource Manager; HDFS Namenode; RCM; HDFS Datanode; YARN Node Manager; YARN Container]
  • 23. HDFS Resource Management – Datanode
    ⬢ Use case:
      – When a client writes to HDFS faster than the disk bandwidth of the DNs, it saturates the disk bandwidth and puts the DNs into an unresponsive state
      – The client only backs off by aborting/recovering the pipeline, which causes failed writes and unnecessary pipeline recovery
    ⬢ Static I/O Throttling
      – HDFS-7265 Support HDFS IO throttling
      – HDFS-9796 Use a throttler for replica write in datanode
      – HDFS-4412 Add throttler for datanode bandwidth
      – HADOOP-10410 datanode QoS via ioprio_set on DataXceiver thread
    ⬢ Dynamic I/O Throttling
      – HDFS-7270 Add congestion signaling capability to DataNode write pipeline (ECN)
    ⬢ Future work: I/O bandwidth reservation with resource coupon
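Static I/O throttling of the kind listed on this slide can be pictured with a token bucket, a generic rate-limiting technique used here only for illustration. The class `TokenBucketThrottler` and its parameters are hypothetical, and this is not a description of how the datanode throttler in the referenced JIRAs is implemented.

```python
import time


class TokenBucketThrottler:
    """Limits throughput to bytes_per_sec, allowing bursts up to burst_bytes."""

    def __init__(self, bytes_per_sec, burst_bytes, clock=time.monotonic):
        self.rate = float(bytes_per_sec)
        self.capacity = float(burst_bytes)
        self.tokens = self.capacity      # start with a full bucket
        self.clock = clock
        self.last = clock()

    def _refill(self):
        now = self.clock()
        earned = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + earned)
        self.last = now

    def delay_for(self, nbytes):
        """Account for nbytes and return how many seconds the writer should wait."""
        self._refill()
        self.tokens -= nbytes            # may go negative: that is the debt
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate  # time needed to pay off the debt


# Usage with a fake clock so the example does not depend on real time.
now = [0.0]
throttler = TokenBucketThrottler(100, 100, clock=lambda: now[0])
first = throttler.delay_for(100)    # fits in the initial burst
second = throttler.delay_for(50)    # 50 bytes over budget at 100 B/s
now[0] = 1.0                        # one second later, 100 tokens are earned
third = throttler.delay_for(50)
```

A writer would sleep for the returned delay before sending the next packet, which slows it down instead of forcing pipeline recovery.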
  • 24. Thank you! Q&A

Editor's Notes

  1. move the yarn pic here
  2. server/client
  3. Bandwidth via ioprio for the DFSClient and xceiver thread; there may be no standard across operating systems
  4. Reservation based dynamic throttling utilizes existing DataXceiver bandwidth throttling