Apache Hadoop 3.0 What's new in YARN and MapReduce

1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Apache Hadoop 3.0: What’s new in YARN & MapReduce
Tokyo, Oct.26 2016
Junping Du
junping_du@apache.org

About Speakers
⬢ Junping Du
– Apache Hadoop Committer & PMC member
– Lead Software Engineer @ Hortonworks YARN Core Team
– 10+ years developing enterprise software (5+ years as a “Hadooper”)

Agenda
⬢ Evolutions in YARN & MR (Done and In Progress)
⬢ Timeline Estimation for Apache Hadoop 3.0 Release

First, A bit of Vision…
⬢ The evolution of Hadoop starts with YARN
⬢ YARN Evolution will continue to drive Hadoop forward
Hadoop 3

Several important trends in the age of Hadoop 3.0+
[Figure: platform stack]
⬢ YARN and other platform services: Storage, Resource Management, Security, Service Discovery, Management, Monitoring, Alerts, Governance
⬢ IoT assembly: Kafka, Storm, HBase, Solr
⬢ MR, Tez, Spark, …
⬢ Innovating frameworks: Flink, DL (TensorFlow), etc.
⬢ Various environments: On Premise, Private Cloud, Public Cloud

Evolutions in YARN & MR
⬢ Re-architecture for YARN Timeline Service - ATS v2
⬢ Service Native Support in YARN
⬢ YARN Scheduling Enhancements
⬢ More Cloud Friendly
⬢ Better User Experiences
⬢ Other Enhancements

Timeline Service Revolution – ATS v2
⬢ Why ATS v2?
– Scalability & Performance
To get rid of v1 limitations:
• Single global instance of writer/reader
• Local-disk-based LevelDB storage
– Usability
• Handle flows as first-class concepts and model aggregation
• Add configuration and metrics as first-class members
• Better support for queries
– Reliability
v1 limitations:
• Data is stored on a local disk
• Single point of failure (SPOF) for the timeline server
– Flexibility
• Data model is more descriptive
• Can be extended with more application-specific info

Core Design for ATS v2
⬢ Distributed write path
– Logical per-app collector + physical per-node writer
– Collector/Writer launched as an auxiliary service in NM
– Standalone writers will be added later
⬢ Pluggable backend storage
– Built in with a scalable and reliable implementation (HBase)
⬢ Enhanced data model
– Entity (bi-directional relation) with flow, queue, etc.
– Configuration, Metric, Event, etc.
⬢ Separate reader instances
⬢ Aggregation & Accumulation
– Aggregation: rolling up the metric values to the parent
• Online aggregation for apps and flow runs
• Offline aggregation for users, flows and queues
– Accumulation: rolling up the metric values across time intervals
• Accumulated resource consumption for app, flow, etc.
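
The two roll-ups named above can be sketched in a few lines of Python. This is only an illustration: the function names, the metric names, and the `(timestamp, value)` sample format are assumptions made for the example and are not ATS v2 APIs.

```python
from collections import defaultdict


def aggregate_to_parent(child_metrics):
    """Aggregation: sum each metric across children (e.g. apps -> flow run)."""
    totals = defaultdict(int)
    for metrics in child_metrics.values():
        for name, value in metrics.items():
            totals[name] += value
    return dict(totals)


def accumulate_over_time(samples):
    """Accumulation: roll one metric up across time.

    `samples` is a list of (timestamp_seconds, value) pairs sorted by time.
    Each value is assumed to hold until the next sample arrives.
    """
    total = 0
    for (t0, v0), (t1, _) in zip(samples, samples[1:]):
        total += v0 * (t1 - t0)
    return total


# Hypothetical per-app metrics belonging to one flow run.
apps = {
    "app_1": {"memory_mb": 2048, "vcores": 2},
    "app_2": {"memory_mb": 1024, "vcores": 1},
}
flow_run_metrics = aggregate_to_parent(apps)

# Hypothetical memory samples: 2048 MB for 10 s, then 1024 MB for 20 s.
memory_samples = [(0, 2048), (10, 1024), (30, 0)]
memory_mb_seconds = accumulate_over_time(memory_samples)
```

Aggregation combines values from sibling entities at one point in time, while accumulation combines values of one entity over an interval, which is why the sketch keeps them as two separate functions.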

ATS v2 Architecture
[Figure: ATS v2 architecture]
Components: Resource Manager (RMApp, info of collectors {app_1, app_2, …}); NodeManagers NM_1 … NM_n, each with an NM Collector Service and a Container Monitor; app_1 AM and app_1 container; per-app collectors (app_1 Collector … app_n Collector) running as an Aux Service; Timeline Writer; HBase; Timeline Reader serving user queries.
Labelled flows: Sync, AM timeline info, RM app events, container metric info, container info (to be added).

Data Model in ATS v2
⬢ Entity
– ID + Type
– Configurations
– Metadata (Info)
– Parent-Child Relationships
– Metrics
– Events
⬢ Metric
– ID
– Metadata
– Single value or time series (with timestamps)
⬢ Event
– ID
– Metadata
– Timestamp
⬢ First-class entities
– Cluster: Type, Cluster Attributes
– Flow: Type, User, Flow Runs, Flow Attributes
– Flow Run: Type, User, Running apps, Flow Run Attributes
– Application: Type, User, Flow + Run, Queue, Attempts
– Attempt: Type, Application, Queue, Containers
– Container: Type, Attempt, Attributes
⬢ Aggregation
– User: Username (ID), Aggregated metrics
– Queue: Queue (ID), Sub queues, Aggregated metrics
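
As a rough illustration of the data model on this slide, the entity, metric and event shapes could be written as Python dataclasses. The class names, field names and the entity type strings below are assumptions chosen for the sketch; they are not the actual ATS v2 Java classes or type names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Metric:
    id: str
    # A single value is a one-element list; a time series has many
    # (timestamp, value) pairs.
    values: List[Tuple[int, float]] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)

    def latest(self) -> Optional[float]:
        """Return the value with the newest timestamp, if any."""
        if not self.values:
            return None
        return max(self.values, key=lambda tv: tv[0])[1]


@dataclass
class Event:
    id: str
    timestamp: int
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Entity:
    id: str
    type: str
    configs: Dict[str, str] = field(default_factory=dict)
    info: Dict[str, str] = field(default_factory=dict)
    metrics: List[Metric] = field(default_factory=list)
    events: List[Event] = field(default_factory=list)
    # Parent link is excluded from repr/compare to avoid cycles.
    parent: Optional["Entity"] = field(default=None, repr=False, compare=False)
    children: List["Entity"] = field(default_factory=list)

    def add_child(self, child: "Entity") -> None:
        """Record the parent-child relationship in both directions."""
        child.parent = self
        self.children.append(child)


# Placeholder ids and type labels: application -> attempt -> container.
app = Entity(id="app_1", type="application")
attempt = Entity(id="attempt_1", type="attempt")
container = Entity(id="container_1", type="container")
app.add_child(attempt)
attempt.add_child(container)
container.metrics.append(Metric(id="memory_mb", values=[(1000, 512.0), (2000, 768.0)]))
```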

Status for ATS v2
⬢ For other details, like:
– Aggregations (app/flow/user/queue level, offline or online)
– HBase table schema for EntityTable, ApplicationTable, FlowRunTable, etc.
– Reader APIs (RESTful)
Please refer to the previous talk at Hadoop Summit 2016 San Jose:
https://www.youtube.com/watch?v=adV-DFa-8us&index=6&list=PLKnYDs_-dq16K1NH83Bke2dGGUO3YKZ5b
⬢ Status
–Phase I (YARN-2928): already released as an alpha feature in 3.0.0-alpha1
–Phase II (YARN-5355): In progress

Native Service Support in YARN
⬢ A native YARN framework: YARN-4692
– Abstract a common framework (similar to Slider) to support long-running services
– Simpler API
⬢ Better support for long-running services
– Recognition of long-running services
• Affects the policy of preemption, container reservation, etc.
– Auto-restart of containers
• Containers in long-running services are more stateful
– Service/application upgrade support
• More services are expected to run long enough to span versions
– Dynamic container configuration
• Reserve resources only when they are needed

API Simplification - REST
⬢ Existing APIs are too low-level and not easy to work with.
⬢ Simple REST API layer fronting YARN
– YARN-4793: Simplified API layer for services and beyond
⬢ Create and manage the lifecycle of YARN services.
Example: ZooKeeper App

Discovery services in YARN
⬢ YARN Service Discovery via DNS: YARN-4757
– Expose existing service information in the YARN registry via DNS
• Records in the current YARN service registry will be converted into DNS entries
– Enable container-to-IP mappings, so that container IPs can be discovered via standard DNS lookups
• Application
– zkapp1.user1.yarncluster.com -> 192.168.10.11:8080
• Container
– container-1454001598828-0001-01-00004.yarncluster.com -> 192.168.10.18
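
The naming pattern in the two examples above can be sketched as follows. The helper names are hypothetical, and the rules (application name, then user, then cluster domain; underscores in a container ID replaced by hyphens) are inferred from the slide's examples only, not taken from the YARN-4757 implementation.

```python
def application_dns_name(app_name: str, user: str, domain: str) -> str:
    """Build an application name in the <app>.<user>.<domain> pattern."""
    return f"{app_name}.{user}.{domain}"


def container_dns_name(container_id: str, domain: str) -> str:
    """Build a container name in the <container-id>.<domain> pattern.

    Hostnames may not contain underscores, so this sketch assumes they
    are replaced by hyphens.
    """
    return f"{container_id.replace('_', '-')}.{domain}"


app_name = application_dns_name("zkapp1", "user1", "yarncluster.com")
container_name = container_dns_name(
    "container_1454001598828_0001_01_00004", "yarncluster.com"
)
```

Once such records exist, a client would resolve them with any standard resolver (for example `socket.gethostbyname` in Python) rather than calling a YARN-specific API.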

More Cloud Friendly
⬢ Elastic
–Dynamic Resource Configuration
•YARN-291
•Allows tuning an NM’s resources up or down at runtime
–Graceful decommissioning of NodeManagers
•YARN-914
•Drains a node that’s being decommissioned to allow running containers to finish
⬢ Efficient
–Support for container resizing
•YARN-1197
•Allows applications to change the size of an existing container
–Task-level native optimization
•MAPREDUCE-2841
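
The draining behaviour of graceful decommissioning can be illustrated with a small state machine. This is a toy model under stated assumptions: the `DrainingNode` class and its methods are hypothetical, and the real feature (YARN-914) lives in the ResourceManager and NodeManager and handles more cases than this sketch.

```python
class DrainingNode:
    """Toy model of a node that drains before it is decommissioned."""

    def __init__(self, name, running):
        self.name = name
        self.running = set(running)  # IDs of containers still running
        self.state = "RUNNING"

    def start_decommission(self):
        # Stop taking new work; finish immediately if nothing is running.
        self.state = "DECOMMISSIONING" if self.running else "DECOMMISSIONED"

    def can_accept(self):
        """Only a node in the RUNNING state accepts new containers."""
        return self.state == "RUNNING"

    def container_finished(self, container_id):
        self.running.discard(container_id)
        if self.state == "DECOMMISSIONING" and not self.running:
            self.state = "DECOMMISSIONED"


node = DrainingNode("nm_1", running=["c1", "c2"])
node.start_decommission()
accepts_new = node.can_accept()
node.container_finished("c1")
state_mid = node.state
node.container_finished("c2")
state_end = node.state
```

The point of the intermediate state is that running containers are allowed to finish, while the scheduler stops placing new ones on the node.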

More Cloud Friendly (Contd.)
⬢ Isolation
–Embrace container technology to achieve better isolation
–Resource isolation support for disk and network
•YARN-2619 (disk), YARN-2140 (network)
•Containers get a fair share of disk and network resources using Cgroups
–Docker support in LinuxContainerExecutor
•YARN-3611
•Supports launching Docker containers alongside process-based containers
•Packaging and resource isolation
⬢ Operation
–Container upgrades (YARN-4726)
•“Do an upgrade of my Spark / HBase apps with minimal impact to end-users”
–Work-preserving AM restart
•MAPREDUCE-6608

Scheduling Enhancements
⬢ Application priorities: YARN-1963
– Priority support within a queue
⬢ Affinity / anti-affinity: YARN-1042
– More constraints on container locations
⬢ Global Scheduling: YARN-5139
– Get rid of the per-node scheduling model
– Improve container scheduling throughput
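
To make the first item concrete, here is a minimal sketch of ordering applications by priority inside one queue. The `AppQueue` class is hypothetical, and two behaviours are assumptions of the sketch rather than statements about the YARN scheduler: a larger number means a higher priority, and applications with equal priority are served in submission order.

```python
import heapq
import itertools


class AppQueue:
    """Orders applications in one queue by priority, then by arrival."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker for equal priority

    def submit(self, app_id, priority):
        # heapq is a min-heap, so the priority is negated to pop the
        # highest priority first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), app_id))

    def next_app(self):
        """Return the next application to schedule, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = AppQueue()
queue.submit("app_1", priority=1)
queue.submit("app_2", priority=5)
queue.submit("app_3", priority=5)
order = [queue.next_app() for _ in range(3)]
```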

Operational and User Experience Enhancements (YARN-3368)

Other YARN work that could be released in Hadoop 3.x
⬢ Resource profiles
–YARN-3926
–Users can specify a resource profile name instead of individual resources
–Resource types are read from a config file
⬢ YARN federation
–YARN-2915
–Allows YARN to scale out to tens of thousands of nodes
–A cluster of clusters that appears as a single cluster to the end user
⬢ Gang Scheduling
–YARN-624
More details in tomorrow’s noon session “Apache Hadoop YARN: Past, Present and Future” by Junping Du and Jian He

Release Timeline for Apache Hadoop 3.0
⬢ 3.0.0-alpha1 was released on Sep 3, 2016
⬢ alpha2 in Q4 2016 (estimated)
⬢ beta1 in early Q1 2017 (estimated)
⬢ GA in Q1/Q2 2017 (estimated)

HDP Evolution with Apache Hadoop and YARN
