Multi-Tenant Data Cloud
with YARN & Helix
Kishore Gopalakrishna
@kishore_b_g
LinkedIn - Data infra: Helix, Espresso
Yahoo - Ads infra: S4
Thursday, June 5, 2014
What is YARN
Next Generation Compute Platform
- Hadoop 1.0: MapReduce on HDFS
- Hadoop 2.0: MapReduce and Others (Batch, Interactive, Online, Streaming) on YARN (cluster resource management) on HDFS
[Diagram: YARN enables containers of multiple applications (A1-A3, B1-B5, C1-C5) to run on the same cluster]
YARN Architecture
- Client: submit job to the Resource Manager
- Node Managers: report node status to the Resource Manager
- Application Master: sends container requests to the Resource Manager
- Containers: launched on the Node Managers
- App Package: kept in the HDFS/Common Area
So, let’s build something
Example System
[Diagram: Generate Data (M/R writing to HDFS), then Serve (Redis servers)]
- Generate data in Hadoop
- Use it for serving
Example System
Requirements
- Big Data :-)
- Partitioned, replicated
- Fault tolerant, Scalable
- Efficient resource utilization
Application Master
- Request Containers
- Assign work
- Handle Failure
- Handle workload Changes
Allocation + Assignment
- M/R job generates partitioned data (p1 p2 p3 p4 p5 p6) in HDFS
- Multiple servers to serve the partitioned data
- Container Allocation - data affinity, rack aware placement
- Partition Assignment - affinity, even distribution
- Replica Placement - on different physical machines
[Diagram: Server 1: p1 p2 p5 p4; Server 2: p3 p4 p1 p6; Server 3: p5 p6 p3 p2]
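The placement rules on this slide (even distribution, replicas on different machines) can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the algorithm Helix uses: the function name `assign_replicas` and the round-robin strategy are made up for the example, and data affinity and rack awareness are ignored.

```python
from collections import defaultdict


def assign_replicas(partitions, servers, replicas):
    """Round-robin placement (illustrative, not Helix's algorithm).

    Replica r of the i-th partition goes to server (i + r) mod N, so the
    replicas of one partition always land on distinct servers.
    """
    if replicas > len(servers):
        raise ValueError("need at least as many servers as replicas")
    assignment = defaultdict(list)
    for r in range(replicas):
        for i, partition in enumerate(partitions):
            # Shifting by r keeps this replica away from the earlier ones.
            server = servers[(i + r) % len(servers)]
            assignment[server].append(partition)
    return dict(assignment)
```

With six partitions, three servers, and two replicas, every server ends up with four partitions and never holds the same partition twice, which is the shape of the layout in the diagram.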
Failure Handling
- On Failure - Even load distribution, while waiting for new container
- Acquire new container close to data if possible
- Assign failed partitions to new container
[Diagram: Server 3 fails; its partitions (p5 p6 p3 p2) are spread over Server 1 and Server 2, then assigned to the new container, Server 4]
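A minimal sketch of the "even load distribution while waiting" step, assuming the assignment is a plain dict of server name to partition list. The function `redistribute` is a hypothetical helper written for this example, not part of Helix.

```python
def redistribute(assignment, failed):
    """Spread the failed server's partitions over the least-loaded survivors."""
    survivors = {s: list(ps) for s, ps in assignment.items() if s != failed}
    for partition in assignment[failed]:
        # Only consider survivors that do not already hold this partition.
        candidates = [s for s in survivors if partition not in survivors[s]]
        if not candidates:
            continue  # every survivor has it; wait for a new container
        target = min(candidates, key=lambda s: len(survivors[s]))
        survivors[target].append(partition)
    return survivors
```

Applied to the three-server layout from the earlier slide with the third server failing, each of the two survivors takes over two of the four orphaned partitions.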
Workload Changes
- Monitor - CPU, Memory, Latency, TPS
- Workload change - Acquire/Release containers
- Container change - Re-distribute work
[Diagram: a fourth server is added and one partition moves from each existing server. Server 1: p1 p2 p5; Server 2: p3 p4 p1; Server 3: p5 p6 p3; new server: p4 p6 p2]
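The re-distribution in the diagram moves only as many partitions as the new server needs. The sketch below shows one way to get that behaviour; `expand` is a hypothetical helper for illustration, and a real rebalancer would also weigh affinity and in-flight state transitions.

```python
def expand(assignment, new_server):
    """Add an empty server and move just enough partitions to even the load."""
    result = {s: list(ps) for s, ps in assignment.items()}
    result[new_server] = []
    total = sum(len(ps) for ps in result.values())
    fair_share = total // len(result)
    moved = []
    while len(result[new_server]) < fair_share:
        # Take from the currently most-loaded existing server.
        donor = max((s for s in result if s != new_server),
                    key=lambda s: len(result[s]))
        # Prefer the donor's last partition that the new server lacks.
        movable = [p for p in reversed(result[donor])
                   if p not in result[new_server]]
        if not movable:
            break  # this donor has nothing new to give; stop the sketch here
        partition = movable[0]
        result[donor].remove(partition)
        result[new_server].append(partition)
        moved.append(partition)
    return result, moved
```

Starting from the three-server layout, this moves exactly three partitions (one per existing server) and leaves everything else in place.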
Service Discovery
- Discover everything, what is running where
- Dynamically updated on changes
[Diagram: Clients use Service Discovery to find which of Server 1, Server 2, Server 3 hosts each partition]
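Service discovery here amounts to a partition-to-server view that is refreshed whenever the assignment changes. A small sketch, assuming an in-memory directory with callbacks; the names `routing_table` and `ServiceDirectory` are invented for this example and are not Helix APIs.

```python
def routing_table(assignment):
    """Invert server -> partitions into partition -> servers."""
    table = {}
    for server, partitions in assignment.items():
        for partition in partitions:
            table.setdefault(partition, []).append(server)
    return table


class ServiceDirectory:
    """Keeps the latest routing table and notifies subscribed clients."""

    def __init__(self):
        self._table = {}
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def update(self, assignment):
        # Called whenever the assignment changes; pushes the new view out.
        self._table = routing_table(assignment)
        for callback in self._listeners:
            callback(self._table)

    def lookup(self, partition):
        return list(self._table.get(partition, []))
```

A client subscribes once and then calls `lookup("p1")` to find the servers currently holding that partition.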
Building YARN Application
Writing AM is Hard and Error Prone
Handling Faults, Workload Changes is non-trivial and often overlooked
- Request container: How many containers, Where
- Assign work: Place partitions & replicas, Affinity
- Workload changes: acquire/release containers, Minimize movement
- Faults Handling: Detect non-trivial failures, new v/s reuse containers
- Other: Service Discovery, Monitoring
Is there something that can make this easy?
Apache Helix
What is Helix?
Built at LinkedIn, 2+ years in production
Generic cluster management framework
Contributed to Apache, now a TLP: helix.apache.org
Decoupling cluster management from core functionality
Helix at LinkedIn
In Production
[Diagram labels: User Writes, Oracle DB, Change Capture, Change Consumers, Search Index, Data Replicator, ETL, HDFS, Analytics]
Helix at LinkedIn
In Production
- Over 1000 instances covering over 30000 partitions
- Over 1000 instances for change capture consumers
- As many as 500 instances in a single Helix cluster
(all numbers are per-datacenter)
Others Using Helix
Helix concepts
- Resource (Database, Index, Topic, Task)
- Partitions: p1 p2 p3 p4 p5 p6
- Replicas: r1 r2 r3
- Container Process (three shown)
- Assignment ?
Helix Concepts
State Model and Constraints
States: Stop, bootstrap, Serve

Scope       State Constraints           Transition Constraints
Partition   Serve: 3, bootstrap: 0      Max T1 transitions in parallel
Resource    -                           Max T2 transitions in parallel
Node        No more than 10 replicas    Max T3 transitions in parallel
Cluster     -                           Max T4 transitions in parallel

StateCount = Replication factor: 3
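To make the interplay of state and transition constraints concrete, here is a small Python sketch. It assumes a transition graph of Stop -> bootstrap -> Serve (the slide names the states but not every edge) and a single per-round cap standing in for the "Max T transitions in parallel" limits; `plan_transitions` and `converge` are hypothetical names, not Helix APIs.

```python
# Assumed next hop on the way to Serve; the edges are an assumption.
NEXT_TOWARDS_SERVE = {"Stop": "bootstrap", "bootstrap": "Serve"}


def plan_transitions(current, target_serving, max_parallel):
    """Pick the next batch of (replica, from_state, to_state) transitions.

    current: dict of replica -> state
    target_serving: desired number of replicas in Serve (state constraint)
    max_parallel: cap on transitions fired per round (transition constraint)
    """
    serving = sum(1 for state in current.values() if state == "Serve")
    needed = target_serving - serving
    batch = []
    for replica, state in sorted(current.items()):
        if needed <= 0 or len(batch) >= max_parallel:
            break
        if state != "Serve":
            batch.append((replica, state, NEXT_TOWARDS_SERVE[state]))
            needed -= 1
    return batch


def converge(current, target_serving, max_parallel):
    """Apply batches until nothing is left to do; return final state and rounds."""
    state = dict(current)
    rounds = 0
    while True:
        batch = plan_transitions(state, target_serving, max_parallel)
        if not batch:
            return state, rounds
        for replica, _, to_state in batch:
            state[replica] = to_state
        rounds += 1
```

With three replicas starting in Stop, a target of three in Serve, and at most two transitions per round, the sketch needs four rounds to bring every replica to Serve.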
Helix Architecture
- Controller: Target Provider, Provisioner, Rebalancer; assign work via callback
- Participants: host partitions P1 to P8 (states: stop, bootstrap, server) and report metrics
- Clients: spectators, used for Service Discovery
Helix Controller
High-Level Overview
- Inputs: Resource Config, Constraints, Objectives
- TargetProvider: Number of Containers
- Provisioner: YARN RM
- Rebalancer: Task -> Container Mapping
Helix Controller
Target Provider
Determine how many containers are required along with the spec
Default implementations: Fixed, CPU, Memory, Bin Packing
Bin Packing can be used to customize further
Inputs:
- Resources p1,p2 .. pn
- Existing containers c1,c2 .. cn
- Health of tasks, containers: cpu, memory, health (monitoring system provides usage information)
- Allocation constraints: Affinity, rack locality
- SLA: Fixed: 10 containers; CPU headroom: 30%; Memory Usage: 70%; time: 5h
Outputs:
- Number of containers
- release list
- acquire list
- Container spec: cpu: x, memory: y, location: L
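A CPU-headroom policy like the one listed above could be sketched as follows. This is an assumption-laden illustration, not the Helix TargetProvider interface: the function names, the use of integer percentages, and the rule "release from the end of the list" are all choices made for the example.

```python
def target_containers(current, cpu_percent, headroom_percent,
                      min_containers, max_containers):
    """Containers needed so average CPU stays below 100 - headroom percent."""
    total_load = current * cpu_percent          # load in container-percent
    limit = 100 - headroom_percent              # allowed CPU per container
    wanted = -(-total_load // limit)            # ceiling division on ints
    return max(min_containers, min(max_containers, wanted))


def plan_change(existing, wanted):
    """Return (number to acquire, list of containers to release)."""
    if wanted >= len(existing):
        return wanted - len(existing), []
    return 0, existing[wanted:]
```

For example, 10 containers running at 90% CPU with 30% headroom call for 13 containers, while the same 10 containers at 20% CPU could shrink to 3.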
Helix Controller
Provisioner
Given the container spec, interact with YARN RM to acquire/release containers and with NM to start/stop containers
YARN: Interacts with YARN RM and subscribes to notifications
Helix Controller
Rebalancer
Based on the current nodes in the cluster and constraints, find an assignment of task to node
Modes: Auto, Semi-Auto, Static, User defined
Inputs:
- Tasks t1,t2 .. tn
- Existing containers c1,c2 .. cn
- Allocation constraints & objectives: Affinity, rack locality, Even distribution of tasks, Minimize movement while expanding
Output:
- Assignment: C1: t1,t2; C2: t3,t4
Based on the FSM, compute & fire the transitions to Participants
Example System: Helix-Based Solution
Solution
- Configure App
- Configure Target Provider
- Configure Provisioner
- Configure Rebalancer
Example System: Helix-Based Solution
app_config_spec.yaml

Configure App
  App Name: Partitioned Data Server
  App Master Package: /path/to/GenericHelixAppMaster.tar
  App package: /path/to/RedisServerLauncher.tar
  App Config: DataDirectory: hdfs:/path/to/data

Configure target provider
  TargetProvider: RedisTargetProvider
  Goal: Target TPS: 1 million
  Min container: 1
  Max containers: 25

Configure Provisioner
  YARN RM: host:port

Configure Rebalancer
  Partitions: 6
  Replica: 2
  Max partitions per container: 4
  Rebalancer.Mode: AUTO
  Placement: Data Affinity
  FailureHandling: Even distribution
  Scaling: Minimize Movement
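The Rebalancer settings imply a lower bound on the number of containers, which is worth checking against the target provider's minimum. A quick worked calculation in Python, assuming "Max partitions per container" counts every replica a container hosts; `min_containers` is a hypothetical helper, not a Helix function.

```python
def min_containers(partitions, replicas, max_per_container):
    """Smallest container count that can hold every replica."""
    slots = partitions * replicas                      # replicas to place
    by_capacity = -(-slots // max_per_container)       # ceiling division
    # Replicas of one partition must sit on different containers.
    return max(by_capacity, replicas)
```

With 6 partitions, 2 replicas, and at most 4 partitions per container there are 12 replicas to place, so at least 3 containers are needed. That is above the configured minimum of 1 and well below the maximum of 25.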
Launch Application
yarn_app_launcher.sh app_config_spec.yaml
Helix + YARN
- Client: submit job to the YARN Resource Manager
- Resource Manager: Launch AM
- Application Master: runs the Helix Controller (Target Provider, Provisioner, Rebalancer)
- Helix Controller: request containers from the Resource Manager
- Node Managers: launch containers, each running a participant
- Rebalancer: assign work to the participants
[Diagram: three participants holding p1 p2 p5 p4, p3 p4 p1 p6, and p5 p6 p3 p2]
Auto Scaling
Non-linear scaling from 0 to 1M TPS and back
Failure Handling: Random Faults
Recovering from faults at 1M TPS (5%, 10%, 20% failures/min)
Summary
HDFS
YARN
(cluster resource management)
HELIX
(container + task management)
Others
(Batch, Interactive, Online, Streaming)
Fault tolerance, Expansion handled transparently
Generic Application Master
Efficient resource utilization by task model
Questions?
Website: helix.apache.org, #apachehelix
Twitter: @apachehelix, @kishore_b_g
Mail: user@helix.apache.org
Team: Kanak Biscuitwala, Zhen Zhang
We love helping & being helped

More Related Content

What's hot

R for hadoopers
R for hadoopersR for hadoopers
R for hadoopers
Gwen (Chen) Shapira
 
ApacheCon: Apache Flink - Fast and Reliable Large-Scale Data Processing
ApacheCon: Apache Flink - Fast and Reliable Large-Scale Data ProcessingApacheCon: Apache Flink - Fast and Reliable Large-Scale Data Processing
ApacheCon: Apache Flink - Fast and Reliable Large-Scale Data Processing
Fabian Hueske
 
Introduction to Spark Streaming
Introduction to Spark StreamingIntroduction to Spark Streaming
Introduction to Spark Streaming
datamantra
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
DataWorks Summit
 
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBaseHBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
HBaseCon
 
How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...
How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...
How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...
InfluxData
 
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)
Kristofferson A
 
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the CloudSpeed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
gluent.
 
Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance
DataWorks Summit/Hadoop Summit
 
Overview of Cascading 3.0 on Apache Flink
Overview of Cascading 3.0 on Apache Flink Overview of Cascading 3.0 on Apache Flink
Overview of Cascading 3.0 on Apache Flink
Cascading
 
January 2015 HUG: Apache Flink: Fast and reliable large-scale data processing
January 2015 HUG: Apache Flink:  Fast and reliable large-scale data processingJanuary 2015 HUG: Apache Flink:  Fast and reliable large-scale data processing
January 2015 HUG: Apache Flink: Fast and reliable large-scale data processing
Yahoo Developer Network
 
Apache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBaseApache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBase
Nick Dimiduk
 
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache ApexApache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Apex
 
Data Integration
Data IntegrationData Integration
Data Integration
Datio Big Data
 
Inside HDFS Append
Inside HDFS AppendInside HDFS Append
Inside HDFS Append
Yue Chen
 
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
Cloudera, Inc.
 
HBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBaseHBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBase
Cloudera, Inc.
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Apache Apex
 
[Hadoop Meetup] Apache Hadoop 3 community update - Rohith Sharma
[Hadoop Meetup] Apache Hadoop 3 community update - Rohith Sharma[Hadoop Meetup] Apache Hadoop 3 community update - Rohith Sharma
[Hadoop Meetup] Apache Hadoop 3 community update - Rohith Sharma
Newton Alex
 
The Stream Processor as a Database Apache Flink
The Stream Processor as a Database Apache FlinkThe Stream Processor as a Database Apache Flink
The Stream Processor as a Database Apache Flink
DataWorks Summit/Hadoop Summit
 

What's hot (20)

R for hadoopers
R for hadoopersR for hadoopers
R for hadoopers
 
ApacheCon: Apache Flink - Fast and Reliable Large-Scale Data Processing
ApacheCon: Apache Flink - Fast and Reliable Large-Scale Data ProcessingApacheCon: Apache Flink - Fast and Reliable Large-Scale Data Processing
ApacheCon: Apache Flink - Fast and Reliable Large-Scale Data Processing
 
Introduction to Spark Streaming
Introduction to Spark StreamingIntroduction to Spark Streaming
Introduction to Spark Streaming
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBaseHBaseCon 2015: Blackbird Collections - In-situ  Stream Processing in HBase
HBaseCon 2015: Blackbird Collections - In-situ Stream Processing in HBase
 
How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...
How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...
How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...
 
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)
RMOUG2016 - Resource Management (the critical piece of the consolidation puzzle)
 
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the CloudSpeed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
Speed Up Your Queries with Hive LLAP Engine on Hadoop or in the Cloud
 
Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance
 
Overview of Cascading 3.0 on Apache Flink
Overview of Cascading 3.0 on Apache Flink Overview of Cascading 3.0 on Apache Flink
Overview of Cascading 3.0 on Apache Flink
 
January 2015 HUG: Apache Flink: Fast and reliable large-scale data processing
January 2015 HUG: Apache Flink:  Fast and reliable large-scale data processingJanuary 2015 HUG: Apache Flink:  Fast and reliable large-scale data processing
January 2015 HUG: Apache Flink: Fast and reliable large-scale data processing
 
Apache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBaseApache Big Data EU 2015 - HBase
Apache Big Data EU 2015 - HBase
 
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache ApexApache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
 
Data Integration
Data IntegrationData Integration
Data Integration
 
Inside HDFS Append
Inside HDFS AppendInside HDFS Append
Inside HDFS Append
 
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
 
HBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBaseHBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBase
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
 
[Hadoop Meetup] Apache Hadoop 3 community update - Rohith Sharma
[Hadoop Meetup] Apache Hadoop 3 community update - Rohith Sharma[Hadoop Meetup] Apache Hadoop 3 community update - Rohith Sharma
[Hadoop Meetup] Apache Hadoop 3 community update - Rohith Sharma
 
The Stream Processor as a Database Apache Flink
The Stream Processor as a Database Apache FlinkThe Stream Processor as a Database Apache Flink
The Stream Processor as a Database Apache Flink
 

Viewers also liked

Showhomes special half_day_makeover
Showhomes special half_day_makeoverShowhomes special half_day_makeover
Showhomes special half_day_makeoverPresentation Sells
 
Redaccion de textos
Redaccion de textosRedaccion de textos
Redaccion de textos
Katerine Sanchez
 
Three Trigger-Ready Flows You Can Build Today
Three Trigger-Ready Flows You Can Build TodayThree Trigger-Ready Flows You Can Build Today
Three Trigger-Ready Flows You Can Build Today
Patrick Sheil
 
seriale ostateczna wersja
seriale ostateczna wersjaseriale ostateczna wersja
seriale ostateczna wersjaAnka Zabłotna
 
Czy można porozumiewać się bez słów?
Czy można porozumiewać się bez słów?Czy można porozumiewać się bez słów?
Czy można porozumiewać się bez słów?
Anka Zabłotna
 
Science powerpoint
Science powerpointScience powerpoint
Science powerpointalmonds6
 
CONVINCING THEM YOU’RE RIGHT FOR THE JOB
CONVINCING THEM YOU’RE RIGHT FOR THE JOBCONVINCING THEM YOU’RE RIGHT FOR THE JOB
CONVINCING THEM YOU’RE RIGHT FOR THE JOB
EnvaPya
 
Presentation dr herman
Presentation dr hermanPresentation dr herman
Presentation dr hermanDea Noviana
 
Pnum final
Pnum finalPnum final
Pnum final
Liron Sabag
 
สมบูรณ์ สะโสวิทย์ ສົມບູນ ສະໂສຫວິດ
สมบูรณ์ สะโสวิทย์ ສົມບູນ ສະໂສຫວິດสมบูรณ์ สะโสวิทย์ ສົມບູນ ສະໂສຫວິດ
สมบูรณ์ สะโสวิทย์ ສົມບູນ ສະໂສຫວິດ
สมบูรณ์ สะโสวิทย์
 
Data driven testing: Case study with Apache Helix
Data driven testing: Case study with Apache HelixData driven testing: Case study with Apache Helix
Data driven testing: Case study with Apache Helix
Kishore Gopalakrishna
 
Apache Helix presentation at Vmware
Apache Helix presentation at VmwareApache Helix presentation at Vmware
Apache Helix presentation at Vmware
Kishore Gopalakrishna
 
Rencana bisnis futsal- Manajemen bisnis
Rencana bisnis futsal- Manajemen bisnisRencana bisnis futsal- Manajemen bisnis
Rencana bisnis futsal- Manajemen bisnis
EnvaPya
 
Aba new habilitation data sheets
Aba new habilitation data sheetsAba new habilitation data sheets
Aba new habilitation data sheets
abrighterave
 
Apache Helix presentation at ApacheCon 2013
Apache Helix presentation at ApacheCon 2013Apache Helix presentation at ApacheCon 2013
Apache Helix presentation at ApacheCon 2013
Kishore Gopalakrishna
 
Manual operaciones Radio AM
Manual operaciones Radio AMManual operaciones Radio AM
Manual operaciones Radio AM
Krystal Martínez
 
refrat persalinan normal ( 2-08-2013 RSUD SERANG )
 refrat  persalinan normal ( 2-08-2013 RSUD SERANG ) refrat  persalinan normal ( 2-08-2013 RSUD SERANG )
refrat persalinan normal ( 2-08-2013 RSUD SERANG )Dea Noviana
 
Perbedaan SAP dan SAP negara lain internasional
Perbedaan SAP dan SAP negara lain internasional Perbedaan SAP dan SAP negara lain internasional
Perbedaan SAP dan SAP negara lain internasional
EnvaPya
 

Viewers also liked (20)

Showhomes special half_day_makeover
Showhomes special half_day_makeoverShowhomes special half_day_makeover
Showhomes special half_day_makeover
 
Redaccion de textos
Redaccion de textosRedaccion de textos
Redaccion de textos
 
Three Trigger-Ready Flows You Can Build Today
Three Trigger-Ready Flows You Can Build TodayThree Trigger-Ready Flows You Can Build Today
Three Trigger-Ready Flows You Can Build Today
 
seriale ostateczna wersja
seriale ostateczna wersjaseriale ostateczna wersja
seriale ostateczna wersja
 
Czy można porozumiewać się bez słów?
Czy można porozumiewać się bez słów?Czy można porozumiewać się bez słów?
Czy można porozumiewać się bez słów?
 
Science powerpoint
Science powerpointScience powerpoint
Science powerpoint
 
CONVINCING THEM YOU’RE RIGHT FOR THE JOB
CONVINCING THEM YOU’RE RIGHT FOR THE JOBCONVINCING THEM YOU’RE RIGHT FOR THE JOB
CONVINCING THEM YOU’RE RIGHT FOR THE JOB
 
Presentation dr herman
Presentation dr hermanPresentation dr herman
Presentation dr herman
 
Seriale
SerialeSeriale
Seriale
 
Pnum final
Pnum finalPnum final
Pnum final
 
สมบูรณ์ สะโสวิทย์ ສົມບູນ ສະໂສຫວິດ
สมบูรณ์ สะโสวิทย์ ສົມບູນ ສະໂສຫວິດสมบูรณ์ สะโสวิทย์ ສົມບູນ ສະໂສຫວິດ
สมบูรณ์ สะโสวิทย์ ສົມບູນ ສະໂສຫວິດ
 
Data driven testing: Case study with Apache Helix
Data driven testing: Case study with Apache HelixData driven testing: Case study with Apache Helix
Data driven testing: Case study with Apache Helix
 
Apache Helix presentation at Vmware
Apache Helix presentation at VmwareApache Helix presentation at Vmware
Apache Helix presentation at Vmware
 
Helix talk at RelateIQ
Helix talk at RelateIQHelix talk at RelateIQ
Helix talk at RelateIQ
 
Rencana bisnis futsal- Manajemen bisnis
Rencana bisnis futsal- Manajemen bisnisRencana bisnis futsal- Manajemen bisnis
Rencana bisnis futsal- Manajemen bisnis
 
Aba new habilitation data sheets
Aba new habilitation data sheetsAba new habilitation data sheets
Aba new habilitation data sheets
 
Apache Helix presentation at ApacheCon 2013
Apache Helix presentation at ApacheCon 2013Apache Helix presentation at ApacheCon 2013
Apache Helix presentation at ApacheCon 2013
 
Manual operaciones Radio AM
Manual operaciones Radio AMManual operaciones Radio AM
Manual operaciones Radio AM
 
refrat persalinan normal ( 2-08-2013 RSUD SERANG )
 refrat  persalinan normal ( 2-08-2013 RSUD SERANG ) refrat  persalinan normal ( 2-08-2013 RSUD SERANG )
refrat persalinan normal ( 2-08-2013 RSUD SERANG )
 
Perbedaan SAP dan SAP negara lain internasional
Perbedaan SAP dan SAP negara lain internasional Perbedaan SAP dan SAP negara lain internasional
Perbedaan SAP dan SAP negara lain internasional
 

Similar to Multi-Tenant Data Cloud with YARN & Helix

One Grid to rule them all: Building a Multi-tenant Data Cloud with YARN
One Grid to rule them all: Building a Multi-tenant Data Cloud with YARNOne Grid to rule them all: Building a Multi-tenant Data Cloud with YARN
One Grid to rule them all: Building a Multi-tenant Data Cloud with YARNDataWorks Summit
 
Hadoop tools with Examples
Hadoop tools with ExamplesHadoop tools with Examples
Hadoop tools with Examples
Joe McTee
 
Hptf 2240 Final
Hptf 2240 FinalHptf 2240 Final
Hptf 2240 Final
prosullivan
 
Towards SLA-based Scheduling on YARN Clusters
Towards SLA-based Scheduling on YARN ClustersTowards SLA-based Scheduling on YARN Clusters
Towards SLA-based Scheduling on YARN Clusters
DataWorks Summit
 
Petrel: A Programmatically Accessible Research Data Service
Petrel: A Programmatically Accessible Research Data ServicePetrel: A Programmatically Accessible Research Data Service
Petrel: A Programmatically Accessible Research Data Service
Globus
 
HPC Storage and IO Trends and Workflows
HPC Storage and IO Trends and WorkflowsHPC Storage and IO Trends and Workflows
HPC Storage and IO Trends and Workflows
inside-BigData.com
 
Cloudera Impala - HUG Karlsruhe, July 04, 2013
Cloudera Impala - HUG Karlsruhe, July 04, 2013Cloudera Impala - HUG Karlsruhe, July 04, 2013
Cloudera Impala - HUG Karlsruhe, July 04, 2013
Alexander Alten
 
Red Hat Storage Server Roadmap & Integration With Open Stack
Red Hat Storage Server Roadmap & Integration With Open StackRed Hat Storage Server Roadmap & Integration With Open Stack
Red Hat Storage Server Roadmap & Integration With Open Stack
Red_Hat_Storage
 
Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters
Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters
Hadoop Summit San Jose 2015: Towards SLA-based Scheduling on YARN Clusters
Sumeet Singh
 
dgintro (1).ppt
dgintro (1).pptdgintro (1).ppt
dgintro (1).ppt
Ans Sembiring
 
Oracle presentations RAC dataguard active database
Oracle presentations RAC dataguard active databaseOracle presentations RAC dataguard active database
Oracle presentations RAC dataguard active database
mabessisindu
 
Apache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS FederationApache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS Federation
Adam Kawa
 
Pacemaker+DRBD
Pacemaker+DRBDPacemaker+DRBD
Pacemaker+DRBDDan Frincu
 
Hadoop Cluster With High Availability
Hadoop Cluster With High AvailabilityHadoop Cluster With High Availability
Hadoop Cluster With High Availability
Edureka!
 
State of the_gluster_-_lceu
State of the_gluster_-_lceuState of the_gluster_-_lceu
State of the_gluster_-_lceu
Gluster.org
 
Tachyon-2014-11-21-amp-camp5
Tachyon-2014-11-21-amp-camp5Tachyon-2014-11-21-amp-camp5
Tachyon-2014-11-21-amp-camp5
Haoyuan Li
 
Hadoop and Big Data Overview
Hadoop and Big Data OverviewHadoop and Big Data Overview
Hadoop and Big Data Overview
Prabhu Thukkaram
 
20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weiting20150704 benchmark and user experience in sahara weiting
20150704 benchmark and user experience in sahara weitingWei Ting Chen
 
White Paper: Perforce Administration Optimization, Scalability, Availability ...
White Paper: Perforce Administration Optimization, Scalability, Availability ...White Paper: Perforce Administration Optimization, Scalability, Availability ...
Multi-Tenant Data Cloud with YARN & Helix

  • 1. Multi-Tenant Data Cloud with YARN & Helix. Kishore Gopalakrishna (@kishore_b_g). LinkedIn - Data infra: Helix, Espresso. Yahoo - Ads infra: S4. Thursday, June 5, 14
  • 2-3. What is YARN: Next Generation Compute Platform. Hadoop 1.0: MapReduce on HDFS. Hadoop 2.0: MapReduce and others (batch, interactive, online, streaming) on YARN (cluster resource management) on HDFS. Enables containers from several applications (A1-A3, B1-B5, C1-C5) to share the cluster.
  • 4. YARN Architecture. The client submits a job to the Resource Manager. Node Managers report node status to the Resource Manager. The Application Master issues container requests, and containers run under the Node Managers. The app package sits in the HDFS/common area.
  • 5. So, let’s build something
  • 6. Example System. Generate data in Hadoop (M/R on HDFS), then use it for serving (Redis servers).
  • 7-9. Example System: generate data (M/R on HDFS), then serve it. Requirements: big data :-); partitioned, replicated; fault tolerant, scalable; efficient resource utilization. Application Master duties: request containers, assign work, handle failure, handle workload changes.
  • 10. Allocation + Assignment. An M/R job generates partitioned data (p1-p6) in HDFS; multiple servers serve the partitioned data. Container allocation: data affinity, rack-aware placement. Partition assignment: affinity, even distribution. Replica placement: on different physical machines. Example layout: Server 1: p1 p2 p5 p4; Server 2: p3 p4 p1 p6; Server 3: p5 p6 p3 p2.
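The even-distribution and replica-placement rules on the Allocation + Assignment slide can be illustrated with a minimal Python sketch. It is not Helix's algorithm: the function name `place_replicas` and the round-robin rule are assumptions made for illustration, and data affinity and rack awareness are left out.

```python
def place_replicas(partitions, servers, replica_count):
    """Round-robin placement (illustrative, not the Helix rebalancer).

    Replica r of the i-th partition goes to server (i + r) mod n, so the
    replicas of one partition always land on different servers.
    """
    if replica_count > len(servers):
        raise ValueError("need at least as many servers as replicas")
    assignment = {server: [] for server in servers}
    for i, partition in enumerate(partitions):
        for r in range(replica_count):
            target = servers[(i + r) % len(servers)]
            assignment[target].append(partition)
    return assignment


partitions = [f"p{i}" for i in range(1, 7)]        # p1 .. p6, as on the slide
servers = ["Server 1", "Server 2", "Server 3"]
placement = place_replicas(partitions, servers, replica_count=2)
for server, hosted in placement.items():
    print(server, hosted)
```

With 6 partitions, 2 replicas, and 3 servers, every server ends up hosting 4 replicas, the same load as in the slide's layout, although the exact partition-to-server mapping differs.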
  • 11-13. Failure Handling. On failure: even load distribution while waiting for a new container. Acquire a new container, close to the data if possible. Assign the failed partitions to the new container. In the example, the replacement container (Server 4) takes over p3 p2 p5 p6.
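The two phases of failure handling (spread the failed partitions over the survivors, then hand them to the replacement container) can be sketched as follows. This is an illustrative Python model, not Helix code; the function names `handle_failure` and `adopt_new_container` are hypothetical, and the starting layout is taken from the Allocation + Assignment slide.

```python
def handle_failure(placement, failed):
    """Temporarily spread the failed server's partitions over the survivors.

    Each partition goes to the least-loaded survivor that does not already
    host a replica of it. Returns the interim placement and a record of
    which survivor took which partition.
    """
    survivors = {s: list(ps) for s, ps in placement.items() if s != failed}
    moved = {}
    for partition in placement[failed]:
        candidates = [s for s in survivors if partition not in survivors[s]]
        if not candidates:
            continue  # every survivor already hosts a replica of it
        target = min(candidates, key=lambda s: len(survivors[s]))
        survivors[target].append(partition)
        moved[partition] = target
    return survivors, moved


def adopt_new_container(interim, moved, new_server):
    """Move the temporarily hosted partitions onto the new container."""
    result = {s: list(ps) for s, ps in interim.items()}
    result[new_server] = []
    for partition, temporary_host in moved.items():
        result[temporary_host].remove(partition)
        result[new_server].append(partition)
    return result


placement = {
    "Server 1": ["p1", "p2", "p5", "p4"],
    "Server 2": ["p3", "p4", "p1", "p6"],
    "Server 3": ["p5", "p6", "p3", "p2"],
}
interim, moved = handle_failure(placement, failed="Server 3")
recovered = adopt_new_container(interim, moved, new_server="Server 4")
print(interim)
print(recovered)
```

While waiting, each survivor carries six partitions; once Server 4 is acquired, the survivors return to four each and Server 4 holds the partitions that Server 3 used to serve.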
  • 14-16. Workload Changes. Monitor: CPU, memory, latency, TPS. Workload change: acquire/release containers. Container change: re-distribute work. Before: Server 1: p1 p2 p5 p4; Server 2: p3 p4 p1 p6; Server 3: p5 p6 p3 p2. After a fourth server is acquired: Server 1: p1 p2 p5; Server 2: p3 p4 p1; Server 3: p5 p6 p3; Server 4: p4 p6 p2.
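Re-distributing work with minimal movement after a container is acquired can be sketched as below. This is a simplified Python model rather than the Helix rebalancer; the function name `expand` and the "take from the most-loaded donor" rule are assumptions chosen for illustration.

```python
def expand(placement, new_server):
    """Rebalance onto a newly acquired server with as few moves as possible.

    Partitions are taken one at a time from the currently most-loaded
    server until the new server holds its even share. A partition is never
    moved if the new server already hosts another replica of it.
    """
    result = {s: list(ps) for s, ps in placement.items()}
    result[new_server] = []
    total = sum(len(ps) for ps in result.values())
    share = total // len(result)  # even share for the new server
    while len(result[new_server]) < share:
        donors = sorted(
            (s for s in result if s != new_server),
            key=lambda s: len(result[s]),
            reverse=True,
        )
        for donor in donors:
            movable = [p for p in result[donor] if p not in result[new_server]]
            if movable:
                partition = movable[-1]
                result[donor].remove(partition)
                result[new_server].append(partition)
                break
        else:
            break  # nothing can move without co-locating replicas
    return result


before = {
    "Server 1": ["p1", "p2", "p5", "p4"],
    "Server 2": ["p3", "p4", "p1", "p6"],
    "Server 3": ["p5", "p6", "p3", "p2"],
}
after = expand(before, "Server 4")
for server, hosted in after.items():
    print(server, hosted)
```

Only three replicas move, one from each existing server, which is the minimum needed to give the new server an even share of the twelve replicas.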
  • 17-18. Service Discovery. Discover everything: what is running where. Dynamically updated on changes. Clients locate the servers through service discovery.
  • 19-20. Building YARN Application. Writing an AM is hard and error prone; handling faults and workload changes is non-trivial and often overlooked. Request container: how many containers, where. Assign work: place partitions & replicas, affinity. Workload changes: acquire/release containers, minimize movement. Fault handling: detect non-trivial failures, new vs. reuse containers. Other: service discovery, monitoring. Is there something that can make this easy?
  • 22. What is Helix? Built at LinkedIn, 2+ years in production. Generic cluster management framework. Contributed to Apache, now a TLP: helix.apache.org. Decouples cluster management from core functionality.
  • 23. Helix at LinkedIn, in production. [Diagram labels: user writes, Oracle DB, DB change capture, change consumers, search index, data replicator, ETL, HDFS, analytics.]
  • 24. Helix at LinkedIn, in production. Over 1000 instances covering over 30000 partitions. Over 1000 instances for change capture consumers. As many as 500 instances in a single Helix cluster. (All numbers are per-datacenter.)
  • 26-30. Helix concepts. A resource (database, index, topic, task) is split into partitions (p1-p6); each partition has replicas (r1-r3); containers run processes. Open question: how are replicas assigned to the container processes?
  • 31-34. Helix Concepts: State Model and Constraints. States: bootstrap, Serve, Stop. StateCount = replication factor: 3.
      Scope      State Constraints         Transition Constraints
      Partition  Serve: 3, bootstrap: 0    Max T1 transitions in parallel
      Resource   -                         Max T2 transitions in parallel
      Node       No more than 10 replicas  Max T3 transitions in parallel
      Cluster    -                         Max T4 transitions in parallel
  • 35. Helix Architecture. Participants host partitions (P1-P8), each in one of the states stop, bootstrap, server. The Controller (Target Provider, Provisioner, Rebalancer) assigns work via callback. Clients act as spectators and use service discovery. [Diagram also labels metrics flows.]
  • 37. Helix Controller: Target Provider. Determines how many containers are required, along with the spec. Default implementations: Fixed, CPU, Memory, Bin Packing; Bin Packing can be used to customize further. The monitoring system provides usage information. Inputs: resources (p1, p2 .. pn); existing containers (c1, c2 .. cn); health of tasks and containers (cpu, memory, health); allocation constraints (affinity, rack locality); SLA (Fixed: 10 containers; CPU headroom: 30%; memory usage: 70%; time: 5h). Outputs: number of containers; release list; acquire list; container spec (cpu: x, memory: y, location: L).
  • 38. Helix Controller: Provisioner. Given the container spec, interacts with the YARN RM to acquire/release containers and with the NM to start/stop them. The YARN implementation interacts with the YARN RM and subscribes to notifications.
  • 39. Helix Controller: Rebalancer. Based on the current nodes in the cluster and the constraints, finds an assignment of tasks to nodes. Modes: Auto, Semi-Auto, Static, User defined. Inputs: tasks (t1, t2 .. tn); existing containers (c1, c2 .. cn); allocation constraints & objectives (affinity, rack locality, even distribution of tasks, minimize movement while expanding). Output: assignment (C1: t1, t2; C2: t3, t4). Based on the FSM, computes & fires the transitions to Participants.
  • 40. Example System: Helix-Based Solution. Configure App; configure Target Provider; configure Provisioner; configure Rebalancer.
  • 41. Example System: Helix-Based Solution (app_config_spec.yaml)
      Configure App
        App Name: Partitioned Data Server
        App Master Package: /path/to/GenericHelixAppMaster.tar
        App package: /path/to/RedisServerLauncher.tar
        App Config: DataDirectory: hdfs:/path/to/data
      Configure Target Provider
        TargetProvider: RedisTargetProvider
        Goal: Target TPS: 1 million
        Min container: 1
        Max containers: 25
      Configure Provisioner
        YARN RM: host:port
      Configure Rebalancer
        Partitions: 6
        Replica: 2
        Max partitions per container: 4
        Rebalancer.Mode: AUTO
        Placement: Data Affinity
        FailureHandling: Even distribution
        Scaling: Minimize Movement
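How a target provider might turn the configured goal into a container count can be sketched in Python. This is an assumption-laden illustration, not the RedisTargetProvider itself: the function name `containers_needed` and the per-container throughput of 100,000 TPS are hypothetical, while the target TPS, min/max containers, partition count, replica count, and max partitions per container come from the slide.

```python
import math


def containers_needed(target_tps, tps_per_container, min_containers,
                      max_containers, partitions, replicas,
                      max_partitions_per_container):
    """Estimate the container count for a throughput goal (illustrative).

    Takes the larger of the throughput-driven and capacity-driven counts,
    then clamps the result to the configured min/max range.
    """
    for_throughput = math.ceil(target_tps / tps_per_container)
    for_capacity = math.ceil(partitions * replicas / max_partitions_per_container)
    wanted = max(for_throughput, for_capacity, min_containers)
    return min(wanted, max_containers)


count = containers_needed(
    target_tps=1_000_000,        # Goal: Target TPS: 1 million
    tps_per_container=100_000,   # hypothetical capacity of one container
    min_containers=1,
    max_containers=25,
    partitions=6,
    replicas=2,
    max_partitions_per_container=4,
)
print(count)
```

The capacity term keeps the count from dropping below what is needed to host all 12 replicas at 4 per container, even when the throughput goal is low.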
  • 43-49. Helix + YARN. (1) The client submits a job to the YARN Resource Manager. (2) The Resource Manager launches the Application Master, which hosts the Helix Controller (Target Provider, Provisioner, Rebalancer). (3) The controller requests containers. (4) Containers are launched on the Node Managers, each running a participant. (5) The controller assigns work: participant 1: p1 p2 p5 p4; participant 2: p3 p4 p1 p6; participant 3: p5 p6 p3 p2.
  • 50. Auto Scaling. Non-linear scaling from 0 to 1M TPS and back.
  • 51. Failure Handling: Random Faults. Recovering from faults at 1M TPS (5%, 10%, 20% failures/min).
  • 52. Summary. Stack: HDFS; YARN (cluster resource management); Helix (container + task management); others (batch, interactive, online, streaming). Fault tolerance and expansion handled transparently. Generic Application Master. Efficient resource utilization by task model.
  • 53. Questions? Website: helix.apache.org, #apachehelix. Twitter: @apachehelix, @kishore_b_g. Mail: user@helix.apache.org. Team: Kanak Biscuitwala, Zhen Zhang. We love helping & being helped.