One Grid to rule them all: Building a Multi-tenant Data Cloud with YARN


Transcript of "One Grid to rule them all: Building a Multi-tenant Data Cloud with YARN"

  1. Multi-Tenant Data Cloud with YARN & Helix
     Kishore Gopalakrishna (@kishore_b_g)
     - LinkedIn - Data infra: Helix, Espresso
     - Yahoo - Ads infra: S4
  2. What is YARN?
     Next-generation compute platform.
     - Hadoop 1.0: MapReduce running directly on HDFS
     - Hadoop 2.0: YARN (cluster resource management) on HDFS, with MapReduce and
       other frameworks (batch, interactive, online, streaming) on top
     - Enables containers from many applications (A, B, C) to share one cluster
  3. YARN Architecture
     [diagram] A client submits a job (app package) to the Resource Manager; the RM
     launches an Application Master on a Node Manager; the AM sends container
     requests back to the RM; Node Managers report node status and host the
     containers; HDFS is the common storage area.
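To make the "container request" arrow concrete, here is a minimal sketch of the loop a hand-written Application Master runs against the Resource Manager using Hadoop's AMRMClient. The empty host/tracking-URL arguments and the 1 GB / 1 vcore container size are placeholder values for illustration.

```java
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class MinimalAppMaster {
  public static void main(String[] args) throws Exception {
    AMRMClient<ContainerRequest> rm = AMRMClient.createAMRMClient();
    rm.init(new Configuration());
    rm.start();
    rm.registerApplicationMaster("", 0, "");   // tell the RM this AM is alive (placeholders)

    // Ask for one 1 GB / 1 vcore container anywhere in the cluster.
    Resource capability = Resource.newInstance(1024, 1);
    rm.addContainerRequest(new ContainerRequest(capability, null, null, Priority.newInstance(0)));

    // Heartbeat until the RM hands a container back; work is then launched on it via the NM.
    List<Container> allocated;
    do {
      AllocateResponse response = rm.allocate(0.0f);
      allocated = response.getAllocatedContainers();
      Thread.sleep(1000);
    } while (allocated.isEmpty());
  }
}
```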
  4. So, let's build something
  5. Example System
     - Generate data in Hadoop (M/R job writing partitions to HDFS)
     - Use it for serving (partitioned Redis servers)
  6. Example System: Requirements
     The Application Master must:
     - Request containers
     - Assign work
     - Handle failures
     - Handle workload changes
     Requirements: Big Data :-) - partitioned, replicated, fault tolerant, scalable,
     efficient resource utilization.
  7. Allocation + Assignment
     The M/R job generates partitioned data (p1..p6) in HDFS; multiple servers serve it.
     - Container Allocation - data affinity, rack-aware placement
     - Partition Assignment - affinity, even distribution
     - Replica Placement - on different physical machines
     [diagram: p1..p6, two replicas each, spread across Server 1, Server 2, Server 3]
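As a rough illustration of the placement rules above, here is a toy scheme (my own, not the talk's): replica r of partition p goes to server (p + r) mod N, which keeps the replicas of a partition on different servers and spreads partitions evenly. Data affinity and rack awareness are ignored in this sketch.

```java
import java.util.*;

public class SimplePlacement {
  // Returns server -> list of partition replicas, e.g. {server1=[p1, p6:r1, ...], ...}
  public static Map<String, List<String>> assign(int partitions, int replicas, List<String> servers) {
    Map<String, List<String>> assignment = new LinkedHashMap<>();
    for (String server : servers) {
      assignment.put(server, new ArrayList<>());
    }
    for (int p = 0; p < partitions; p++) {
      for (int r = 0; r < replicas; r++) {
        // replica r of partition p lands on a different server than replicas 0..r-1
        String server = servers.get((p + r) % servers.size());
        assignment.get(server).add("p" + (p + 1) + (r == 0 ? "" : ":r" + r));
      }
    }
    return assignment;
  }

  public static void main(String[] args) {
    System.out.println(assign(6, 2, Arrays.asList("server1", "server2", "server3")));
  }
}
```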
  8. Failure Handling
     - On failure: even load distribution over the surviving servers while waiting
       for a new container
     - Acquire a new container, close to the data if possible
     - Assign the failed partitions to the new container
     [diagram: a failed server's partitions redistributed, then moved to a newly
     acquired server]
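A sketch of the "even load distribution while waiting for a new container" step, under the same toy model as above: the failed server's partitions are dealt round-robin to the survivors. Moving them onto the newly acquired container afterwards follows the same pattern.

```java
import java.util.*;

public class FailoverSketch {
  // Mutates the assignment in place; assumes failedServer is a key in the map.
  public static void redistribute(Map<String, List<String>> assignment, String failedServer) {
    List<String> orphaned = assignment.remove(failedServer);
    List<String> survivors = new ArrayList<>(assignment.keySet());
    for (int i = 0; i < orphaned.size(); i++) {
      // deal the orphaned partitions out round-robin so survivors stay evenly loaded
      assignment.get(survivors.get(i % survivors.size())).add(orphaned.get(i));
    }
  }
}
```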
  9. Workload Changes
     - Monitor: CPU, memory, latency, TPS
     - Workload change - acquire/release containers
     - Container change - redistribute work
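The "acquire/release containers" decision ultimately boils down to arithmetic like the following sketch, assuming (purely for illustration) that each container sustains a fixed TPS; a real controller would also fold in CPU, memory, and latency signals.

```java
public class ScalingDecision {
  // Illustrative capacity figure, not from the talk.
  static final int TPS_PER_CONTAINER = 50_000;

  // How many containers to run for the observed load, clamped to [min, max].
  public static int desiredContainers(long observedTps, int min, int max) {
    int needed = (int) Math.ceil((double) observedTps / TPS_PER_CONTAINER);
    return Math.max(min, Math.min(max, needed));
  }
}
```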
 10. Service Discovery
     - Clients need to discover everything: what is running where
     - The routing view is dynamically updated on changes
     [diagram: clients querying the service-discovery view of servers and partitions]
 11. Building a YARN Application
     Writing an AM is hard and error prone; handling faults and workload changes is
     non-trivial and often overlooked.
     - Request containers: how many, where
     - Assign work: place partitions & replicas, affinity
     - Workload changes: acquire/release containers, minimize movement
     - Fault handling: detect non-trivial failures, new vs. reused containers
     - Other: service discovery, monitoring
     Is there something that can make this easy?
 12. Apache Helix
 13. What is Helix?
     - Generic cluster management framework that decouples cluster management from
       core functionality
     - Built at LinkedIn, 2+ years in production
     - Contributed to Apache, now a TLP: helix.apache.org
 14. Helix at LinkedIn (in production)
     [diagram: user writes land in Oracle DBs; change capture feeds change consumers,
     search index builds, a data replicator, and ETL into HDFS for analytics]
 15. Helix at LinkedIn: in production
     - Over 1000 instances covering over 30,000 partitions
     - Over 1000 instances for change capture consumers
     - As many as 500 instances in a single Helix cluster
     (all numbers are per-datacenter)
 16. Others Using Helix
 17. Helix Concepts
     - Resource (database, index, topic, task), split into partitions (p1..p6), each
       with replicas (r1..r3)
     - Containers/processes host the replicas; the open question is the assignment
       of replicas to containers
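With Helix, this resource/partition/replica bookkeeping is declared once against the cluster metadata instead of being hand-coded in the AM. A minimal sketch using Helix's ZKHelixAdmin, assuming a ZooKeeper at localhost:2181 and illustrative names (DATA_CLOUD, RedisDataStore); the BootstrapServe state model it names is sketched after the next slide.

```java
import org.apache.helix.HelixAdmin;
import org.apache.helix.manager.zk.ZKHelixAdmin;

public class SetupCluster {
  public static void main(String[] args) {
    HelixAdmin admin = new ZKHelixAdmin("localhost:2181");   // ZooKeeper address (assumed)
    admin.addCluster("DATA_CLOUD");
    // 6 partitions of the "RedisDataStore" resource, governed by the BootstrapServe
    // state model (which must be registered first via addStateModelDef).
    admin.addResource("DATA_CLOUD", "RedisDataStore", 6, "BootstrapServe");
    // Ask Helix to place 2 replicas of each partition across the live participants.
    admin.rebalance("DATA_CLOUD", "RedisDataStore", 2);
  }
}
```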
 18. Helix Concepts: State Model and Constraints
     States: stop -> bootstrap -> serve

     Level      State constraints                        Transition constraints
     ---------  ---------------------------------------  -------------------------------
     Partition  Serve: 3, bootstrap: 0                   Max T1 transitions in parallel
                (StateCount = replication factor: 3)
     Resource   -                                        Max T2 transitions in parallel
     Node       No more than 10 replicas                 Max T3 transitions in parallel
     Cluster    -                                        Max T4 transitions in parallel
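A sketch of how such a state model is declared with Helix's StateModelDefinition builder. The state names and the dynamic bound "R" (replication factor) follow the slide; the priorities and the exact transition set are illustrative.

```java
import org.apache.helix.model.StateModelDefinition;

public class BootstrapServeModel {
  public static StateModelDefinition build() {
    StateModelDefinition.Builder builder = new StateModelDefinition.Builder("BootstrapServe");
    builder.addState("SERVE", 1);        // priority 1: the state the controller prefers
    builder.addState("BOOTSTRAP", 2);
    builder.addState("OFFLINE", 3);
    builder.initialState("OFFLINE");
    builder.addTransition("OFFLINE", "BOOTSTRAP");
    builder.addTransition("BOOTSTRAP", "SERVE");
    builder.addTransition("SERVE", "OFFLINE");
    // at most "replication factor" replicas of a partition may be in SERVE
    builder.dynamicUpperBound("SERVE", "R");
    return builder.build();
  }
}
```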
 19. Helix Architecture
     [diagram] The Controller (Target Provider, Provisioner, Rebalancer) assigns work
     to Participants via state-transition callbacks (stop/bootstrap/serve for
     partitions P1..P8); Clients act as spectators, using the routing information for
     service discovery; participants report metrics back.
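"Assign work via callback" means each participant supplies a state model whose transition methods Helix invokes when the controller moves a partition; a participant registers such a model with its HelixManager through a StateModelFactory. Below is a sketch for the bootstrap/serve model above, with the Redis/HDFS work left as comments.

```java
import org.apache.helix.NotificationContext;
import org.apache.helix.model.Message;
import org.apache.helix.participant.statemachine.StateModel;
import org.apache.helix.participant.statemachine.Transition;

public class RedisPartitionStateModel extends StateModel {

  @Transition(from = "OFFLINE", to = "BOOTSTRAP")
  public void onBecomeBootstrapFromOffline(Message message, NotificationContext context) {
    String partition = message.getPartitionName();
    // load this partition's generated data from HDFS into the local Redis instance
  }

  @Transition(from = "BOOTSTRAP", to = "SERVE")
  public void onBecomeServeFromBootstrap(Message message, NotificationContext context) {
    // start answering client reads for this partition
  }

  @Transition(from = "SERVE", to = "OFFLINE")
  public void onBecomeOfflineFromServe(Message message, NotificationContext context) {
    // stop serving and drop the local copy
  }
}
```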
 20. Helix Controller: High-Level Overview
     Inputs: resource config, constraints, objectives.
     Inside the controller, the TargetProvider decides the number of containers, the
     Provisioner talks to the YARN RM, and the Rebalancer produces the task ->
     container mapping.
 21. Helix Controller: Target Provider
     Determines how many containers are required, along with their spec.
     - Default implementations: Fixed, CPU, Memory, Bin Packing (a monitoring system
       provides usage information); Bin Packing can be customized further
     - Inputs: resources (p1..pn), existing containers (c1..cn), health of tasks and
       containers (CPU, memory, health), allocation constraints (affinity, rack
       locality), SLA (e.g. Fixed: 10 containers, CPU headroom: 30%, memory usage:
       70%, time: 5h)
     - Outputs: number of containers, release list, acquire list, container spec
       (cpu: x, memory: y, location: L)
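The inputs and outputs above map naturally onto an interface of roughly this shape. This is a hypothetical rendering for illustration only, not the exact Helix API; every type name here is invented.

```java
import java.util.List;

// Hypothetical contract mirroring the slide: inputs on the left, outputs on the right.
interface TargetProvider {
  TargetProviderResponse computeTarget(List<String> partitions,
                                       List<ContainerHealth> existingContainers,
                                       Sla sla);
}

class TargetProviderResponse {
  int desiredContainerCount;           // how many containers in total
  List<String> containersToRelease;    // release list
  ContainerSpec specForNewContainers;  // cpu: x, memory: y, location: L
}

class ContainerHealth { String containerId; double cpuUsage; double memoryUsage; boolean healthy; }
class ContainerSpec   { int cpuCores; int memoryMb; String preferredLocation; }
class Sla             { Integer fixedContainers; double cpuHeadroom; double maxMemoryUsage; }
```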
 22. Helix Controller: Provisioner
     Given the container spec, interacts with the YARN RM to acquire/release
     containers and with the NMs to start/stop them; subscribes to RM notifications.
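The NodeManager half of that job looks roughly like this with Hadoop's NMClient, once the RM has granted a container. The launch command and the empty resource/environment maps are placeholders; a real provisioner would localize the participant tarball from the app package.

```java
import java.util.Collections;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.client.api.NMClient;

public class LaunchParticipant {
  static void launch(Container granted) throws Exception {
    NMClient nm = NMClient.createNMClient();
    nm.init(new Configuration());
    nm.start();

    ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
        Collections.<String, LocalResource>emptyMap(),   // local resources (e.g. participant tarball)
        Collections.<String, String>emptyMap(),          // environment
        Collections.singletonList("./start-redis-participant.sh"),  // placeholder launch command
        null, null, null);

    // Ask the NodeManager hosting the granted container to start the process in it.
    nm.startContainer(granted, ctx);
  }
}
```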
 23. Helix Controller: Rebalancer
     Based on the current nodes in the cluster and the constraints, finds an
     assignment of tasks to nodes; based on the FSM, computes and fires the
     transitions to the Participants.
     - Modes: Auto, Semi-Auto, Static, User-defined
     - Inputs: tasks (t1..tn), existing containers (c1..cn), allocation constraints &
       objectives (affinity, rack locality, even distribution of tasks, minimize
       movement while expanding)
     - Output: an assignment, e.g. C1: t1,t2; C2: t3,t4
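A toy version of what an AUTO-mode assignment has to achieve (this is not the actual Helix Rebalancer interface): distribute tasks round-robin over the live containers while honoring a "max partitions per container" cap like the one in the app config on slide 25.

```java
import java.util.*;

public class GreedyRebalance {
  public static Map<String, List<String>> compute(List<String> tasks,
                                                  List<String> containers,
                                                  int maxPerContainer) {
    Map<String, List<String>> assignment = new LinkedHashMap<>();
    for (String c : containers) {
      assignment.put(c, new ArrayList<>());
    }
    int cursor = 0;
    for (String task : tasks) {
      // round-robin over containers, skipping ones that are already at capacity;
      // in this toy version a task stays unassigned if every container is full
      for (int tries = 0; tries < containers.size(); tries++) {
        List<String> slot = assignment.get(containers.get(cursor % containers.size()));
        cursor++;
        if (slot.size() < maxPerContainer) {
          slot.add(task);
          break;
        }
      }
    }
    return assignment;
  }
}
```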
 24. Example System: Helix-Based Solution
     - Configure App
     - Configure Target Provider
     - Configure Provisioner
     - Configure Rebalancer
 25. Example System: Helix-Based Solution (app_config_spec.yaml)
     Configure App
       App Name:             Partitioned Data Server
       App Master Package:   /path/to/GenericHelixAppMaster.tar
       App Package:          /path/to/RedisServerLauncher.tar
       App Config:           DataDirectory: hdfs:/path/to/data
     Configure Target Provider
       TargetProvider:       RedisTargetProvider
       Goal:                 Target TPS: 1 million
       Min containers:       1
       Max containers:       25
     Configure Provisioner
       YARN RM:              host:port
     Configure Rebalancer
       Partitions:           6
       Replicas:             2
       Max partitions per container: 4
       Rebalancer.Mode:      AUTO
       Placement:            Data affinity
       FailureHandling:      Even distribution
       Scaling:              Minimize movement
 26. Launch Application
     yarn_app_launcher.sh app_config_spec.yaml
 27. Helix + YARN
     [diagram] The client submits the job to the YARN Resource Manager, which
     launches the Application Master running the Helix Controller (Target Provider,
     Provisioner, Rebalancer); the controller requests containers from the RM,
     launches participants on the Node Managers, and assigns partitions p1..p6
     (with replicas) across them.
 28. Auto Scaling
     Non-linear scaling from 0 to 1M TPS and back.
 29. Failure Handling: Random Faults
     Recovering from faults at 1M TPS (5%, 10%, 20% failures/min).
 30. Summary
     - Stack: HDFS -> YARN (cluster resource management) -> Helix (container + task
       management) -> other frameworks (batch, interactive, online, streaming)
     - Generic Application Master
     - Fault tolerance and expansion handled transparently
     - Efficient resource utilization through the task model
 31. Questions? We love helping & being helped.
     - Website: helix.apache.org, #apachehelix
     - Twitter: @apachehelix, @kishore_b_g
     - Mail:    user@helix.apache.org
     - Team:    Kanak Biscuitwala, Zhen Zhang
