Targeted Resource Management in
Multi-tenant Distributed Systems
Jonathan Mace
Brown University
Peter Bodik
MSR Redmond
Rodrigo Fonseca
Brown University
Madanlal Musuvathi
MSR Redmond
Resource Management in Multi-Tenant Systems

• April 2011 – Amazon EBS failure: a failure in one availability zone cascades to the shared control plane, causing thread pool starvation for all zones
• Aug. 2012 – Azure Storage outage: an aggressive background task responds to increased hardware capacity with a deluge of warnings and logging
• Nov. 2014 – Visual Studio Online outage: a code change increases database usage, shifting the bottleneck to an unmanaged application-level lock
• 2014 – Communication with Cloudera: shared storage layer bottlenecks circumvent the resource management layer

Degraded performance, violated SLOs, system outages
OS / Hypervisor / Containers / VMs
Shared Systems: Storage, Database, Queueing, etc.
Monitors resource usage of each tenant in near real-time
Actively schedules tenants and activities
High-level, centralized policies encapsulate the resource management logic
Abstractions – not specific to a resource type or system
Achieve different goals: guarantee average latencies, fair-share a resource, etc.
Hadoop Distributed File System (HDFS)
HDFS NameNode: filesystem metadata
HDFS DataNodes: replicated block storage
Example requests: Rename, Read
[Figure: HDFS NameNode and three DataNodes, each shown with a timeline plot (x-axis: time; y-axis ranges 0-500 and 0-12).]
[Diagram: components sharing a single machine – Hadoop YARN NodeManager, YARN Containers running MapReduce Tasks, MapReduce Shuffler, HBase RegionServer, HDFS DataNode, local storage.]
Goals
Coordinated control across processes, machines, and services
Handle system- and application-level resources
Principals: tenants, background tasks
Real-time and reactive
Efficient: only control what is needed
Architecture

Tenant Requests
Workflows
Purpose: identify requests from different users and background activities
e.g., all requests from a tenant over time
e.g., data balancing in HDFS
Unit of resource measurement, attribution, and enforcement
Tracks a request across varying levels of granularity
Orthogonal to threads, processes, network flows, etc.
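A minimal sketch of how a workflow ID might travel with a request, using Python's contextvars; the names (WORKFLOW_ID, handle_request, current_workflow) are illustrative, not Retro's actual API.

import contextvars

# Illustrative only: a context variable carries the workflow ID across all
# work done on behalf of one request, independent of threads or call depth.
WORKFLOW_ID = contextvars.ContextVar("workflow_id", default=None)

def current_workflow():
    return WORKFLOW_ID.get()

def handle_request(workflow_id, operation):
    # Tag everything executed for this request with its workflow ID.
    token = WORKFLOW_ID.set(workflow_id)
    try:
        return operation()
    finally:
        WORKFLOW_ID.reset(token)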
Resources
Purpose: cope with the diversity of resources
What we need:
1. Identify overloaded resources
Slowdown: ratio of how slow the resource is now compared to its baseline performance with no contention
Slowdown = (queue time + execute time) / execute time
e.g., 100 ms queue, 10 ms execute => slowdown 11
2. Identify culprit workflows
Load: fraction of current utilization that we can attribute to each workflow
Load = time spent executing
e.g., 10 ms execute => load 10
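The slowdown and load arithmetic from the slide, written out as small helpers (the function names are illustrative):

def slowdown(queue_ms, execute_ms):
    # Ratio of observed service time (queueing + execution) to the
    # contention-free service time (execution only).
    return (queue_ms + execute_ms) / execute_ms

def load(execute_ms):
    # Utilization attributed to a workflow: the time it spent executing.
    return execute_ms

# Example from the slide: 100 ms queued, 10 ms executing.
assert slowdown(100, 10) == 11
assert load(10) == 10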
Control Points
Goal: enforce resource management decisions
Decoupled from resources
Rate-limit workflows, agnostic to the underlying implementation, e.g., token bucket, priority queue
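For concreteness, a minimal token-bucket rate limiter of the kind a control point might place in front of a request queue; this is a sketch, not Retro's implementation.

import time

class TokenBucket:
    # One bucket per (control point, workflow); the controller adjusts `rate`.
    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec        # tokens refilled per second
        self.capacity = burst           # maximum tokens that can accumulate
        self.tokens = burst
        self.last = time.monotonic()

    def admit(self, cost=1.0):
        # Refill proportionally to elapsed time, then admit if enough tokens remain.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False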
1. Pervasive Measurement
Aggregated locally, then reported centrally once per second
2. Centralized Controller (Retro Controller API)
Global, abstracted view of the system
Policies run in a continuous control loop
3. Distributed Enforcement
Coordinates enforcement using a distributed token bucket
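A sketch of the "aggregate locally, report centrally once per second" step, assuming a simple per-(workflow, resource) accumulator; the class and method names are illustrative.

from collections import defaultdict

class LocalAggregator:
    # Accumulates per-(workflow, resource) measurements in-process and emits
    # one compact report per second to the central controller (not shown here).
    def __init__(self):
        self._totals = defaultdict(lambda: [0.0, 0.0, 0])  # queue_ms, execute_ms, ops

    def record(self, workflow, resource, queue_ms, execute_ms):
        entry = self._totals[(workflow, resource)]
        entry[0] += queue_ms
        entry[1] += execute_ms
        entry[2] += 1

    def flush(self):
        # Called once per second: hand the report to the controller, reset state.
        report, self._totals = dict(self._totals), defaultdict(lambda: [0.0, 0.0, 0])
        return report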
A “control plane” for resource management: the global, abstracted view of the system makes policies easier to write and reusable.
Example: LatencySLO policy
H: high-priority workflows – “200ms average request latency”
L: low-priority workflows – use spare capacity
Monitor latencies, attribute interference, throttle interfering workflows
Select the high-priority workflow W with the worst performance
Weight low-priority workflows by their interference with W
Throttle low-priority workflows proportionally to their weight

foreach candidate in H
  miss[candidate] = latency(candidate) / guarantee[candidate]
W = candidate in H with max miss[candidate]
foreach rsrc in resources()              // calculate importance of each resource for hipri
  importance[rsrc] = latency(W, rsrc) * log(slowdown(rsrc))
foreach lopri in L                       // calculate low priority workflow interference
  interference[lopri] = Σ_rsrc importance[rsrc] * load(lopri, rsrc) / load(rsrc)
foreach lopri in L                       // normalize interference
  interference[lopri] /= Σ_k interference[k]
foreach lopri in L
  if miss[W] > 1                         // throttle
    scalefactor = 1 - α * (miss[W] - 1) * interference[lopri]
  else                                   // release
    scalefactor = 1 + β
  foreach cpoint in controlpoints()      // apply new rates
    set_rate(cpoint, lopri, scalefactor * get_rate(cpoint, lopri))
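The same policy as a runnable Python sketch, assuming a controller object ctrl that exposes the calls used above (latency, slowdown, load, resources, controlpoints, get_rate, set_rate); the alpha and beta defaults are illustrative.

import math

def latency_slo_policy(ctrl, H, L, guarantee, alpha=0.1, beta=0.05):
    # Select the high-priority workflow furthest over its latency guarantee.
    miss = {w: ctrl.latency(w) / guarantee[w] for w in H}
    W = max(H, key=lambda w: miss[w])

    # Weight each resource by how much it contributes to W's latency and by
    # how slowed down it currently is.
    importance = {r: ctrl.latency(W, r) * math.log(ctrl.slowdown(r))
                  for r in ctrl.resources()}

    # Attribute interference with W to each low-priority workflow, normalized.
    interference = {lo: sum(importance[r] * ctrl.load(lo, r) / ctrl.load(r)
                            for r in ctrl.resources())
                    for lo in L}
    total = sum(interference.values()) or 1.0
    interference = {lo: v / total for lo, v in interference.items()}

    # Throttle low-priority workflows in proportion to their interference,
    # or release them if the SLO is currently being met.
    for lo in L:
        if miss[W] > 1:
            scale = 1 - alpha * (miss[W] - 1) * interference[lo]
        else:
            scale = 1 + beta
        for cp in ctrl.controlpoints():
            ctrl.set_rate(cp, lo, scale * ctrl.get_rate(cp, lo))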
Other types of policy…
Bottleneck Fairness: detect the most overloaded resource; fair-share that resource between the tenants using it
Dominant Resource Fairness: estimate demands and capacities from measurements
Policies are concise, not system specific, and any resource can be the bottleneck (the policy doesn’t care)
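A hedged sketch of the Bottleneck Fairness idea against the same assumed ctrl API: find the most slowed-down resource and nudge the tenants using it toward an equal share of its load. The fair-share rule and the alpha step size here are simplifications for illustration, not the paper's exact policy.

def bottleneck_fairness_policy(ctrl, tenants, alpha=0.1):
    # Most overloaded resource = the one with the highest slowdown.
    bottleneck = max(ctrl.resources(), key=ctrl.slowdown)
    users = [t for t in tenants if ctrl.load(t, bottleneck) > 0]
    if not users:
        return
    fair_share = sum(ctrl.load(t, bottleneck) for t in users) / len(users)
    for t in users:
        # Throttle tenants above the fair share, release tenants below it.
        scale = 1 + alpha * (fair_share / ctrl.load(t, bottleneck) - 1)
        for cp in ctrl.controlpoints():
            ctrl.set_rate(cp, t, scale * ctrl.get_rate(cp, t))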
Evaluation
Instrumentation
Retro implementation in Java: an instrumentation library and a central controller implementation
To enable Retro: propagate the workflow ID within the application (like X-Trace, Dapper); instrument resources with wrapper classes
Overheads
Resource instrumentation is automatic using AspectJ
Overall 50-200 lines per system to modify RPCs
Retro overhead: at most 1-2% on throughput and latency
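A sketch of what a resource wrapper might look like for an application-level lock, recording per-workflow queue and execute (hold) times; the aggregator and workflow lookup are the illustrative helpers sketched earlier, not Retro's actual wrapper classes.

import threading
import time

class InstrumentedLock:
    # Wraps a lock and reports, per workflow, how long callers queued for it
    # and how long they held it.
    def __init__(self, aggregator, workflow_fn, name="app-lock"):
        self._lock = threading.Lock()
        self._agg = aggregator          # e.g. the LocalAggregator sketched above
        self._workflow = workflow_fn    # e.g. current_workflow from the earlier sketch
        self._name = name

    def __enter__(self):
        queued_at = time.monotonic()
        self._lock.acquire()
        self._acquired_at = time.monotonic()
        self._queue_ms = (self._acquired_at - queued_at) * 1000
        return self

    def __exit__(self, exc_type, exc, tb):
        execute_ms = (time.monotonic() - self._acquired_at) * 1000
        self._lock.release()
        self._agg.record(self._workflow(), self._name,
                         queue_ms=self._queue_ms, execute_ms=execute_ms)
        return False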
Experiments
Systems: HDFS, HBase, YARN MapReduce, ZooKeeper
Workflows: MapReduce jobs (HiBench), HBase (YCSB), HDFS clients, background data replication, background heartbeats
Resources: CPU, Disk, Network (all systems); Locks, Queues (HDFS, HBase)
Policies: LatencySLO, Bottleneck Fairness, Dominant Resource Fairness
Policies for a mixture of systems, workflows, and resources; results on clusters up to 200 nodes
See the paper for full experiment results; this talk: LatencySLO policy results
[Figure: SLO-normalized latency (log scale, 0.1-1000) vs. time (0-30 minutes) for the workflows HDFS read 8k, HBase read 1 cached row, HBase read 1 row, HBase Cached Table Scan, HDFS mkdir, and HBase Table Scan, against the SLO target; a second panel shows resource slowdown (0-15) over time for Disk, CPU, HDFS NN Lock, HDFS NN Queue, and HBase Queue.]
[Figure: with the SLO policy enabled, SLO-normalized latency (0.1-10) vs. time (0-30 minutes) for the same workflows against the SLO target, plus per-workflow panels for HBase Table Scan (0.2-1), HDFS mkdir (0.5-1), and HBase Cached Table Scan (0.5-1).]
Conclusion
Resource management for shared distributed systems
Centralized resource management
Comprehensive: resources, processes, tenants, background tasks
Abstractions for writing concise, general-purpose policies: Workflows, Resources (slowdown, load), Control points
http://cs.brown.edu/~jcmace
More Related Content

What's hot

Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)Cedric CARBONE
 
DeathStar: Easy, Dynamic, Multi-Tenant HBase via YARN
DeathStar: Easy, Dynamic, Multi-Tenant HBase via YARNDeathStar: Easy, Dynamic, Multi-Tenant HBase via YARN
DeathStar: Easy, Dynamic, Multi-Tenant HBase via YARNDataWorks Summit
 
Introduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseIntroduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseCecile Le Pape
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureDataWorks Summit
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateSteve Loughran
 
Case Study - How Rackspace Query Terabytes Of Data
Case Study - How Rackspace Query Terabytes Of DataCase Study - How Rackspace Query Terabytes Of Data
Case Study - How Rackspace Query Terabytes Of DataSchubert Zhang
 
Couchbase presentation
Couchbase presentationCouchbase presentation
Couchbase presentationsharonyb
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_securityAdam Muise
 
Tuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agentsTuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agentsDataWorks Summit
 
SharePoint on Microsoft Azure
SharePoint on Microsoft AzureSharePoint on Microsoft Azure
SharePoint on Microsoft AzureK.Mohamed Faizal
 
Introduction to couchbase
Introduction to couchbaseIntroduction to couchbase
Introduction to couchbaseDipti Borkar
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera, Inc.
 
HDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFSHDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFSDataWorks Summit
 
Fault Tolerance with Kafka
Fault Tolerance with KafkaFault Tolerance with Kafka
Fault Tolerance with KafkaEdureka!
 
Building Highly Scalable Web Applications
Building Highly Scalable Web ApplicationsBuilding Highly Scalable Web Applications
Building Highly Scalable Web ApplicationsIWMW
 
a Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resourcesa Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application ResourcesDataWorks Summit
 

What's hot (20)

Hadoop Security
Hadoop SecurityHadoop Security
Hadoop Security
 
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
 
DeathStar: Easy, Dynamic, Multi-Tenant HBase via YARN
DeathStar: Easy, Dynamic, Multi-Tenant HBase via YARNDeathStar: Easy, Dynamic, Multi-Tenant HBase via YARN
DeathStar: Easy, Dynamic, Multi-Tenant HBase via YARN
 
Introduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseIntroduction to NoSQL and Couchbase
Introduction to NoSQL and Couchbase
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the Gate
 
Case Study - How Rackspace Query Terabytes Of Data
Case Study - How Rackspace Query Terabytes Of DataCase Study - How Rackspace Query Terabytes Of Data
Case Study - How Rackspace Query Terabytes Of Data
 
Couchbase presentation
Couchbase presentationCouchbase presentation
Couchbase presentation
 
Evolving HDFS to a Generalized Storage Subsystem
Evolving HDFS to a Generalized Storage SubsystemEvolving HDFS to a Generalized Storage Subsystem
Evolving HDFS to a Generalized Storage Subsystem
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_security
 
Apache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduceApache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduce
 
Tuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agentsTuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agents
 
SharePoint on Microsoft Azure
SharePoint on Microsoft AzureSharePoint on Microsoft Azure
SharePoint on Microsoft Azure
 
Introduction to couchbase
Introduction to couchbaseIntroduction to couchbase
Introduction to couchbase
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for Hadoop
 
Disaster Recovery Synapse
Disaster Recovery SynapseDisaster Recovery Synapse
Disaster Recovery Synapse
 
HDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFSHDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFS
 
Fault Tolerance with Kafka
Fault Tolerance with KafkaFault Tolerance with Kafka
Fault Tolerance with Kafka
 
Building Highly Scalable Web Applications
Building Highly Scalable Web ApplicationsBuilding Highly Scalable Web Applications
Building Highly Scalable Web Applications
 
a Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resourcesa Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resources
 

Viewers also liked

National Treasures | Ian Whittock
National Treasures | Ian WhittockNational Treasures | Ian Whittock
National Treasures | Ian WhittockIan Whittock
 
Сәкен Сейфуллин
Сәкен СейфуллинСәкен Сейфуллин
Сәкен СейфуллинNurlan Abilhanov
 
AR3 - Itemized Deductions Schedule
AR3 - Itemized Deductions ScheduleAR3 - Itemized Deductions Schedule
AR3 - Itemized Deductions Scheduletaxman taxman
 
404 c Schneider presentation
404 c Schneider presentation404 c Schneider presentation
404 c Schneider presentationNAGC
 
Tugas jaringan komputer II
Tugas jaringan komputer IITugas jaringan komputer II
Tugas jaringan komputer IIjakaumaril
 
Universidad tecnológica de santiago recinto mao
Universidad tecnológica de santiago recinto maoUniversidad tecnológica de santiago recinto mao
Universidad tecnológica de santiago recinto maoArelis128
 
имаш іңкәр grdggrdgdrg drgrdgdrgdrgdr
имаш іңкәр grdggrdgdrg drgrdgdrgdrgdrимаш іңкәр grdggrdgdrg drgrdgdrgdrgdr
имаш іңкәр grdggrdgdrg drgrdgdrgdrgdrNurlan Abilhanov
 
Manual de juego de tangram on line
Manual de juego de tangram on lineManual de juego de tangram on line
Manual de juego de tangram on lineMariángeles Esteban
 
ACTIVIDADES CONEXIÓN MATEMÁTICA PARA 3º y 4º DE PRIMARIA
ACTIVIDADES CONEXIÓN MATEMÁTICA PARA 3º y 4º DE PRIMARIAACTIVIDADES CONEXIÓN MATEMÁTICA PARA 3º y 4º DE PRIMARIA
ACTIVIDADES CONEXIÓN MATEMÁTICA PARA 3º y 4º DE PRIMARIAMariángeles Esteban
 
Basics of Clustering
Basics of ClusteringBasics of Clustering
Basics of ClusteringB. Nichols
 
2 Marketing Plan - Marketing Strategy & Objectives by www.marketingPlanNOW.com
2 Marketing Plan - Marketing Strategy & Objectives by www.marketingPlanNOW.com2 Marketing Plan - Marketing Strategy & Objectives by www.marketingPlanNOW.com
2 Marketing Plan - Marketing Strategy & Objectives by www.marketingPlanNOW.comwww.marketingPlanMODE.com
 
Corporate income tax.feb.2011
Corporate income tax.feb.2011Corporate income tax.feb.2011
Corporate income tax.feb.2011Phil Taxation
 
Presentación de Netbeans
Presentación de NetbeansPresentación de Netbeans
Presentación de NetbeansMichelle Peña
 
Бунақденелілер класы
Бунақденелілер класыБунақденелілер класы
Бунақденелілер класыBilim All
 
Релаксация
РелаксацияРелаксация
РелаксацияBilim All
 

Viewers also liked (19)

National Treasures | Ian Whittock
National Treasures | Ian WhittockNational Treasures | Ian Whittock
National Treasures | Ian Whittock
 
Сәкен Сейфуллин
Сәкен СейфуллинСәкен Сейфуллин
Сәкен Сейфуллин
 
Rt rw net rani
Rt rw net  raniRt rw net  rani
Rt rw net rani
 
AR3 - Itemized Deductions Schedule
AR3 - Itemized Deductions ScheduleAR3 - Itemized Deductions Schedule
AR3 - Itemized Deductions Schedule
 
404 c Schneider presentation
404 c Schneider presentation404 c Schneider presentation
404 c Schneider presentation
 
tik bab 5
tik bab 5tik bab 5
tik bab 5
 
Tugas jaringan komputer II
Tugas jaringan komputer IITugas jaringan komputer II
Tugas jaringan komputer II
 
Universidad tecnológica de santiago recinto mao
Universidad tecnológica de santiago recinto maoUniversidad tecnológica de santiago recinto mao
Universidad tecnológica de santiago recinto mao
 
имаш іңкәр grdggrdgdrg drgrdgdrgdrgdr
имаш іңкәр grdggrdgdrg drgrdgdrgdrgdrимаш іңкәр grdggrdgdrg drgrdgdrgdrgdr
имаш іңкәр grdggrdgdrg drgrdgdrgdrgdr
 
Manual de juego de tangram on line
Manual de juego de tangram on lineManual de juego de tangram on line
Manual de juego de tangram on line
 
ACTIVIDADES CONEXIÓN MATEMÁTICA PARA 3º y 4º DE PRIMARIA
ACTIVIDADES CONEXIÓN MATEMÁTICA PARA 3º y 4º DE PRIMARIAACTIVIDADES CONEXIÓN MATEMÁTICA PARA 3º y 4º DE PRIMARIA
ACTIVIDADES CONEXIÓN MATEMÁTICA PARA 3º y 4º DE PRIMARIA
 
Basics of Clustering
Basics of ClusteringBasics of Clustering
Basics of Clustering
 
2 Marketing Plan - Marketing Strategy & Objectives by www.marketingPlanNOW.com
2 Marketing Plan - Marketing Strategy & Objectives by www.marketingPlanNOW.com2 Marketing Plan - Marketing Strategy & Objectives by www.marketingPlanNOW.com
2 Marketing Plan - Marketing Strategy & Objectives by www.marketingPlanNOW.com
 
Corporate income tax.feb.2011
Corporate income tax.feb.2011Corporate income tax.feb.2011
Corporate income tax.feb.2011
 
Tutorial eclipse
Tutorial eclipse Tutorial eclipse
Tutorial eclipse
 
Presentación de Netbeans
Presentación de NetbeansPresentación de Netbeans
Presentación de Netbeans
 
Said bayona
Said bayonaSaid bayona
Said bayona
 
Бунақденелілер класы
Бунақденелілер класыБунақденелілер класы
Бунақденелілер класы
 
Релаксация
РелаксацияРелаксация
Релаксация
 

Similar to mace15retro-presentation

Welcome to Hadoop2Land!
Welcome to Hadoop2Land!Welcome to Hadoop2Land!
Welcome to Hadoop2Land!Uwe Printz
 
1. beyond mission critical virtualizing big data and hadoop
1. beyond mission critical   virtualizing big data and hadoop1. beyond mission critical   virtualizing big data and hadoop
1. beyond mission critical virtualizing big data and hadoopChiou-Nan Chen
 
Introduction to databasecasmfnbskdfjnfkjsdnsjkdfn
Introduction to databasecasmfnbskdfjnfkjsdnsjkdfnIntroduction to databasecasmfnbskdfjnfkjsdnsjkdfn
Introduction to databasecasmfnbskdfjnfkjsdnsjkdfnaj01bhisma
 
Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsDataWorks Summit/Hadoop Summit
 
A sdn based application aware and network provisioning
A sdn based application aware and network provisioningA sdn based application aware and network provisioning
A sdn based application aware and network provisioningStanley Wang
 
Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...David Wallom
 
HDFS- What is New and Future
HDFS- What is New and FutureHDFS- What is New and Future
HDFS- What is New and FutureDataWorks Summit
 
Dynamic Provisioning of Data Intensive Computing Middleware Frameworks
Dynamic Provisioning of Data Intensive Computing Middleware FrameworksDynamic Provisioning of Data Intensive Computing Middleware Frameworks
Dynamic Provisioning of Data Intensive Computing Middleware FrameworksLinh Ngo
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsDataWorks Summit
 
State of Resource Management in Big Data
State of Resource Management in Big DataState of Resource Management in Big Data
State of Resource Management in Big DataKhalid Ahmed
 
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld
 
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...DuraSpace
 
Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014Kevin Crocker
 
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...DataWorks Summit
 
Duraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository ServicesDuraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository ServicesMatthew Critchlow
 
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers Red_Hat_Storage
 
Containerizing Traditional Applications
Containerizing Traditional ApplicationsContainerizing Traditional Applications
Containerizing Traditional ApplicationsJim Bugwadia
 
FDS (Sixth Edition) | C1 | Databases and Database Users
FDS (Sixth Edition) | C1 | Databases and Database UsersFDS (Sixth Edition) | C1 | Databases and Database Users
FDS (Sixth Edition) | C1 | Databases and Database UsersHarsh Verdhan Raj
 

Similar to mace15retro-presentation (20)

Welcome to Hadoop2Land!
Welcome to Hadoop2Land!Welcome to Hadoop2Land!
Welcome to Hadoop2Land!
 
1. beyond mission critical virtualizing big data and hadoop
1. beyond mission critical   virtualizing big data and hadoop1. beyond mission critical   virtualizing big data and hadoop
1. beyond mission critical virtualizing big data and hadoop
 
Introduction to databasecasmfnbskdfjnfkjsdnsjkdfn
Introduction to databasecasmfnbskdfjnfkjsdnsjkdfnIntroduction to databasecasmfnbskdfjnfkjsdnsjkdfn
Introduction to databasecasmfnbskdfjnfkjsdnsjkdfn
 
Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the Experts
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
A sdn based application aware and network provisioning
A sdn based application aware and network provisioningA sdn based application aware and network provisioning
A sdn based application aware and network provisioning
 
Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...
 
HDFS- What is New and Future
HDFS- What is New and FutureHDFS- What is New and Future
HDFS- What is New and Future
 
Dynamic Provisioning of Data Intensive Computing Middleware Frameworks
Dynamic Provisioning of Data Intensive Computing Middleware FrameworksDynamic Provisioning of Data Intensive Computing Middleware Frameworks
Dynamic Provisioning of Data Intensive Computing Middleware Frameworks
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the experts
 
State of Resource Management in Big Data
State of Resource Management in Big DataState of Resource Management in Big Data
State of Resource Management in Big Data
 
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
 
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
 
Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014Apache hadoop: POSH Meetup Palo Alto, CA April 2014
Apache hadoop: POSH Meetup Palo Alto, CA April 2014
 
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
 
Duraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository ServicesDuraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository Services
 
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
Red Hat Storage Day Atlanta - Persistent Storage for Linux Containers
 
Containerizing Traditional Applications
Containerizing Traditional ApplicationsContainerizing Traditional Applications
Containerizing Traditional Applications
 
FDS (Sixth Edition) | C1 | Databases and Database Users
FDS (Sixth Edition) | C1 | Databases and Database UsersFDS (Sixth Edition) | C1 | Databases and Database Users
FDS (Sixth Edition) | C1 | Databases and Database Users
 

mace15retro-presentation

  • 1. Targeted Resource Management in Multi-tenant Distributed Systems Jonathan Mace Brown University Peter Bodik MSR Redmond Rodrigo Fonseca Brown University Madanlal Musuvathi MSR Redmond
  • 2. Resource Management in Multi-Tenant Systems 2
  • 3. Failure in availability zone cascades to shared control plane, causes thread pool starvation for all zones • April 2011 – Amazon EBS Failure Resource Management in Multi-Tenant Systems 2
  • 4. Failure in availability zone cascades to shared control plane, causes thread pool starvation for all zones • April 2011 – Amazon EBS Failure Aggressive background task responds to increased hardware capacity with deluge of warnings and logging • Aug. 2012 – Azure Storage Outage Resource Management in Multi-Tenant Systems 2
  • 5. Code change increases database usage, shifts bottleneck to unmanaged application-level lock • Nov. 2014 – Visual Studio Online outage Failure in availability zone cascades to shared control plane, causes thread pool starvation for all zones • April 2011 – Amazon EBS Failure Aggressive background task responds to increased hardware capacity with deluge of warnings and logging • Aug. 2012 – Azure Storage Outage Resource Management in Multi-Tenant Systems 2
  • 6. Code change increases database usage, shifts bottleneck to unmanaged application-level lock • Nov. 2014 – Visual Studio Online outage Failure in availability zone cascades to shared control plane, causes thread pool starvation for all zones • April 2011 – Amazon EBS Failure Aggressive background task responds to increased hardware capacity with deluge of warnings and logging • Aug. 2012 – Azure Storage Outage Resource Management in Multi-Tenant Systems 2 Shared storage layer bottlenecks circumvent resource management layer • 2014 – Communication with Cloudera
  • 7. Code change increases database usage, shifts bottleneck to unmanaged application-level lock • Nov. 2014 – Visual Studio Online outage Failure in availability zone cascades to shared control plane, causes thread pool starvation for all zones • April 2011 – Amazon EBS Failure Aggressive background task responds to increased hardware capacity with deluge of warnings and logging • Aug. 2012 – Azure Storage Outage Resource Management in Multi-Tenant Systems 2 Degraded performance, Violated SLOs, system outages Shared storage layer bottlenecks circumvent resource management layer • 2014 – Communication with Cloudera
  • 8. Code change increases database usage, shifts bottleneck to unmanaged application-level lock • Nov. 2014 – Visual Studio Online outage Failure in availability zone cascades to shared control plane, causes thread pool starvation for all zones • April 2011 – Amazon EBS Failure Aggressive background task responds to increased hardware capacity with deluge of warnings and logging • Aug. 2012 – Azure Storage Outage Resource Management in Multi-Tenant Systems 3 Degraded performance, Violated SLOs, system outages Shared storage layer bottlenecks circumvent resource management layer • 2014 – Communication with Cloudera
  • 9. Code change increases database usage, shifts bottleneck to unmanaged application-level lock • Nov. 2014 – Visual Studio Online outage Failure in availability zone cascades to shared control plane, causes thread pool starvation for all zones • April 2011 – Amazon EBS Failure Aggressive background task responds to increased hardware capacity with deluge of warnings and logging • Aug. 2012 – Azure Storage Outage Resource Management in Multi-Tenant Systems 3 Degraded performance, Violated SLOs, system outages Shared storage layer bottlenecks circumvent resource management layer • 2014 – Communication with Cloudera OS Hypervisor
  • 12. Containers / VMs Shared Systems: Storage, Database, Queueing, etc. 4
  • 15. Monitors resource usage of each tenant in near real-time Actively schedules tenants and activities 5
  • 16. Monitors resource usage of each tenant in near real-time Actively schedules tenants and activities High-level, centralized policies: Encapsulates resource management logic 5
  • 17. Monitors resource usage of each tenant in near real-time Actively schedules tenants and activities High-level, centralized policies: Encapsulates resource management logic Abstractions – not specific to resource type, system Achieve different goals: guarantee average latencies, fair share a resource, etc. 5
  • 18. Hadoop Distributed File System (HDFS) 6
  • 19. Hadoop Distributed File System (HDFS) HDFS NameNode Filesystem metadata 6
  • 20. HDFS DataNode HDFS DataNode HDFS DataNode Hadoop Distributed File System (HDFS) Replicated block storage HDFS NameNode Filesystem metadata 6
  • 21. HDFS DataNode HDFS DataNode HDFS DataNode Hadoop Distributed File System (HDFS) Replicated block storage HDFS NameNode Filesystem metadata Rename 6
  • 22. HDFS DataNode HDFS DataNode HDFS DataNode Hadoop Distributed File System (HDFS) Replicated block storage HDFS NameNode Filesystem metadata Read Rename 6
  • 24-34. (Animated figures: an HDFS NameNode and three HDFS DataNodes serving the requests from the previous slides; plots over time with y-axis ranges 0-500 and 0-12.)
  • 39. Local storage HBase RegionServer HDFS DataNode MapReduce Shuffler Yarn Container Yarn Container Hadoop YARN Container MapReduce Tasks 11
  • 41. Local storage HBase RegionServer HDFS DataNode Hadoop YARN NodeManager Yarn Container Yarn Container Hadoop YARN Container MapReduce Local storage HBase RegionServer HDFS DataNode Hadoop YARN NodeManager Yarn Container Yarn Container Hadoop YARN Container MapReduce Local storage HBase RegionServer HDFS DataNode MapReduce Shuffler Yarn Container Yarn Container Hadoop YARN Container MapReduce Tasks 11
  • 43. Co-ordinated control across processes, machines, and services Goals 12
  • 44. Co-ordinated control across processes, machines, and services Handle system and application level resources Goals 12
  • 45. Co-ordinated control across processes, machines, and services Handle system and application level resources Principals: tenants, background tasks Goals 12
  • 46. Co-ordinated control across processes, machines, and services Handle system and application level resources Principals: tenants, background tasks Real-time and reactive Goals 12
  • 47. Co-ordinated control across processes, machines, and services Handle system and application level resources Principals: tenants, background tasks Real-time and reactive Efficient: Only control what is needed Goals 12
  • 51. Workflows Purpose: identify requests from different users (e.g., all requests from a tenant over time) and background activities (e.g., data balancing in HDFS) 15
  • 52. Workflows Purpose: identify requests from different users (e.g., all requests from a tenant over time) and background activities (e.g., data balancing in HDFS) Unit of resource measurement, attribution, and enforcement 15
  • 53. Workflows Purpose: identify requests from different users (e.g., all requests from a tenant over time) and background activities (e.g., data balancing in HDFS) Unit of resource measurement, attribution, and enforcement Tracks a request across varying levels of granularity Orthogonal to threads, processes, network flows, etc. 15
  • 56. Workflows Resources Purpose: cope with diversity of resources 16
  • 57. Workflows Resources Purpose: cope with diversity of resources What we need: 1. Identify overloaded resources 16
  • 58. Workflows Resources Purpose: cope with diversity of resources What we need: 1. Identify overloaded resources 2. Identify culprit workflows 16
  • 59. Workflows Resources Purpose: cope with diversity of resources What we need: 1. Identify overloaded resources Slowdown Ratio of how slow the resource is now compared to its baseline performance with no contention. 2. Identify culprit workflows 16
  • 60. Workflows Resources Purpose: cope with diversity of resources What we need: 1. Identify overloaded resources Slowdown Ratio of how slow the resource is now compared to its baseline performance with no contention. 2. Identify culprit workflows Load Fraction of current utilization that we can attribute to each workflow 16
  • 61. Workflows Resources 17 Purpose: cope with diversity of resources What we need: 1. Identify overloaded resources Slowdown Ratio of how slow the resource is now compared to its baseline performance with no contention. 2. Identify culprit workflows Load Fraction of current utilization that we can attribute to each workflow
  • 63. Workflows Resources Slowdown = (queue time + execute time) / execute time, e.g. 100ms queue, 10ms execute => slowdown 11. Load = time spent executing, e.g. 10ms execute => load 10. 17 Purpose: cope with diversity of resources What we need: 1. Identify overloaded resources Slowdown Ratio of how slow the resource is now compared to its baseline performance with no contention. 2. Identify culprit workflows Load Fraction of current utilization that we can attribute to each workflow
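To make the slowdown and load bookkeeping concrete, here is a minimal Java sketch (not Retro's actual code; the class and method names are illustrative) that aggregates queue and execute times per resource and per workflow, matching the definitions above:

    import java.util.HashMap;
    import java.util.Map;

    // Per-resource aggregation of slowdown and per-workflow load for one reporting interval.
    class ResourceStats {
        private long queueNanos = 0;
        private long executeNanos = 0;
        private final Map<String, Long> perWorkflowExecuteNanos = new HashMap<>();

        // Record one operation attributed to a workflow.
        synchronized void record(String workflowId, long queuedNanos, long executedNanos) {
            queueNanos += queuedNanos;
            executeNanos += executedNanos;
            perWorkflowExecuteNanos.merge(workflowId, executedNanos, Long::sum);
        }

        // Slowdown = (queue time + execute time) / execute time,
        // e.g. 100ms queued + 10ms executing => slowdown 11.
        synchronized double slowdown() {
            return executeNanos == 0 ? 1.0 : (double) (queueNanos + executeNanos) / executeNanos;
        }

        // Load of a workflow = time it spent executing at this resource,
        // e.g. 10ms executing => load 10 (in ms-equivalent units).
        synchronized long load(String workflowId) {
            return perWorkflowExecuteNanos.getOrDefault(workflowId, 0L);
        }
    }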
  • 65. Workflows Resources Control Points Goal: enforce resource management decisions 18
  • 66. Workflows Resources Control Points Goal: enforce resource management decisions Decoupled from resources Rate-limits workflows, agnostic to underlying implementation e.g., token bucket priority queue 18
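As one possible interpretation of such a control point (the slides leave the underlying implementation open), a per-workflow token bucket that requests pass through before reaching the resource might look like the following Java sketch; the class and its default rate and burst values are illustrative, not Retro's API:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative control point: one token bucket per workflow, rates in operations per second.
    class TokenBucketControlPoint {
        private static final class Bucket {
            private double ratePerSec;   // refill rate, set by the central controller
            private double tokens;       // currently available tokens, capped at burst
            private final double burst;
            private long lastRefillNanos = System.nanoTime();

            Bucket(double ratePerSec, double burst) {
                this.ratePerSec = ratePerSec;
                this.burst = burst;
                this.tokens = burst;
            }

            synchronized void setRate(double ratePerSec) {
                this.ratePerSec = ratePerSec;
                notifyAll();   // wake waiters so they observe the new rate
            }

            synchronized void acquire() throws InterruptedException {
                while (true) {
                    long now = System.nanoTime();
                    tokens = Math.min(burst, tokens + ratePerSec * (now - lastRefillNanos) / 1e9);
                    lastRefillNanos = now;
                    if (tokens >= 1.0) {
                        tokens -= 1.0;
                        return;
                    }
                    // Sleep roughly until one token should be available.
                    long waitMillis = (long) Math.ceil(1000.0 * (1.0 - tokens) / Math.max(ratePerSec, 1e-9));
                    wait(Math.max(1, waitMillis));
                }
            }
        }

        private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();

        // Called on the request path, keyed by the request's workflow ID.
        void throttle(String workflowId) throws InterruptedException {
            bucket(workflowId).acquire();
        }

        // Called by the central controller to apply a new per-workflow rate.
        void setRate(String workflowId, double opsPerSec) {
            bucket(workflowId).setRate(opsPerSec);
        }

        private Bucket bucket(String workflowId) {
            // The default rate and burst here are placeholders; a deployment would configure them.
            return buckets.computeIfAbsent(workflowId, id -> new Bucket(1000.0, 100.0));
        }
    }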
  • 75. Pervasive Measurement 1. Pervasive Measurement Aggregated locally then reported centrally once per second Workflows Resources Control Points 20
  • 76. RetroControllerAPI Pervasive Measurement 1. Pervasive Measurement Aggregated locally then reported centrally once per second 2. Centralized Controller Global, abstracted view of the system Workflows Resources Control Points 20
  • 77. RetroControllerAPI Pervasive Measurement 1. Pervasive Measurement Aggregated locally then reported centrally once per second 2. Centralized Controller Global, abstracted view of the system Policies run in continuous control loop PolicyPolicyPolicy Workflows Resources Control Points 20
  • 78. RetroControllerAPI Pervasive Measurement 1. Pervasive Measurement Aggregated locally then reported centrally once per second 2. Centralized Controller Global, abstracted view of the system Policies run in continuous control loop 3. Distributed Enforcement Co-ordinates enforcement using distributed token bucket PolicyPolicyPolicy Workflows Resources Control Points Distributed Enforcement 20
  • 79. RetroControllerAPI Distributed Enforcement Pervasive Measurement Workflows Resources Control Points PolicyPolicyPolicy “Control Plane” for resource management Global, abstracted view of the system Easier to write Reusable 21
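A minimal sketch of what this continuous control loop could look like, assuming illustrative Policy and ClusterState interfaces (these names are assumptions, not Retro's controller API):

    import java.util.List;

    interface ClusterState {
        void refresh();                   // pull the latest once-per-second reports
    }

    interface Policy {
        void update(ClusterState state);  // inspect measurements, adjust control points
    }

    // Runs policies in a continuous loop over the global, abstracted view of the system.
    class RetroControllerLoop implements Runnable {
        private final ClusterState state;
        private final List<Policy> policies;

        RetroControllerLoop(ClusterState state, List<Policy> policies) {
            this.state = state;
            this.policies = policies;
        }

        @Override
        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                state.refresh();                 // measurements are aggregated locally, reported centrally
                for (Policy p : policies) {
                    p.update(state);             // each policy sets new rates at control points
                }
                try {
                    Thread.sleep(1000);          // reports arrive roughly once per second
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }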
  • 81. 22 Example: LatencySLO H High Priority Workflows Policy “200ms average request latency”
  • 82. 22 Example: LatencySLO H High Priority Workflows Policy L Low Priority Workflows “200ms average request latency” (use spare capacity)
  • 83. 22 Example: LatencySLO H High Priority Workflows Policy L Low Priority Workflows “200ms average request latency” (use spare capacity) monitor latencies
  • 84. 22 Example: LatencySLO H High Priority Workflows Policy L Low Priority Workflows “200ms average request latency” (use spare capacity) monitor latencies attribute interference
  • 85. 22 Example: LatencySLO H High Priority Workflows Policy L Low Priority Workflows “200ms average request latency” (use spare capacity) monitor latencies throttle interfering workflows
  • 86-94. Example: LatencySLO policy (H = high priority workflows, L = low priority workflows). Select the high priority workflow W with the worst performance; weight low priority workflows by their interference with W; throttle low priority workflows proportionally to their weight.
    foreach candidate in H
      miss[candidate] = latency(candidate) / guarantee[candidate]
    W = candidate in H with max miss[candidate]
    foreach rsrc in resources()              // calculate importance of each resource for hipri
      importance[rsrc] = latency(W, rsrc) * log(slowdown(rsrc))
    foreach lopri in L                       // calculate low priority workflow interference
      interference[lopri] = Σ_rsrc importance[rsrc] * load(lopri, rsrc) / load(rsrc)
    foreach lopri in L                       // normalize interference
      interference[lopri] /= Σ_k interference[k]
    foreach lopri in L
      if miss[W] > 1                         // throttle
        scalefactor = 1 – α * (miss[W] – 1) * interference[lopri]
      else                                   // release
        scalefactor = 1 + β
      foreach cpoint in controlpoints()      // apply new rates
        set_rate(cpoint, lopri, scalefactor * get_rate(cpoint, lopri))
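For readers who prefer code to pseudocode, the following is a rough Java rendering of the policy above. The Metrics and ControlPoint interfaces, the update() entry point, and the constructor parameters are assumptions standing in for Retro's controller API; α and β become the alpha and beta tuning constants:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    class LatencySLOPolicy {
        // Assumed interfaces standing in for Retro's measurement and control APIs.
        interface Metrics {
            List<String> resources();
            double latency(String workflow);                   // end-to-end latency
            double latency(String workflow, String resource);  // latency incurred at one resource
            double slowdown(String resource);
            double load(String workflow, String resource);
            double load(String resource);
        }
        interface ControlPoint {
            double getRate(String workflow);
            void setRate(String workflow, double rate);
        }

        private final double alpha;                  // throttling aggressiveness
        private final double beta;                   // release factor when the SLO is met
        private final Map<String, Double> guarantee; // latency SLO per high-priority workflow

        LatencySLOPolicy(double alpha, double beta, Map<String, Double> guarantee) {
            this.alpha = alpha;
            this.beta = beta;
            this.guarantee = guarantee;
        }

        void update(List<String> hipri, List<String> lopri, Metrics m, List<ControlPoint> cpoints) {
            // Select the high-priority workflow W with the worst SLO miss.
            String w = null;
            double worstMiss = Double.NEGATIVE_INFINITY;
            for (String candidate : hipri) {
                double miss = m.latency(candidate) / guarantee.get(candidate);
                if (miss > worstMiss) {
                    worstMiss = miss;
                    w = candidate;
                }
            }
            if (w == null) return;

            // Importance of each resource for W: W's latency there, weighted by the resource's slowdown.
            Map<String, Double> importance = new HashMap<>();
            for (String rsrc : m.resources()) {
                importance.put(rsrc, m.latency(w, rsrc) * Math.log(m.slowdown(rsrc)));
            }

            // Interference of each low-priority workflow with W, normalized over all of L.
            Map<String, Double> interference = new HashMap<>();
            double total = 0;
            for (String lo : lopri) {
                double v = 0;
                for (String rsrc : m.resources()) {
                    double resourceLoad = m.load(rsrc);
                    if (resourceLoad > 0) {
                        v += importance.get(rsrc) * m.load(lo, rsrc) / resourceLoad;
                    }
                }
                interference.put(lo, v);
                total += v;
            }
            if (total <= 0) return;

            // Throttle low-priority workflows proportionally to their interference, or release them.
            for (String lo : lopri) {
                double scale = (worstMiss > 1)
                        ? 1 - alpha * (worstMiss - 1) * (interference.get(lo) / total)
                        : 1 + beta;
                for (ControlPoint cp : cpoints) {
                    cp.setRate(lo, scale * cp.getRate(lo));
                }
            }
        }
    }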
  • 95. Other types of policy… 28
  • 96. Other types of policy… Bottleneck Fairness Detect most overloaded resource Fair-share resource between tenants using it Policy 28
  • 97. Other types of policy… Bottleneck Fairness Detect most overloaded resource Fair-share resource between tenants using it Dominant Resource Fairness Estimate demands and capacities from measurements Policy Policy 28
  • 98. Other types of policy… Bottleneck Fairness Detect most overloaded resource Fair-share resource between tenants using it Dominant Resource Fairness Estimate demands and capacities from measurements Policy Policy Concise Any resources can be bottleneck (policy doesn’t care) Not system specific 28
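As a further illustration of how concise such policies can be under these abstractions, a hedged sketch of the bottleneck fairness idea, with similarly assumed interfaces; equalizing measured load at the bottleneck is one possible reading of "fair-share the resource", not necessarily the paper's exact mechanism:

    import java.util.List;

    class BottleneckFairnessPolicy {
        // Assumed interfaces, mirroring the LatencySLO sketch above.
        interface Metrics {
            List<String> resources();
            double slowdown(String resource);
            double load(String tenant, String resource);
        }
        interface ControlPoint {
            double getRate(String tenant);
            void setRate(String tenant, double rate);
        }

        void update(List<String> tenants, Metrics m, List<ControlPoint> cpoints) {
            // 1. Detect the most overloaded resource.
            String bottleneck = null;
            double worstSlowdown = 1.0;
            for (String rsrc : m.resources()) {
                if (m.slowdown(rsrc) > worstSlowdown) {
                    worstSlowdown = m.slowdown(rsrc);
                    bottleneck = rsrc;
                }
            }
            if (bottleneck == null) return;   // nothing is overloaded

            // 2. Fair-share it: nudge each tenant's rate toward the mean load
            //    among the tenants actually using the bottleneck.
            double totalLoad = 0;
            int users = 0;
            for (String t : tenants) {
                double l = m.load(t, bottleneck);
                if (l > 0) {
                    totalLoad += l;
                    users++;
                }
            }
            if (users == 0) return;
            double fairShare = totalLoad / users;
            for (String t : tenants) {
                double l = m.load(t, bottleneck);
                if (l <= 0) continue;
                double scale = fairShare / l;   // > 1 releases, < 1 throttles
                for (ControlPoint cp : cpoints) {
                    cp.setRate(t, scale * cp.getRate(t));
                }
            }
        }
    }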
  • 101. Instrumentation Retro implementation in Java Instrumentation Library Central controller implementation 30
  • 102. Instrumentation Retro implementation in Java Instrumentation Library Central controller implementation To enable Retro Propagate Workflow ID within application (like X-Trace, Dapper) Instrument resources with wrapper classes 30
  • 103. Instrumentation Retro implementation in Java Instrumentation Library Central controller implementation To enable Retro Propagate Workflow ID within application (like X-Trace, Dapper) Instrument resources with wrapper classes Overheads 31
  • 104. Instrumentation Retro implementation in Java Instrumentation Library Central controller implementation To enable Retro Propagate Workflow ID within application (like X-Trace, Dapper) Instrument resources with wrapper classes Overheads Resource instrumentation automatic using AspectJ 31
  • 105. Instrumentation Retro implementation in Java Instrumentation Library Central controller implementation To enable Retro Propagate Workflow ID within application (like X-Trace, Dapper) Instrument resources with wrapper classes Overheads Resource instrumentation automatic using AspectJ Overall 50-200 lines per system to modify RPCs 31
  • 106. Instrumentation Retro implementation in Java Instrumentation Library Central controller implementation To enable Retro Propagate Workflow ID within application (like X-Trace, Dapper) Instrument resources with wrapper classes Overheads Resource instrumentation automatic using AspectJ Overall 50-200 lines per system to modify RPCs Retro overhead: max 1-2% on throughput, latency 31
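To make the instrumentation steps concrete, a minimal sketch of workflow-ID propagation via a thread-local (in the spirit of X-Trace/Dapper-style context propagation) and a resource wrapper that times operations, the kind of wrapper AspectJ could weave in automatically. None of this is Retro's actual code, and real propagation must also carry the ID across RPC boundaries:

    import java.io.IOException;
    import java.io.InputStream;

    // Carries the current workflow ID on the executing thread.
    final class WorkflowContext {
        private static final ThreadLocal<String> CURRENT = ThreadLocal.withInitial(() -> "unknown");
        static void set(String workflowId) { CURRENT.set(workflowId); }
        static String get() { return CURRENT.get(); }
    }

    // Wraps an InputStream so every read is timed and attributed to the calling thread's workflow.
    class InstrumentedInputStream extends InputStream {
        interface Recorder { void record(String workflowId, String resource, long nanos); }

        private final InputStream delegate;
        private final Recorder recorder;

        InstrumentedInputStream(InputStream delegate, Recorder recorder) {
            this.delegate = delegate;
            this.recorder = recorder;
        }

        @Override
        public int read() throws IOException {
            long start = System.nanoTime();
            try {
                return delegate.read();
            } finally {
                recorder.record(WorkflowContext.get(), "disk-read", System.nanoTime() - start);
            }
        }
    }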
  • 109. Experiments YARN MapReduce ZooKeeper Systems Workflows MapReduce Jobs (HiBench) HBase (YCSB) HDFS clients Background Data Replication Background Heartbeats HDFS HBase 32
  • 110. Experiments YARN MapReduce ZooKeeper Systems Workflows MapReduce Jobs (HiBench) HBase (YCSB) HDFS clients Background Data Replication Background Heartbeats Resources CPU, Disk, Network (All systems) Locks, Queues (HDFS, HBase) HDFS HBase 32
  • 111. Experiments YARN MapReduce ZooKeeper Systems Workflows MapReduce Jobs (HiBench) HBase (YCSB) HDFS clients Background Data Replication Background Heartbeats Resources CPU, Disk, Network (All systems) Locks, Queues (HDFS, HBase) Policies LatencySLO Bottleneck Fairness Dominant Resource Fairness HDFS HBase 32
  • 112. Experiments Policies for a mixture of systems, workflows, and resources Results on clusters up to 200 nodes YARN MapReduce ZooKeeper Systems Workflows MapReduce Jobs (HiBench) HBase (YCSB) HDFS clients Background Data Replication Background Heartbeats Resources CPU, Disk, Network (All systems) Locks, Queues (HDFS, HBase) Policies LatencySLO Bottleneck Fairness Dominant Resource Fairness See paper for full experiment results HDFS HBase 32
  • 113. Experiments Policies for a mixture of systems, workflows, and resources Results on clusters up to 200 nodes YARN MapReduce ZooKeeper Systems Workflows MapReduce Jobs (HiBench) HBase (YCSB) HDFS clients Background Data Replication Background Heartbeats Resources CPU, Disk, Network (All systems) Locks, Queues (HDFS, HBase) Policies LatencySLO Bottleneck Fairness Dominant Resource Fairness See paper for full experiment results This talk LatencySLO policy results HDFS HBase 32
  • 115-127. (Figures: SLO-normalized latency of each workflow over 30 minutes against the SLO target, and the measured slowdown of each resource over the same period. Workflows: HDFS read 8k, HBase read 1 cached row, HBase read 1 row, HBase Cached Table Scan, HDFS mkdir, HBase Table Scan. Resources: Disk, CPU, HDFS NN Lock, HDFS NN Queue, HBase Queue.)
  • 128-139. (Figures: SLO-normalized latency over 30 minutes with the SLO policy enabled, against the SLO target, plus per-workflow panels for HBase Table Scan, HDFS mkdir, and HBase Cached Table Scan.)
  • 141. Resource management for shared distributed systems Conclusion 37
  • 142. Resource management for shared distributed systems Centralized resource management Conclusion 37
  • 143. Resource management for shared distributed systems Centralized resource management Conclusion Comprehensive: resources, processes, tenants, background tasks 37
  • 144. Resource management for shared distributed systems Centralized resource management Conclusion Comprehensive: resources, processes, tenants, background tasks Abstractions for writing concise, general-purpose policies: Workflows Resources (slowdown, load) Control points 37
  • 145. Resource management for shared distributed systems Centralized resource management Conclusion http://cs.brown.edu/~jcmace Comprehensive: resources, processes, tenants, background tasks Abstractions for writing concise, general-purpose policies: Workflows Resources (slowdown, load) Control points