A RTRM Proposal for Multi/Many-Core Platforms and Reconfigurable Applications

A RTRM Proposal for Multi/Many-Core Platforms
and Reconfigurable Applications
P. Bellasi, G. Massari and W. Fornaciari
{bellasi, massari, fornacia}@elet.polimi.it

ReCoSoC 2012

Dipartimento di Elettronica e Informazione
Politecnico di Milano

Last revision Jul, 2 2012

Introduction
Why Run-Time Resource Management?

 Run-Time Resources Management (RTRM) is about
finding the optimal tradeoff between
QoS requirements and resources availability

 Target scenario
Shared HW resources
upcoming many-core devices are complex systems
process variations and run-time issues
Mixed SW workloads
resources sharing and competition
among applications with different and time-varying requirements
 Simple solutions are required
support for frequently changing use-cases
suitable for both critical and best-effort applications
The BarbequeRTRM Framework 2

Introduction
Paper Contribution

 Methodology to support system-wide run-time
resource management
exploiting design-time information
hierarchical and distributed control
http://www.2parma.eu

 BarbequeRTRM Framework
multi-objective optimization strategy
easily portable and modular design
run-time tunable and scalable policies
open source project http://bosp.dei.polimi.it


Introduction
How we compare?

Desirable Properties

De
sig
Co
Clu

ntr

n-T
Ho

ste
Mu

ol-
He

im
Mu

mo

red
Re

Th
ter

eE
lt.
lti-

g.
Resources

co

eo
.P
Re

Re

xp
Po

Ob

Pla
nf.

ry
lat
so
Managers

so

loi
r

jec
tab

/ Ad

tfo

Mo
for
urc

urc

tat
Proposals

ti v

rm
ilit

ap

ion
ms

de
es

es
e

s
y

t

l
StarPU
Binotto et al.

Fu et al.
ACTORS

SEEC

DistRM
BarbequeRTRM


The BarbequeRTRM
System-Wide RTRM: Overall Framework

user-space Application-Specific RTM
Critical Apps Best-Effort Apps Fine grained control on application
a b allocated resources:
A B - task ordering
- virtual processor assignment
RTLib RTLib - DVFS
Dynamic Code - application parameters monitoring
Generation C C
c System-Wide RTRM
Res Accounting Res Partitioning Coarse grained control on platform
G available resources:
Res Abstraction - resource accounting, partitioning
and abstraction
- high-level HW events handling
MRAPI Platform Proxy e.g., critical conditions, faults...
- manage applications priorities
D E - power/thermal “coarse tuning”
kernel
Platform DRV
Platform DRV d
Platform Driver
BarbequeRTRM
supported platforms F
f
Task Mapping I Platform Firmware
Legend
DDM H X SW Interface (API)
e
Y SW/HW Meta-data


The BarbequeRTRM
System-Wide RTRM: Presentation Outline

1
A B - task ordering
RTLib RTLib - DVFS
Generation C C
c 2 System-Wide RTRM
and abstraction
3 - manage applications priorities
kernel
Platform DRV
Platform DRV d
Platform Driver
BarbequeRTRM
f
Legend
e
Y SW/HW Meta-data


The BarbequeRTRM
System-Wide RTRM: Distributed Hierarchical Control

1
A B - task ordering
RTLib RTLib - DVFS
Generation C C
c System-Wide RTRM
and abstraction
kernel
Platform DRV
Platform DRV d
Platform Driver
BarbequeRTRM
f
Legend
e
Y SW/HW Meta-data


The Proposed Control Solution
Distributed Hierarchical Control

 Different subsystems have their own control loop (CL)
System-wide level (resources partitioning, system-wide optimization, ...)
Application specific (application tuning, dynamic memory management, ...)
Firmware/OS level (F/V control, thermal alarms, resource availability, ...)

 FF closed CL
using OP and AWM

 Optimal
user defined goal functions
including overheads

 Robust
 Adaptive

The BarbequeRTRM
System-Wide RTRM: Resource Partitioning Strategy

user-space Application-Specific RTRM
A B - task ordering
RTLib RTLib - DVFS
Generation C C
c 2 System-Wide RTRM
and abstraction
kernel
Platform DRV
Platform DRV d
Platform Driver
BarbequeRTRM
f
Legend
e
Y SW/HW Meta-data


Scheduling Policy
YaMS - A modular multi-objective scheduler

 Introduction of a new modular policy (YaMS)
partition available resources (R) on applications (A)
considering A priorities and R “residual” availabilities
multi-objective optimization
support a set of tunable goals
DONE: performances, overheads,
congestion, fairness
WIP: stability, robustness,
thermal and power
increase overall system value
considering discrete and tunable
improvements
 LP theory, MMKP heuristic
promote scheduling of some AWMs
which improve optimization goals
demote scheduling of others AWMs
which degrade solution metrics
e.g. stability and robustness

Scheduling Policy
YaMS - Scalability

Speedup

+36%

+54%


Scheduling Policy
System-Wide Controller – Overall View

BBQ Validation Policy
- enforce certain control properties
energy budget, stability and robustness
- authorize resources synchronization


Scheduling Policy
System-Wide Controller – Inner-Loop “Scheduling”



Scheduling Policy
System-Wide Controller – Inner-Loop Overheads

Apps with 3 AWM, 3 Clusters => 9 configuration per application
BBQ running on NSJ, 4 CPUs @ 2.5GHz (max)

+
+

+


The BarbequeRTRM
System-Wide RTRM: Platform Integration

user-space Application-Specific RTRM
A B - task ordering
RTLib RTLib - DVFS
Generation C C
c System-Wide RTRM
and abstraction
3 - manage applications priorities
kernel
Platform DRV
Platform DRV d
Platform Driver
BarbequeRTRM
f
Legend
e
Y SW/HW Meta-data


The BarbequeRTRM
Platform Integration Layer

 Support for generic Linux SMP/NUMA machines
portable solution, based on Linux Control Group
CPUs and Memory nodes assignments
Memory amount assignment
CPU bandwidth quota assignment (requires kernel 3.2)
 Support both resources monitoring and control
pre-configured CGroup to define BBQ controlled resources
at Barbeque start, by parsing a pre-configured cgroups
… than Barbeque takes control over these resources
tun-time generation of new CGroup to control applications
requires RTLib integration (of course)

 Working on P2012 integration
new genration many-core platform from STMicroelectronics
first version: 4 cluster with 16 Processing Element each one

Synchronization Policy
System-Wide Controller – Outer-Loop “Synchronization”



Synchronization Policy
System-Wide Controller – Outer-Loop Overheads

min AWM 25% CPU Time, 3 Clusters x 4CPUs => max 48 syncs
BBQ running on NSJ, 4 CPUs @ 2.5GHz (max)

+

Linux kernel 3.2
Creation overheads: ~500ms
Update overheads: ~100ms
+
(1/3 on quadcore i7)
+

+

+
Application dependent

CGroups
PIL


The BarbequeRTRM Framework
Power Optimizations

 Initial experiments on congested workloads
increasing running instances of Bodytrack (PARSEC)
3AWM: [1,2,4] Threads
system-wide power measurements
via the standard IPMI interface

Power Gains
2,3-3,7%

Time Gains
338-625%

X86_64 NUMA machine: 3 Clusters x 4CPUs
BBQ running on NSJ, 4 CPUs @ 800MHz


Conclusions

 Framework for System-Wide RTRM
flexibility and scalability of the RTRM strategy
thanks to its hierarchical and distributed control structure
acceptable overheads for real usage scenarios
including those with variable workload
tunable multi-objective optimization policies
to cope with several design constraints and goals
e.g., performance, power, thermal and reliability, ...
promising results in terms of performance improving
and power consumption reduction
for a highly parallel workload, on a NUMA multi-core architecture

availability of a simple API interface
making straightforward for the programmers to take full advantages
from framework services


Future Works

 Optimization policies
robustness and stability assessment
improving power/energy optimizations
thermal and reliability management

 Platform integration
extended support for Android targets
complete the integration with the P2012 many-core platform


Thanks for your attention!

If you are interested, please check
the project website for further information
and keep update with the developments
http://bosp.dei.polimi.it

A RTRM Proposal for Multi/Many-Core Platforms and Reconfigurable Applications

Recommended

Recommended

More Related Content

Viewers also liked

Viewers also liked (7)

Recently uploaded

Recently uploaded (20)

A RTRM Proposal for Multi/Many-Core Platforms and Reconfigurable Applications