• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
Exploiting Linux Control Groups for Effective Run-time Resource Management
 

Exploiting Linux Control Groups for Effective Run-time Resource Management

on

  • 721 views

Emerging multi/many-core architectures, targeting both High Performance Computing (HPC) and mobile devices, ...

Emerging multi/many-core architectures, targeting both High Performance Computing (HPC) and mobile devices,
increase the interest for self-adaptive systems, where both applications and computational resources could smoothly adapt
to the changing of the working conditions. In these scenarios, an efficient Run-Time Resource Manager (RTRM) framework
can provide a valuable support to identify the optimal trade-off between the Quality-of-Service (QoS) requirements of the
applications and the time varying resources availability.

This presentation introduces a new approach to the development of a system-wide RTRM featuring: a) a hierarchical and distributed control, b) the exploitation of design-time information, c) a rich multi-objective optimization strategy and d) a portable and modular design based on a set of tunable policies. The framework is already available as an Open Source project, targeting a NUMA architecture and a new generation multi/many-core research platform. First tests show benefits for the execution of parallel applications, the scalability of the proposed multi-objective resources partitioning strategy, and the sustainability of the overheads introduced by the framework.

Statistics

Views

Total Views
721
Views on SlideShare
721
Embed Views
0

Actions

Likes
2
Downloads
5
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

CC Attribution License

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Exploiting Linux Control Groups for Effective Run-time Resource Management Exploiting Linux Control Groups for Effective Run-time Resource Management Presentation Transcript

    • Exploiting Linux Control Groups for Effective Run-time Resource Management P. Bellasi, G. Massari and W. Fornaciari {bellasi, massari, fornacia}@elet.polimi.it Speaker: Prof. William Fornaciari Dipartimento di Elettronica, Informazione e Bioingegneria Politecnico di MilanoLast revision Jan, 18 2013
    • Introduction Why Run-Time Resource Management?  Computing platforms convergence targeting both HPC and high-end embedded and mobile systems parallelism level ranging from few to hundreds of PEs thanks to silicon technology progresses  Emerging new set of non-functional constraints thermal management, system reliability and fault-tolerance area and power are typical design issues embedded systems are loosing exclusiveness effective resource management policies required to properly exploit modern computing platformsExploiting Linux CGroups for Effective RTRM2
    • Introduction How we compare?  Different approaches targeting resources allocation Linux scheduler extensions mostly based on adding new scheduler classes [2,4,7] force the adoption of a customized kernel Virtualization Hypervisor acting as a global system manager Both commercial and open source solutions Commercial: e.g. OpenVZ, VServer, Montavista Linux; Open: e.g. KVM, Linux Containers require HW support on the target system User-space approaches more portable solutions [3,6,11] mostly limited to CPU assignment[2] Bini et. al., “Resource management on multicore systems: The actors approach”. Micro 2011.[3] Blagodurov and Fedorova, “User-level scheduling on numa multicore systems under linux”, Linux Symposium 2011.[4] Fu and Wang., “Utilization-controlled task consolidation for power optimization in multi-core real-time systems”. RTCSA 2011.[6] Hofmeyr et. al.,. “Load balancing on speed”. PpoPP 2010.[7] Li et. al., “Efficient operating system scheduling for performance-asymmetric multi-core architectures”. SC 2007.[11] Sondag and Rajan, “Phase-based tuning for better utilization of performance-asymmetric multicore processors”. CGO 2011.Exploiting Linux CGroups for Effective RTRM3
    • Introduction How we compare?  Different approaches targeting resources allocation Linux scheduler extensions mostly based on adding new scheduler classes [2,4,7] More dynamic usage of Linux Control Groups More dynamic force the adoption of a customized kernel usage of Linux Control Groups to manage multiple resources with aaportable to manage multiple resources with portable Virtualization and modular RTRM running in user-space and modular RTRM running in user-space Hypervisor acting as a global system manager Both commercial and open source solutions Commercial: e.g. OpenVZ, VServer, Montavista Linux; Open: e.g. KVM, Linux Containers require HW support on the target system User-space approaches more portable solutions [36,11] mostly limited to CPU assignment[2] Bini et. al., “Resource management on multicore systems: The actors approach”. Micro 2011.[3] Blagodurov and Fedorova, “User-level scheduling on numa multicore systems under linux”, Linux Symposium 2011.[4] Fu and Wang., “Utilization-controlled task consolidation for power optimization in multi-core real-time systems”. RTCSA 2011.[6] Hofmeyr et. al.,. “Load balancing on speed”. PpoPP 2010.[7] Li et. al., “Efficient operating system scheduling for performance-asymmetric multi-core architectures”. SC 2007.[11] Sondag and Rajan, “Phase-based tuning for better utilization of performance-asymmetric multicore processors”. CGO 2011.Exploiting Linux CGroups for Effective RTRM4
    • Resources Partitioning Mechanism Why Linux Control Groups?  Standard Linux framework, since 2.6.24 allows to group and bind tasks to a set of system resources e.g. CPUs, memory quota and I/O bandwidth resources could be either shared or exclusive assigned  Allows to define isolated execution environments light-weight virtualization, i.e. low run-time overhead  Mostly used for “quasi-static” configuration administrators tool to shape the system usage We use ititin aadynamic way We use in dynamic way as aaeffective mechanism as effective mechanism to support RTRM to support RTRM Increasing set of resources controllersExploiting Linux CGroups for Effective RTRM5
    • The BarbequeRTRM Overall View on Run-Time Resource Managementuser-space Application-Specific RTM Critical Apps Best-Effort Apps Fine grained control on application a b allocated resources: A B - task ordering - virtual processor assignment RTLib RTLib - DVFS Dynamic Code - application parameters monitoring Generation C C c System-Wide RTRM Res Accounting Res Partitioning Coarse grained control on platform G available resources: Res Abstraction - resource accounting, partitioning and abstraction - high-level HW events handling MRAPI Platform Proxy e.g., critical conditions, faults... - manage applications priorities D E - power/thermal “coarse tuning”kernel Platform DRV Platform DRV d Platform Driver BarbequeRTRMsupported platforms F f Task Mapping I Platform Firmware Legend DDM H X SW Interface (API) e [1] Bellasi et.al., ”A RTRM proposal for multi/many-core platforms Y and reconfigurable applications”. ReCoSoC 2012. SW/HW Meta-dataExploiting Linux CGroups for Effective RTRM6
    • The BarbequeRTRM Overall View on Run-Time Resource Management Congesteduser-space workloads Application-Specific RTM Critical Apps Best-Effort Apps Fine grained control on application a b allocated resources: A B - task ordering - virtual processor assignment RTLib RTLib - DVFS Dynamic Code - application parameters monitoring Generation C C c Extend advanced and System-Wide RTRM Res Accounting Res Partitioning efficient resources control Coarse grained control on platform G capabilityaccounting, partitioning offered by available resources: Regular Workload Res Abstraction - resource modern Linux Kernels and abstraction - high-level HW events handling MRAPI CGroups Platform Proxy e.g., critical conditions, faults... with suitable - manage applications priorities D E - power/thermal “coarse tuning”kernel resources partitioning Platform DRV Platform DRV d policies Platform Driver BarbequeRTRMsupported platforms F running in user-space f Task Mapping I Platform Firmware CGroups based Legend DDM resourcesH X SW Interface (API) abstraction layer e Y SW/HW Meta-dataExploiting Linux CGroups for Effective RTRM7
    • Experimental Setup Hardware Platform and Workloads  Workloads: increasing number of concurrently running applications Bodytrack (BT) (PARSEC v2.1) modified to be run-time tunable and integrated with the BarbequeRTRM https://bitbucket.org/bosp/benchmarks-parsec  Platform: Quad-Core AMD Opteron 8378 4 core host partition, 3x4 CPUs accelerator partition running up to 2.8GHz , 16 Processing Elements (PE) CPUFreq and its on-demand policy Cgroups Managed Device PartitionLinux Goal: assess framework capability to Host efficiently manage resources on increasingly congested workload scenariosExploiting Linux CGroups for Effective RTRM8
    • Experimental Setup Metrics Collection  Compare Bodytrack original vs integrated version using same maximum amount of thread the BBQ Managed version could reduce this number at Run-Time  Original version controlled by Linux scheduler, integrated version managed by BarbequeRTRM *  Performances profiling using standard frameworks IPMI Interface for system-wide power consumption [W] Using Linux perf framework to collect HW/SW performance counter (*) The lower the better, for all metrics but the IPCExploiting Linux CGroups for Effective RTRM9
    • Results Workload Burst Performance Comparison Completion Time CPU Migrations CTX Switches Power [W] BBQ managed apps pinned improved code to assigned CPUs execution efficiency IPC: 1.080 => 1.235 A B 1 Thread High System congestion BBQ partially serialize the execution of IPC: 1.070 => 1.325 concurrent workloads C D 8 Threads Reduced OS overhead Improvements [%] - BBQ Manged vs Unmanaged Improved code efficiency > x1.3 faster A Up to x6 more B energy efficient C DStatistics based on: 30 runs, 99% confidence intervalExploiting Linux CGroups for Effective RTRM10
    • Results Benefits and Loss Comparison Normalized speedups for all collected performance counters Same order of magnitude for 1 “migrations” on lower congestions 2 2 2 2 A C “page faults” and “branch rate” 1 1 2 always degraded because of code organization for BBQ integration loop-unrolling could not be applied, but... an improved integration has already been identified 1 1 2 2 Instruction stream optimization 2 2 could be achieved by treading B D compile time optimization with effective resources assignment1 Thread 8 Threadspositive bar corresponds to an improvement while a negative bar represent adeficiency of the managed application with respect of the original oneExploiting Linux CGroups for Effective RTRM11
    • The BarbequeRTRM Framework Conclusions & Future Works  New user-space approach to RTRM exploiting an advanced and efficient resources control framework offered by modern Linux kernels providing a tunable resources partitioning policy  Evaluate the Linux Control Group effectiveness to support mandatory resources assignment to concurrently running applications assessment using a benchmark from PARSEC v2.1 updated to be run-time tunable and integrated with our framework  More than 30% speed-up, up to x6 energy efficiency overall improved instruction stream optimization confirmed by many HW/SW performance counters  Main future activities: integrate more benchmarks and explore more compiler-friendly integrationsExploiting Linux CGroups for Effective RTRM12
    • Thanks for your attention! If you are interested, please checkthe project website for further information and keep update with the developments http://bosp.dei.polimi.it
    • Backup Slides
    • The BarbequeRTRM How we compare? De Desirable Properties sig Co Clu ntr n-T Ho ste Mu ol- He im Mu mo red Re Th ter eE lt. lt g. Resources co i-O eo .P R Re xp Po Pla es n ry lat bje Managers so loi f./A rt a ou tfo Mo for urc tat cti bil da Proposals rce rm ion ms de ve ity es pt s s l StarPU Binotto et al. Fu et al. ACTORS SEEC DistRM BarbequeRTRM P. Bellasi et. al. “A RTRM proposal for Multi/Many-Core platforms and reconfigurable applications” 7th International Workshop on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC12), York, UK, 07/2012.Exploiting Linux CGroups for Effective RTRM15
    • The Proposed Control Solution Distributed Hierarchical Control  Different subsystems have their own control loop (CL) System-wide level (resources partitioning, system-wide optimization, ...) Application specific (application tuning, dynamic memory management, ...) Firmware/OS level (F/V control, thermal alarms, resource availability, ...)  FF closed CL using OP and AWM  Optimal user defined goal functions including overheads  Robust BBQ  AdaptiveExploiting Linux CGroups for Effective RTRM16
    • Scheduling Policy System-Wide Controller – Overall View BBQ Validation Policy - enforce certain control properties energy budget, stability and robustness - authorize resources synchronizationExploiting Linux CGroups for Effective RTRM17
    • Scheduling Policy System-Wide Controller – Inner-Loop “Scheduling” YaMS BBQ Validation Policy - enforce certain control properties energy budget, stability and robustness - authorize resources synchronizationExploiting Linux CGroups for Effective RTRM18
    • Scheduling Policy System-Wide Controller – Inner-Loop Overheads Apps with 3 AWM, 3 Clusters => 9 configuration per application BBQ running on NSJ, 4 CPUs @ 2.5GHz (max) + + +Exploiting Linux CGroups for Effective RTRM19
    • Scheduling Policy YaMS - Scalability Speedup +36% +54%Exploiting Linux CGroups for Effective RTRM20
    • The BarbequeRTRM The Barbeque OpenSource Project (BOSP)  Based on (a customization of) Android building system freely available for download and (automatized) building Framework dependencies External libs, tools, ... Framework Sources BarbequeRTRM, RTLib Framework Tools PyGrill (loggrapher), ... Contributions Tutorials, demo Public GIT repository https://bitbucket.org/bospExploiting Linux CGroups for Effective RTRM21
    • The BarbequeRTRM Framework Why such a name?!? Because of its “sweet analogy” with something everyone knows... QoS Applications how good is the grill the stuff to cook Overheads PriorityCook fast and light how thick is the meat or how much you are hungry Task mapping the chefs secret Mixed Workload sausages, steaks, chops and vegetablesReliability Issuesdropping the flesh Thermal Issues burning the flesh Policy Resources the cooking recipe coals and grill Exploiting Linux CGroups for Effective RTRM 22
    • Synchronization Policy System-Wide Controller – Outer-Loop “Synchronization” BBQ Validation Policy - enforce certain control properties energy budget, stability and robustness - authorize resources synchronizationExploiting Linux CGroups for Effective RTRM23
    • Synchronization Policy System-Wide Controller – Outer-Loop Overheadsmin AWM 25% CPU Time, 3 Clusters x 4CPUs => max 48 syncsBBQ running on NSJ, 4 CPUs @ 2.5GHz (max) + Linux kernel 3.2 Creation overheads: ~500ms Update overheads: ~100ms + (1/3 on quadcore i7) + + + Application dependent CGroups PILExploiting Linux CGroups for Effective RTRM24
    • The BarbequeRTRM Framework Power Optimizations  Initial experiments on congested workloads increasing running instances of Bodytrack (PARSEC) 3AWM: [1,2,4] Threads system-wide power measurements via the standard IPMI interface Power Gains 2,3-3,7% Time Gains 338-625% X86_64 NUMA machine: 3 Clusters x 4CPUs BBQ running on NSJ, 4 CPUs @ 800MHzExploiting Linux CGroups for Effective RTRM25