T. Cucinotta – Real-Time Systems Laboratory (ReTiS) – Scuola Superiore Sant’Anna
Virtual Network Functions as Real-time Containers in Private Clouds
IEEE CLOUD 2018
Tommaso Cucinotta
Luca Abeni
Mauro Marinoni
Alessio Balsini
Real-Time Systems Laboratory
Scuola Superiore Sant’Anna
Pisa, Italy
Carlo Vitucci
Ericsson AB
Stockholm, Sweden
Background and motivations #1/2
Widespread evolution of distributed computing
● Cloud Computing, Native Cloud Applications
● Big-Data Processing
● Key related technologies: virtualization (machine, network, storage)
Performance drawbacks of machine virtualization
● vm-exit and vm-resume (privileged instruction traps and IRQs)
● emulation of special CPU registers
● peripheral emulation
Increasing demand for efficient management of resources
● Hardware-assisted virtualization
● Para-virtualization
● OS-level lightweight virtualization (a.k.a. containers)
Background and motivations #2/2
Evolution of network operators’ horizon
● IP convergence (e.g., LTE)
● Massive use of software-based technologies for configuring & managing the network, e.g., Software-Defined Networking (SDN), OpenFlow, P4
● Network Function Virtualization (NFV)
– from physical appliances sized for the peak hour
– to software-only Virtual Network Functions (VNFs) deployed in a flexible NFV Infrastructure, managed as a private cloud
● e.g., elastic VNFs grow and shrink matching the dynamic demand
Reference scenarios
Real-Time Cloud Services with tight interactivity requirements
● per-request processing time in the tens to hundreds of milliseconds
● e.g., on-line/cloud gaming, real-time streaming, ...
VNFs with tight response-time requirements
● virtualized Radio Access Network (vRAN)
● functional split of processing between antenna and NFVI
● e.g., HARQ acks must be sent back within 4 ms or the connection drops
Reference service topologies
Service Topologies
● load-balanced service
● VNF service-chain
● arbitrary DAGs of service compositions
Major focus
● end-to-end processing latency
● stability of the processing latency of each individual instance is critical
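As a rough illustration of how per-instance latency feeds into the end-to-end figure, the sketch below computes the latency of a service DAG as the longest path of per-stage processing latencies. The stage names and latency values are hypothetical, and queueing and network delays are ignored.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def end_to_end_latency(stage_latency_ms, predecessors):
    """Longest-path latency through a DAG of service stages.

    stage_latency_ms: dict stage -> processing latency of that stage (ms)
    predecessors:     dict stage -> iterable of stages it waits for
    """
    finish = {}
    # static_order() yields every stage after all of its predecessors.
    for stage in TopologicalSorter(predecessors).static_order():
        start = max((finish[p] for p in predecessors.get(stage, ())), default=0.0)
        finish[stage] = start + stage_latency_ms[stage]
    return max(finish.values())


# Hypothetical topology: a load balancer fans out to two workers,
# whose results are merged by an aggregator.
latency = {"lb": 1.0, "worker_a": 5.0, "worker_b": 3.0, "agg": 2.0}
deps = {"worker_a": ["lb"], "worker_b": ["lb"], "agg": ["worker_a", "worker_b"]}
print(end_to_end_latency(latency, deps))
```

In this model a latency spike in any stage on the longest path shows up one-to-one in the end-to-end value, which is why the stability of each instance matters.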
Problem presentation
Noisy neighbour (temporal interference) problem
● unstable performance due to shared physical resources
● response times of a service affected by dynamic changes in workload of co-located services
[Figure: a physical host with co-located LXC containers and VMs. Labels: "VM alone" and "2 VMs", with response times of ~30 ms and ~120 ms; tasks τ1 = (30 ms, 150 ms) and τ2 = (50 ms, 200 ms)]
Existing solutions #1/2
Standard practice for performance control in clouds
● elasticity
Existing workarounds for temporal isolation
● no over-subscription (1-to-1 vCPU to pCPU mapping)
● dedicated physical hosts
Major drawbacks
● coarse granularity in infrastructure allocation
– single core (or single host)
● possible under-utilization and energy waste
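The granularity drawback can be made concrete with a small calculation. The sketch below compares the number of cores used by a 1-to-1 mapping with the number needed when services are described by (Q, P) CPU reservations and packed onto cores. The (Q, P) values are hypothetical, and first-fit decreasing is just one simple packing heuristic chosen for illustration.

```python
from fractions import Fraction


def cores_dedicated(reservations):
    """1-to-1 mapping: every service occupies a whole core."""
    return len(reservations)


def cores_shared(reservations):
    """First-fit decreasing packing of (Q, P) reservations onto cores.

    A core accepts a reservation while the sum of Q/P on it stays <= 1,
    which is the EDF bound for reservations with deadline equal to period.
    """
    loads = []  # total bandwidth already placed on each core
    for bw in sorted((Fraction(q, p) for q, p in reservations), reverse=True):
        for i, load in enumerate(loads):
            if load + bw <= 1:
                loads[i] = load + bw
                break
        else:
            loads.append(bw)
    return len(loads)


# Hypothetical (Q, P) pairs in milliseconds.
services = [(8, 18), (5, 20), (10, 40), (3, 10)]
print(cores_dedicated(services), cores_shared(services))
```

Exact fractions are used so that a core loaded to just under 100% is not rejected because of floating-point rounding.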
Existing solutions #2/2
CPU over-subscription
Temporal interferences ☹
Unstable performance ☹
Tail latency out of control ☹
NFVI capacity saturation ☺
Energy-efficient NFVI ☺
No CPU over-subscription
No temporal interferences ☺
Stable performance ☺
Controlled tail latency ☺
Under-utilization of NFVI ☹
Inefficient energy management ☹
Proposed approach #1/2
Main elements of our proposal
● focus on LXC+Linux (applicable to KVM as well)
● OS-level scheduler
– providing reservation-based scheduling
– H-CBS: hierarchical extension of SCHED_DEADLINE in the Linux kernel
● real-time services in LXC containers with a guaranteed overall (runtime, period) for the container
● use the Compositional Scheduling Framework (CSF) analysis for a sound, mathematically correct set-up (Shin & Lee, RTSS 2004)
Proposed approach #2/2
SCHED_DEADLINE
● in mainline kernel since v3.14
● reservation-based paradigm
– runtime (Q) guaranteed on CPU
– within relative deadline (D)
– every given period (P)
● guarantee for periodic tasks with (C, D=T)
– if Q = C_WCET and P = T_MIN, then all deadlines are respected (P-EDF case)
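To show how the (Q, D, P) triple maps onto the mainline interface, here is a minimal sketch that fills in a `sched_attr` structure and passes it to the `sched_setattr` system call through `ctypes`. It makes several assumptions: Linux on x86_64 (the syscall number 314 is specific to that architecture), the original 48-byte layout of `struct sched_attr`, and enough privileges to request the policy. The helper names `make_attr` and `set_deadline` are made up for this example.

```python
import ctypes
import os

SCHED_DEADLINE = 6        # policy number used by the Linux kernel
SYS_SCHED_SETATTR = 314   # assumption: x86_64; other architectures differ


class SchedAttr(ctypes.Structure):
    """Mirror of the kernel's struct sched_attr (original 48-byte layout)."""
    _fields_ = [
        ("size", ctypes.c_uint32),
        ("sched_policy", ctypes.c_uint32),
        ("sched_flags", ctypes.c_uint64),
        ("sched_nice", ctypes.c_int32),
        ("sched_priority", ctypes.c_uint32),
        ("sched_runtime", ctypes.c_uint64),   # Q, in nanoseconds
        ("sched_deadline", ctypes.c_uint64),  # D, in nanoseconds
        ("sched_period", ctypes.c_uint64),    # P, in nanoseconds
    ]


def make_attr(q_ms, d_ms, p_ms):
    """Build the attribute block for a (Q, D, P) reservation given in ms."""
    if not 0 < q_ms <= d_ms <= p_ms:
        raise ValueError("need 0 < Q <= D <= P")
    to_ns = lambda ms: int(round(ms * 1_000_000))
    return SchedAttr(
        size=ctypes.sizeof(SchedAttr),
        sched_policy=SCHED_DEADLINE,
        sched_runtime=to_ns(q_ms),
        sched_deadline=to_ns(d_ms),
        sched_period=to_ns(p_ms),
    )


def set_deadline(pid, q_ms, d_ms, p_ms):
    """Ask the kernel to schedule `pid` (0 = calling thread) with (Q, D, P)."""
    libc = ctypes.CDLL(None, use_errno=True)
    attr = make_attr(q_ms, d_ms, p_ms)
    ret = libc.syscall(ctypes.c_long(SYS_SCHED_SETATTR), ctypes.c_int(pid),
                       ctypes.byref(attr), ctypes.c_uint(0))
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


attr = make_attr(8, 18, 18)
print(attr.sched_runtime, attr.sched_deadline, attr.sched_period)
```

Calling `set_deadline(0, 8, 18, 18)` would then request an 8 ms every 18 ms reservation for the calling thread; the kernel may refuse it, for example when the caller lacks privileges or the admission test fails.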
H-CBS
● hierarchical extension of SCHED_DEADLINE
● POSIX fixed-priority (FP) scheduler nested in (Q, D=P) reservations
● multi-threaded software component(s) (e.g., LXC, KVM, JVM, Apache, etc.) are guaranteed an overall Q every P on the CPU(s)
● guarantee from CSF analysis: for a set of RT tasks with rtprio, C_WCET and T_MIN params, all deadlines are met with proper reservation params (Q, P)
– independently of which other reservations exist and of their actual workload
[Figure: timeline of a reservation, showing runtime Q, relative deadline D and period P]
Preliminary results
Preliminary experiment
● RT task set running in LXC
● the regular CFS scheduler exhibits arbitrarily high tail latency due to interference from the other container (not shown)
With H-CBS
● parameters coherent with CSF (Q=8ms, P=18ms) => no experimental deadline miss
● Q=16ms < Q_MIN=23.5ms according to CSF when P=36ms => no experimental deadline miss (pessimism in CSF)
● Q=32ms << Q_MIN=59.5ms according to CSF when P=72ms => experimental lateness of 0.46 (deadline misses)
Task set:
C (ms)   T (ms)
4.88     30.0
0.56     36.0
10.4     104
4.41     109
20.3     250
[Figure: plot with a "deadline" marker]
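The kind of check that CSF performs can be sketched as follows, using the periodic resource model of Shin & Lee: a reservation (Q, P) is summarised by its supply bound function, and each fixed-priority task must find an instant before its deadline at which its worst-case demand fits within the supply. This is a single-CPU sketch that assumes rate-monotonic priorities and implicit deadlines; it is not necessarily the exact variant of the analysis used for the Q_MIN figures above, so it should not be expected to reproduce them.

```python
import math


def sbf(t, period, budget):
    """Supply bound function of a periodic resource: the minimum CPU time
    a (Q=budget, P=period) reservation supplies in any window of length t."""
    if t <= 0:
        return 0.0
    gap = period - budget
    k = max(math.ceil((t - gap) / period), 1)
    lo = (k + 1) * period - 2 * budget
    hi = (k + 1) * period - budget
    if lo <= t <= hi:
        return t - (k + 1) * gap
    return (k - 1) * budget


def fp_schedulable(tasks, period, budget):
    """tasks: list of (C, T) pairs with deadline equal to T; priorities are
    assumed rate-monotonic (shorter T means higher priority)."""
    tasks = sorted(tasks, key=lambda ct: ct[1])
    for i, (c_i, t_i) in enumerate(tasks):
        higher = tasks[:i]
        # The demand only steps at releases of higher-priority tasks, so it
        # is enough to test those instants plus the deadline itself.
        points = {t_i}
        for _, t_k in higher:
            points.update(m * t_k for m in range(1, int(t_i // t_k) + 1))

        def demand(t):
            return c_i + sum(math.ceil(t / t_k) * c_k for c_k, t_k in higher)

        if not any(demand(t) <= sbf(t, period, budget) for t in points):
            return False
    return True


# (C, T) pairs in milliseconds, taken from the task set on this slide.
TASKS = [(4.88, 30.0), (0.56, 36.0), (10.4, 104), (4.41, 109), (20.3, 250)]
print(fp_schedulable(TASKS, period=18, budget=8))
```

A search for the smallest admissible Q at a given P can be built on top of `fp_schedulable`, for instance by bisection over the budget.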
Conclusions
Proposal to achieve efficient handling of real-time cloud/VNF services
● as real-time services within LXC containers
● under a new reservation-based scheduler on Linux
Advantages of the proposed technique
● reservation parameters let us control tail latency
● preserving energy efficiency typical of highly consolidated workloads
Controlled CPU sharing
No temporal interferences ☺
Stable performance ☺
Controlled tail latency ☺
NFVI capacity saturation ☺
Energy-efficient NFVI ☺
Future work
Modifications to OpenStack and the Tacker NFVI manager to integrate H-CBS
● modified OpenStack as PoC
● real-time scheduling parameters specified at the TOSCA descriptor level and handled all the way through the whole stack: Tacker+OpenStack/Nova+LibVirt+CGroup+Kernel
Modifications to real VNF components
● e.g., OpenAirInterface
Questions?
tommaso.cucinotta@santannapisa.it
http://retis.santannapisa.it/~tommaso
Thanks!
