Autonomic SLA-driven Provisioning for Cloud Applications


Nicolas Bonvin, Thanasis Papaioannou, Karl Aberer

CCGRID 2011, May 23-26 2011, Newport Beach, CA, USA

nicolas.bonvin@epfl.ch
LSIR - EPFL

Cloud Apps – Issue #1 : Placement

● A distributed, component-based application running on an elastic
infrastructure

● Performance of C1, C2 and C3 is probably lower than that of C4
● No information about other VMs colocated on the same server!

[Figure: components C1 to C4 running on VM1, VM2 and VM3, hosted on Server 1 and Server 2]

No control over placement


Cloud Apps – Issue #2 : Instability

● Load-balanced traffic to 4 identical components on 4 identical VMs

[Figure: four replicas of C1 on VM1 to VM4; response times: 100 ms, 100 ms, 100 ms, 100 ms]



– VM performance can vary by up to a factor of 4 [Dej2009]
● Physical server, hypervisor, storage, ...

[Figure: response times on VM1 to VM4: 100 ms, 140 ms, 100 ms, 100 ms]



● Component overloaded

[Figure: response times on VM1 to VM4: 130 ms, 140 ms, 100 ms, 100 ms]



● Component bug, crash, deadlock, ...

[Figure: response times on VM1 to VM4: 130 ms, 140 ms, 100 ms, infinity]



● Failure of C1 on VM4 -> load is rebalanced

[Figure: replicas of C1 on VM1 to VM4 after the failure on VM4]


The application should react early!


Cloud Apps – Overview

● Build for failures
– Do not trust the underlying infrastructure
– Do not trust your components either!
● Components should adapt to changing conditions
– Quickly
– Automatically
– e.g. by replacing a wonky VM with a new one


Scarce:
a framework to build scalable cloud applications

Architecture Overview

● An agent on each server / VM
– Starts, stops and monitors the components
– Makes decisions on behalf of the components
● An agent communicates with other agents
– Routing table
– Status of the server (resource usage)

[Figure: one agent per server (servers labelled A, B, E); agents exchange information by gossiping + broadcast]
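
The exchange of server status between agents could be sketched as follows. This is only an illustration of a generic push-pull gossip step, not Scarce's actual protocol: the `Agent` class, the version counter and the `gossip_round` helper are all hypothetical names introduced here.

```python
import random


class Agent:
    """Hypothetical per-server agent holding a view of every known server."""

    def __init__(self, name):
        self.name = name
        # server name -> (version, resource usage in percent); assumed layout
        self.view = {name: (0, 0)}

    def update_local(self, usage):
        # Publish a fresh measurement of this server with a higher version.
        version, _ = self.view[self.name]
        self.view[self.name] = (version + 1, usage)

    def merge(self, other_view):
        # For every server, keep the entry with the highest version.
        for server, entry in other_view.items():
            if server not in self.view or entry[0] > self.view[server][0]:
                self.view[server] = entry

    def gossip_with(self, peer):
        # Push-pull exchange: both sides end up with the newest entries.
        snapshot = dict(self.view)
        self.merge(peer.view)
        peer.merge(snapshot)


def gossip_round(agents, rng):
    # Each agent contacts one randomly chosen peer; no global coordination.
    for agent in agents:
        peer = rng.choice([a for a in agents if a is not agent])
        agent.gossip_with(peer)


agents = [Agent(n) for n in ("A", "B", "E")]
agents[0].update_local(70)
gossip_round(agents, random.Random(0))
```

Repeating `gossip_round` spreads each server's latest status to all agents without any central board, which is the property the slides rely on.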


An economic approach

● Time is split into epochs (no synchronization between servers)
● Servers charge a virtual rent for hosting a component according to
– Current resource usage (I/O, CPU, ...) of the server
– Technical factors (HW, connectivity, ...)
– Non-technical factors (country stability, ...)


● Components
– Pay virtual rent at each epoch
– Gain virtual money by processing requests
– Make decisions based on balance (= gain – rent)
● Replicate, migrate, suicide, stay

● Virtual rents are updated by gossiping (no centralized board)
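
A minimal sketch of how a component might choose between the four actions from its balance history. The slides do not give the decision rules, so everything below is an assumption for illustration: the `window` of consecutive epochs, the `cheaper_server` and `last_replica` flags, and the rule that a lone replica never removes itself.

```python
def epoch_balance(gain, rent):
    """Balance of one epoch, as defined on the slide: gain minus rent."""
    return gain - rent


def decide(balances, window=3, cheaper_server=False, last_replica=False):
    """Pick an action from the balances of the most recent epochs.

    Hypothetical rules:
    - positive balance for `window` epochs in a row -> replicate
    - negative balance for `window` epochs in a row -> migrate if a cheaper
      server is known, otherwise suicide unless this is the last replica
    - anything else -> stay
    """
    recent = balances[-window:]
    if len(recent) < window:
        return "stay"  # not enough history yet
    if all(b > 0 for b in recent):
        return "replicate"
    if all(b < 0 for b in recent):
        if cheaper_server:
            return "migrate"
        if not last_replica:
            return "suicide"
    return "stay"


history = [epoch_balance(gain, rent) for gain, rent in [(12, 5), (15, 5), (11, 6)]]
action = decide(history)
```

Requiring several consecutive epochs before acting is one way to avoid oscillation when the load fluctuates briefly.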


Economic model (i)

● The rent of a server is different for each component!


Economic model (ii)

          CPU    I/O
VM1       70%    20%
VM2       25%    65%
C1        30%     5%

Where should C1 be placed: VM1 or VM2?

● VM1 and VM2 have an "identical" average resource usage: 45%
● Server rent = the server's resource usage weighted with the component's weights
– Rent for C1 @ VM1 > rent for C1 @ VM2

Multiplexing of server resources
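
The numbers on this slide can be worked through with a small sketch. The exact rent formula is not shown here, so a plain weighted sum is assumed, with C1's CPU and I/O figures (30% and 5%) used as its weights; the function name `rent` is hypothetical.

```python
def rent(server_usage, component_weights):
    """Weighted sum of the server's usage, using the component's weights.

    Assumed formula for illustration only; all values are percentages.
    """
    return sum(component_weights[r] * server_usage[r] for r in component_weights)


vm1 = {"cpu": 70, "io": 20}
vm2 = {"cpu": 25, "io": 65}
c1_weights = {"cpu": 30, "io": 5}

# Both servers look equally loaded when resources are averaged blindly.
avg_vm1 = sum(vm1.values()) / len(vm1)  # 45.0
avg_vm2 = sum(vm2.values()) / len(vm2)  # 45.0

# Seen through C1's weights, the CPU-heavy server is more expensive.
rent_vm1 = rent(vm1, c1_weights)  # 30*70 + 5*20 = 2200
rent_vm2 = rent(vm2, c1_weights)  # 30*25 + 5*65 = 1075
```

Under this assumption the CPU-bound C1 pays roughly twice as much on VM1 as on VM2, so it is steered towards the server whose busy resource it barely uses.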


Economic model (iii)

● Choosing a candidate server j during replication/migration of a
component i
– net benefit maximization

● Two optimization goals:
– high availability through geographical diversity of replicas
– low latency by grouping related components
● g_j: weight related to the proximity of the server location to the
geographical distribution of the client requests to the component
● S_i: the set of servers hosting a replica of component i


SLA Performance Guarantees (i)

● Each component has its own SLA constraints
● SLA derived directly from entry components

[Figure: dependency graph of components C1 to C5; the entry component C1 has an SLA of 500 ms]

● Resp. Time = Service Time + max (Resp. Time of Dependencies)
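
The response time rule above can be evaluated recursively over the dependency graph. In this sketch the edges (C1 calls C2 and C3, C2 calls C4, C3 calls C5) are an assumption read off the figure layout, and the service times are made-up values chosen only to illustrate the arithmetic.

```python
def response_time(component, service_time, dependencies):
    """Service time plus the slowest dependency, applied recursively."""
    deps = dependencies.get(component, [])
    slowest = max(
        (response_time(d, service_time, dependencies) for d in deps),
        default=0,  # leaf components have no dependencies
    )
    return service_time[component] + slowest


# Assumed dependency graph and hypothetical service times in milliseconds.
dependencies = {"C1": ["C2", "C3"], "C2": ["C4"], "C3": ["C5"]}
service_time = {"C1": 100, "C2": 80, "C3": 120, "C4": 150, "C5": 90}

total = response_time("C1", service_time, dependencies)
# C2 branch: 80 + 150 = 230, C3 branch: 120 + 90 = 210, so C1: 100 + 230 = 330
meets_sla = total <= 500
```

Taking the maximum rather than the sum reflects dependencies that are called in parallel, which is what the rule on the slide implies.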


SLA Performance Guarantees (ii)

● SLA propagation from parents to children
● Parent j sends its performance constraints (e.g. response time upper
bound) to its dependencies D(j) :

● Child i computes its own performance constraints :

● : group of constraints sent by the replicas of the parent g


SLA Performance Guarantees (iii)

● SLA propagation from parents to children


Automatic Provisioning

● Usage of allocated resources is maximized :
– autonomic migration / replication / suicide of components
– not enough to ensure end-to-end response time

● Cloud resources managed by framework via cloud API

● Each individual component has to satisfy its own SLA
– SLA easily met -> decrease resources (scale down)
– SLA not met -> increase resources (scale up, scale out)
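
The per-component scaling rule could look like the following sketch. The thresholds are assumptions: the slides do not define what "easily met" means, so a hypothetical `slack` factor is used, and the preference for scaling up before scaling out is also a choice made here.

```python
def provisioning_action(observed_ms, sla_ms, cores, max_cores, slack=0.5):
    """Choose a provisioning action for one component replica.

    observed_ms: measured response time of the component
    sla_ms:      the component's own response time bound
    slack:       hypothetical fraction of the bound below which the SLA
                 counts as "easily met"
    """
    if observed_ms > sla_ms:
        # SLA not met: add cores on this server if possible, else add a server.
        return "scale up" if cores < max_cores else "scale out"
    if observed_ms < slack * sla_ms and cores > 1:
        # SLA easily met: release resources.
        return "scale down"
    return "keep"


action = provisioning_action(observed_ms=620, sla_ms=500, cores=2, max_cores=2)
```

With 2 of 2 cores already in use, a violated SLA leads to "scale out", which would translate into a request for a new VM through the cloud API.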


Adaptivity to slow servers

● Each component keeps statistics about its children
– e.g. 95th percentile response time
● A routing coefficient is computed for each child at each epoch
– Send more requests to better-performing children
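
One simple way to turn the per-child statistics into routing coefficients is inverse-latency weighting. The slides do not specify the formula, so this is a sketch under that assumption; `routing_coefficients` and `pick_child` are hypothetical names, and all response times are assumed to be positive.

```python
import random


def routing_coefficients(p95_ms):
    """Map each child's 95th percentile response time to a share of requests.

    Assumed rule: the share is proportional to 1 / response time, normalized
    so that all coefficients sum to 1.
    """
    inverse = {child: 1.0 / t for child, t in p95_ms.items()}
    total = sum(inverse.values())
    return {child: w / total for child, w in inverse.items()}


def pick_child(coefficients, rng):
    """Choose the child for the next request according to the coefficients."""
    children = list(coefficients)
    weights = [coefficients[c] for c in children]
    return rng.choices(children, weights=weights, k=1)[0]


# A child that is three times slower receives a third of the other's traffic.
coeffs = routing_coefficients({"replica on VM1": 100.0, "replica on VM2": 300.0})
target = pick_child(coeffs, random.Random(42))
```

Recomputing the coefficients at each epoch lets traffic drift away from a server that has become slow without removing it outright.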


Evaluation: Setup

● 5 components, mostly CPU-intensive (w_c >> w_m, w_n, w_d)

[Figure: dependency graph of components C1 to C5; the entry component C1 has an SLA of 500 ms]

● Eight 8-core servers (Intel Core i7 920, 2.67 GHz, 8 GB, Linux 2.6.32-trunk-amd64)
● d=0, C=110, k =10000, xs* = 25%


Adaptation to Varying Load (i)

● 5 rps to 60 rps at minute 8, step 5 rps/min
● Static setup : 2 servers with 2 cores


Adaptation to Varying Load (ii)

● 5 rps to 60 rps at minute 8, step 5 rps/min
● Static setup : 2 servers with 2 cores


Adaptation to Slow Server

● Max 2 cores/server, 25 rps
● At minute 4, a server gets slower (200 ms delay)


Scalability

● Add 5 rps per minute until 150 rps
● Max 6 cores/server


Conclusion

● Framework for building cloud applications
● Elasticity : add/remove resources
● High Availability : software, hardware, network failures
● Scalability : growing load, peaks, scaling down, ...
– Quick replication of busy components
● Load Balancing : load has to be shared by all available servers
– Replication of busy components
– Migration of less busy components
– Reach equilibrium when load is stable
● SLA performance guarantees
– Automatic provisioning
● No synchronization, fully decentralized

