cloud automation. revolutionized.
Intelligent Cloud Automation
- A Research perspective on (semi-)autonomous
cloud management
Erik Elmroth
erik.elmroth@elastisys.com
Unavailable or slow Internet services …
Lost sale?
Lost customer?
Lost reputation?
Probably like everyone else…
• 82% of customers give up on a lost payment transaction*
• 25% of users leave if load time > 4 s**
• 1% reduced sale per 100 ms load time**
• 0.5 s longer load time → 20% reduced income***
Insufficient capacity costs money for service owners!
What do you do …
• if your hotel search takes 5 secs for each hotel?
• if the web session crashes during payment?
* JupiterResearch ** Amazon ***Google
Stakeholders
Resource management objectives
(Energy) Efficiency
Performance Reliability
Today’s IT operations
Motivation: software complexity
Motivation: faults
Question: what is the probability of
a hard drive failure?
In my laptop?
Will happen every few years,
hopefully not right now…
In a large data center?
More than 100k nodes
Will happen during this talk!
Motivation: personnel costs
Question: How many servers can be handled by a system
administrator?
Very old question…
Some numbers:
10 - very complex systems
~300 - standard large-scale organization
Several 1000s – virtualized data center
26k (Facebook 2013)
Higher-level management and better abstractions are needed
Alternative: exponential increase in need for systems management
Motivation: costs
(Semi-)autonomous resource
management
The autonomic approach
• Autonomic computing
– Named after autonomic nervous
system
– Systems manage themselves
according to admin goals
– Self-governing operation of entire
system, not just parts of it
– New components integrate
effortlessly - as a new cell
establishes itself in the body
Autonomic Computing
• IBM initiative in the early 2000s
• Landmark paper published 2003
in IEEE Computer by Kephart and Chess
@ IBM
• Active research field since,
during 2003-2013:
– 200 conferences/workshops
– 8000+ papers
• Lots of funding
– EC FP6, FP7, H2020
– WASP
• Industry uptake
– Many big IT vendors & startups
• Key point
– Self-management of IT systems
Self-management?
• Four aspects of self-management
– Self-configuration
• Configure themselves automatically
• High-level policies (what is desired, not how)
– Self-optimization
• Continually seek ways to improve their operation
• Hundreds of tunable parameters
– Self-healing
• Handle faults and errors
• Analyze information from logs and monitors
– Self-protection
• Malicious attacks
• Cascading failures
• Admin mistakes
The MAPE loop
• Fundamental architecture
– Managed element(s)
• Server, database, storage
system, etc.
– Autonomic manager
• Responsible for:
– Providing its service
– Managing behavior
according to goals
– Interacting with other
autonomic elements
[Figure 2 from Kephart and Chess: Structure of an autonomic element. An autonomic manager (Monitor, Analyze, Plan, Execute, around shared Knowledge) controls a managed element. Elements interact with other elements and with human programmers via their autonomic managers.]
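A minimal sketch of one pass through the Monitor, Analyze, Plan, Execute loop with shared knowledge. The managed element (a server pool with a utilization sensor and a resize actuator), the class names, and the thresholds are all illustrative assumptions, not part of the architecture described in the paper.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ManagedElement:
    """Hypothetical managed element: a server pool with one sensor and one actuator."""
    servers: int = 2
    load: float = 150.0  # offered load in arbitrary work units

    def sense_utilization(self) -> float:
        # Sensor: each server is assumed to handle 100 work units.
        return self.load / (self.servers * 100.0)

    def set_servers(self, n: int) -> None:
        # Actuator: resize the pool.
        self.servers = n


@dataclass
class AutonomicManager:
    element: ManagedElement
    # Knowledge shared by all four phases (assumed target utilization of 0.6).
    knowledge: dict = field(default_factory=lambda: {"target": 0.6, "history": []})

    def monitor(self) -> float:
        u = self.element.sense_utilization()
        self.knowledge["history"].append(u)
        return u

    def analyze(self, u: float) -> bool:
        # A change is needed when utilization is more than 0.1 away from the target.
        return abs(u - self.knowledge["target"]) > 0.1

    def plan(self, u: float) -> int:
        # Smallest pool size that brings utilization down to the target.
        needed = u * self.element.servers / self.knowledge["target"]
        return max(1, math.ceil(needed))

    def execute(self, n: int) -> None:
        self.element.set_servers(n)

    def step(self) -> None:
        u = self.monitor()
        if self.analyze(u):
            self.execute(self.plan(u))


manager = AutonomicManager(ManagedElement())
manager.step()
print(manager.element.servers)
```

Keeping the four phases as separate methods mirrors the figure; a real manager would run `step` periodically and would also talk to other autonomic elements.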
Specifying goals (1/3)
• Rules
– Often simple condition-action pairs
• If something happens, do this
• If something else happens, do that
• …
– Can use more complex languages to express states,
context, etc.
– Explicit enumeration tedious
– Very limited ability to express complex actions
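As an illustration of condition-action pairs, a small sketch of a rule table for a scaling decision. The state fields, thresholds, and action names are hypothetical; the point is that every situation has to be enumerated explicitly.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    condition: Callable[[dict], bool]
    action: str


# Hypothetical thresholds and action names, for illustration only.
RULES = [
    Rule(lambda s: s["cpu"] > 0.80, "add_server"),
    Rule(lambda s: s["cpu"] < 0.20 and s["servers"] > 1, "remove_server"),
]


def decide(state: dict) -> str:
    """Return the action of the first rule whose condition holds, else do nothing."""
    for rule in RULES:
        if rule.condition(state):
            return rule.action
    return "no_op"


print(decide({"cpu": 0.93, "servers": 4}))
```

Anything the rules do not anticipate falls through to `no_op`, which is one way the tedium of explicit enumeration shows up in practice.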
Specifying goals (2/3)
• Utility functions
– Mathematical expressions
– Maps system state to scalar value
– Represents high-level objectives
– What parts of system state to include?
– What should function look like?
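A sketch of what such a utility function might look like, assuming the state consists of response time and pool size. The linear performance term, the cost weight, and the toy performance model are assumptions chosen only to make the trade-off visible.

```python
def utility(response_time_s: float, servers: int,
            target_s: float = 0.5, cost_per_server: float = 0.1) -> float:
    """Map system state to a scalar: reward meeting the target, charge for capacity."""
    # Performance term: 1.0 at or below the target, falling linearly to 0 at twice the target.
    performance = max(0.0, min(1.0, 2.0 - response_time_s / target_s))
    return performance - cost_per_server * servers


def best_size(predict_response_time, candidates):
    """Pick the candidate pool size with the highest utility."""
    return max(candidates, key=lambda n: utility(predict_response_time(n), n))


# Hypothetical performance model: response time inversely proportional to pool size.
print(best_size(lambda n: 2.0 / n, range(1, 9)))
```

The two open questions on the slide map directly onto this sketch: which state variables enter `utility`, and what shape each term should have.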
Specifying goals (3/3)
• Policies
– (higher-level) descriptions of goals and constraints
for operation
– How to map to lower-level behavior?
– Composition of multiple policies
– What high-level language to use?
• Turing-complete?
• No widely used languages available today
• Human operators used to explicit steering
– Not used to indirect goal specification
Autonomic management
techniques - requirements
• Robustness
– Keep things working
– Minimize oscillations or behavioral changes
• Scalability
– Internet-scale: millions of servers and networks,
even more autonomic agents (50 billion devices?)
• Adaptive to changing workloads
– Some methods reliable for certain load patterns, but unstable
once the load or system dynamics change
• Performance
– Need to make decisions fast enough to react in time
– Optimal solutions vs. approximations
• Simplicity
– Key to adoption
– Complex models vs. model-free?
– Learning phase required before deployment?
Gradual transition to autonomic?
1. Collect and aggregate information
– Input to human administrators’ decision-making
2. Decision-support systems suggesting possible
actions by humans
3. Autonomic systems entrusted with lower-level
decisions
4. Over time, less frequent and more high-level
decisions by operator
– Carried out by numerous autonomic actions
at lower level
The nature
of the challenge
Capacity Planning is Hard
Extreme scale
• Huge buildings with servers,
storage equipment, networking, cooling
• A factory for IT services
Extreme load variations
[Figure: Wikipedia: requests to Michael Jackson’s wiki page at the time of his death and funeral service, May to September 2009. x-axis: Date; y-axis: Requests (0 to 1e+06). Labels: Load, Collateral.]
Challenges for today’s Clouds
• Extreme scale
• Extreme load variations
• Low level of determinism and predictability
• No hard performance guarantees
• Data centers consume a lot of energy
• Data centers have low utilization
Need for better resource management
Resource management challenge
• Robustness & performance
• Cost- & energy efficiency
Approach
Autonomic resource management based on
control, analytics, learning, and optimization
[Figure: Monitor, Analyze, Plan, Execute loop around shared Knowledge, connected to the sensors and actuators of the managed object.]
How much and what type
of resources to allocate and
when and where to deploy them?
www.cloudresearch.org
Anomalies vs. bottlenecks
O. Ibidunmoye, F. Hernandez-Rodriguez, and E. Elmroth. Performance Anomaly Detection and
Bottleneck Identification, ACM Computing Surveys, Vol. 48, No. 1, Article no. 4, 2015.
O. Ibidunmoye, A. Rezaie, and E. Elmroth. Adaptive Anomaly Detection in Performance Metric Streams.
IEEE Transactions on Network and Service Management, Accepted, 2017.
O. Ibidunmoye, E.B. Lakew, and E. Elmroth. A Black-box Approach for Detecting Systems Anomalies in
Virtualized Environments. The 2017 International Conference on Cloud and Autonomic Computing (ICCAC
2017), IEEE Computer Society, Accepted, 2017.
Datacenter Landscape Graphs and Coloring
O. Ibidunmoye, T. Metsch, V. Bayon-Molino, E. Elmroth. Performance Anomaly Detection using
Datacenter Landscape Graphs, IWQoS, 2016.
T. Metsch, O. Ibidunmoye, V. Bayon-Molino, J. Butler, F. Hernández-Rodriguez, and E. Elmroth. "Apex
Lake: A Framework for Enabling Smart Orchestration." In Proceedings of the Industrial Track of the 16th
International Middleware Conference, paper 1, ACM, 2015.
Capacity autoscaling: Aspects of the problem
We need to understand
the workloads!
Workload Decomposition
[Figure: Wikipedia workload, January 2013, with daily seasonality. The request series (Requests vs. Day, days 0 to 30) is decomposed as Trend + Seasonality + Residuals.]
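The decomposition into trend, seasonality, and residuals can be sketched with a classical additive scheme. The slides do not say which method produced the figure, so the centered moving average below is an assumed stand-in; it also assumes an odd period and a series covering at least two periods (an even period, such as 24 hourly samples per day, would need a slightly different centered average).

```python
from statistics import mean


def decompose(series, period):
    """Classical additive decomposition: series = trend + seasonality + residuals.

    Assumes an odd period so the centered moving average is symmetric.
    Positions near the edges, where the window does not fit, get a trend of None.
    """
    half = period // 2
    n = len(series)
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = mean(series[t - half:t + half + 1])

    # Average the detrended values at each phase of the cycle.
    by_phase = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            by_phase[t % period].append(series[t] - trend[t])
    phase_mean = [mean(values) for values in by_phase]
    # Center the seasonal component so it averages to zero over one period.
    offset = mean(phase_mean)
    seasonal = [phase_mean[t % period] - offset for t in range(n)]

    residual = [None if trend[t] is None else series[t] - trend[t] - seasonal[t]
                for t in range(n)]
    return trend, seasonal, residual


# Synthetic example: linear trend plus a repeating three-step pattern.
pattern = [10, 0, -10]
series = [100 + 2 * t + pattern[t % 3] for t in range(12)]
trend, seasonal, residual = decompose(series, 3)
print(seasonal[:3])
```

On real traces the residuals are what remains for burst detection and anomaly analysis once the predictable parts have been removed.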
Sample control theoretic model
G/G/N queue with variable N (#VMs)
Horizontal Capacity Autoscaling
A. Ali-Eldin, M. Kihl, J. Tordsson, and E. Elmroth. Efficient Provisioning of Bursty Scientific
Workloads on the Cloud Using Adaptive Elasticity Control, In Proceedings of the 3rd Workshop
on Scientific Cloud Computing (ScienceCloud 2012), ACM New York, pp. 31-40, 2012.
A. Ali-Eldin, J. Tordsson, and E. Elmroth. An Adaptive Hybrid Elasticity Controller for Cloud
Infrastructures, The 13th IEEE/IFIP Network Operations and Management Symposium
(NOMS 2012), IEEE, pp. 204-212, 2012.
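To make the "queue with variable N" idea concrete, here is a rough rate-based sizing sketch. It is not the controller from the cited papers: the fluid approximation, the target utilization, and all parameter names are assumptions for illustration.

```python
import math


def required_vms(arrival_rate: float, service_rate_per_vm: float,
                 backlog: float = 0.0, interval_s: float = 60.0,
                 target_utilization: float = 0.8) -> int:
    """Estimate N for the next control interval from a rate-based view of the queue."""
    # Work to serve per second: new arrivals plus draining the backlog within one interval.
    demand = arrival_rate + backlog / interval_s
    return max(1, math.ceil(demand / (service_rate_per_vm * target_utilization)))


print(required_vms(arrival_rate=900.0, service_rate_per_vm=100.0, backlog=6000.0))
```

A reactive controller would feed measured rates into such a formula, while a proactive one would feed in predicted rates for the coming interval.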
Proactive scaling for bursty workload
Proactive scaling for strong seasonality
Several Autoscaling Methods + Auto selection
A. Ali-Eldin, J. Tordsson, M. Kihl, and E. Elmroth. WAC: A Workload Analysis and
Classification Tool for On-line Selection of Cloud Auto-scaling Methods, submitted.
Controlling Average Response Time
through Vertical Scaling
E.B. Lakew, A.V. Papadopoulos, M. Maggio, C. Klein, and E. Elmroth. KPI-agnostic Control for Fine-Grained
Vertical Elasticity. In Proceedings of The 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid
Computing (CCGrid 2017), pp. 589-598, 2017.
E.B. Lakew, C. Klein, F. Hernandez-Rodriguez and E. Elmroth. Towards Faster Response Time Models for
Vertical Elasticity. In The 6th Cloud Control Workshop, part of the Proceedings of the 2014 IEEE Conference on
Utility and Cloud Computing (UCC 2014), pp. 560-565, 2014.
A couple of seconds control interval
Controlling Tail Response Time
Response Time Controller:
f (tail response time) -> average response time
Then
Capacity Controller:
f(average response time) -> capacity
E.g., ensuring 95% of requests meet target response time
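The two-stage structure above can be sketched as a cascade: an outer step turns the tail error into a target for the average, and an inner step turns the average error into capacity. The integral-style updates and the gains are assumptions; the cited work uses its own controller designs.

```python
def outer_step(avg_target: float, tail_measured: float, tail_target: float,
               gain: float = 0.2) -> float:
    """Response time controller: adjust the average target from the tail error."""
    # If the measured tail exceeds its target, lower the average target (and vice versa).
    new_target = avg_target - gain * (tail_measured - tail_target)
    return max(0.01, new_target)


def inner_step(capacity: float, avg_measured: float, avg_target: float,
               gain: float = 2.0) -> float:
    """Capacity controller: add capacity when the average is above its target."""
    return max(0.1, capacity + gain * (avg_measured - avg_target))


# One control interval: the 95th percentile is 1.5 s against a 1.0 s target.
avg_target = outer_step(avg_target=0.4, tail_measured=1.5, tail_target=1.0)
capacity = inner_step(capacity=2.0, avg_measured=0.5, avg_target=avg_target)
print(avg_target, capacity)
```

Splitting the problem this way lets the inner loop work on a smooth signal (the average) while the outer loop handles the noisier tail metric.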
Unifying CPU and Memory Control
S. Farokhi, P. Jamshidi, E.B. Lakew, I. Brandic, and E. Elmroth. A Hybrid Cloud Controller for Vertical Memory
Elasticity: A Control-theoretic Approach. Future Generation Computer Systems, Elsevier, Vol. 65, pp. 57-72, 2016.
S. Farokhi, E.B. Lakew, C. Klein, I. Brandic, and E. Elmroth. Coordinating CPU and Memory Elasticity Controllers to
Meet Service Response Time Constraints, The 2015 International Conference on Cloud and Autonomic Computing
(ICCAC), IEEE Computer Society, pp. 69-80, 2015.
Autoscaler subsystems
Core subsystems (required but pluggable for replacement).
Metronome: drives the execution through periodic resize iterations, each setting the new desired size on the cloudpool endpoint.
Monitoring subsystem: a metric streamer collecting data from a metric store (such as OpenTSDB) and a system
historian (capturing monitoring and performance data from the autoscaler itself in a (configurable) metric store).
Prediction subsystem: predicts the machine pool size needed.
Cloudpool proxy: local proxy for sending commands to a remote cloudpool endpoint over the cloudpool REST API.
Alerter: notifies the outside world about interesting events that are raised on the autoscaler's event bus.
Supports additional add-on subsystems, e.g., for accounting or high availability.
VM placement
• Map VMs to resources
• After admission
• After scaling
• To reconsolidate
• Across datacenters
(Geo-placement)
• e.g., linear programming problem
• Within datacenter
• Load mixing
• Multi-dimensional multi-knapsack problem
VM Geo-Placement
Modeling (Cost Goals)
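Minimize (total cost)
\[ TIP = H \ast \sum_{j=1}^{l} \sum_{k=1}^{m} P_{jk} \Big( \sum_{i=1}^{n} x_{ijk} \Big) \]
Subject to
Capacity constraints:
\[ TIC = H \ast \sum_{j=1}^{l} C_{j} \Big( \sum_{i=1}^{n} \sum_{k=1}^{m} x_{ijk} \Big) \]
\[ \sum_{i=1}^{n} ( \square_i \ast \square_i ) > \text{Threshold} \quad (1) \]
(the two factors indexed by i in (1) are illegible in the source)
\[ \forall i \in [1..n] : \; \sum_{j=1}^{l} \sum_{k=1}^{m} x_{ijk} = 1 \quad (2) \]
Load balance constraints:
\[ \forall k \in [1..m] : \; LOC_{min} \le \Big( \sum_{i=1}^{n} \sum_{j=1}^{l} x_{ijk} \Big) / n \le LOC_{max} \quad (3) \]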
W. Li, J. Tordsson, E. Elmroth. Modelling for Dynamic Cloud Scheduling via Migration of Virtual
Machines, 2011 Third IEEE International Conference on Cloud Computing Technology and Science
(Cloudcom 2011), IEEE Computer Society, pp. 163-171, 2011.
D. Espling, L. Larsson, W. Li, J. Tordsson, and E. Elmroth. Modeling and Placement of Structured
Cloud Services, IEEE Transactions on Cloud Computing, Vol. 4, No. 4, pp. 429-439, 2016.
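A small sketch that evaluates the cost objective and checks constraints (2) and (3) for a given assignment. The slide does not define the indices, so reading i as the VM, j as the instance type, and k as the cloud is an assumption, as are the example prices; constraint (1) is left out because its terms are not legible.

```python
def total_price(x, price, hours):
    """TIP = H * sum_j sum_k P[j][k] * sum_i x[i][j][k]."""
    return hours * sum(price[j][k] * x[i][j][k]
                       for i in range(len(x))
                       for j in range(len(price))
                       for k in range(len(price[j])))


def placed_exactly_once(x):
    """Constraint (2): every VM i has exactly one (j, k) with x[i][j][k] = 1."""
    return all(sum(sum(row) for row in x_i) == 1 for x_i in x)


def load_balanced(x, loc_min, loc_max):
    """Constraint (3): the fraction of VMs in each cloud k lies in [loc_min, loc_max]."""
    n, m = len(x), len(x[0][0])
    for k in range(m):
        fraction = sum(x[i][j][k] for i in range(n) for j in range(len(x[i]))) / n
        if not loc_min <= fraction <= loc_max:
            return False
    return True


# Two VMs, two instance types, two clouds (hypothetical hourly prices P[j][k]).
price = [[0.10, 0.12], [0.20, 0.25]]
x = [
    [[1, 0], [0, 0]],  # VM 0: type 0 in cloud 0
    [[0, 0], [0, 1]],  # VM 1: type 1 in cloud 1
]
print(total_price(x, price, hours=24), placed_exactly_once(x), load_balanced(x, 0.25, 0.75))
```

An actual solver would search over x with an integer programming library; this sketch only scores a candidate placement.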
Intra Datacenter Placement
• Workload mixing (time & space)
• Multi-dimensional, multi-knapsack
• Application Specific
• Heterogeneous
hardware
W. Li, J. Tordsson, and E. Elmroth. Virtual Machine Placement for Predictable and Time-
Constrained Peak Loads. In Proceedings of the 8th International Workshop on Economics of
Grids, Clouds, Systems, and Services (GECON 2011), Lecture notes of Computer Science,
Springer-Verlag, Vol. 7150, pp. 120-134, 2012.
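Multi-dimensional placement is usually approached with heuristics in practice. The sketch below uses first-fit decreasing over two dimensions (CPU and memory); this is a generic heuristic named here for illustration, not the method of the cited paper, and the VM and host names are hypothetical.

```python
def first_fit_decreasing(vms: dict, hosts: dict) -> dict:
    """Place VMs on hosts with a first-fit decreasing heuristic over (cpu, mem) demands.

    vms:   name -> (cpu, mem) demand
    hosts: name -> (cpu, mem) capacity
    Returns vm name -> host name; VMs that fit nowhere are left out.
    """
    free = {h: list(cap) for h, cap in hosts.items()}
    placement = {}
    # Largest VMs first, ranked by their biggest demand relative to the largest host.
    max_cap = [max(cap[d] for cap in hosts.values()) for d in range(2)]
    order = sorted(vms, key=lambda v: max(vms[v][d] / max_cap[d] for d in range(2)),
                   reverse=True)
    for vm in order:
        demand = vms[vm]
        for host, room in free.items():
            if all(demand[d] <= room[d] for d in range(2)):
                for d in range(2):
                    room[d] -= demand[d]
                placement[vm] = host
                break
    return placement


hosts = {"h1": (8, 32), "h2": (8, 32)}
vms = {"a": (4, 8), "b": (4, 24), "c": (4, 8), "d": (2, 4)}
print(first_fit_decreasing(vms, hosts))
```

Workload mixing in time would add a further dimension: demands become time series, and VMs whose peaks do not coincide can share a host.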
Relaxed box model virtualization
For enhanced workload mixing (space)
P. Svärd, J. Tordsson, B. Hudzia, E. Elmroth. Hecatonchire: Towards Multi-Host
Virtual Machines by Server Disaggregation. In Euro-Par 2014: Parallel Processing
Workshops, Lecture Notes in Computer Science, Vol. 8806, pp 519-529, 2014.
Decentralized Placement
M. Sedaghat, F. Hernandez-Rodriguez, E. Elmroth, and G. Sarunas. Divide the Task, Multiply the
Outcome: Cooperative VM Consolidation, In Proceedings of The 6th IEEE International
Conference on Cloud Computing Technology and Science (CloudCom 2014), pp. 300-305, 2014.
M. Sedaghat, F. Hernandez-Rodriguez, and E. Elmroth. Autonomic Resource Allocation for Cloud
Data Centers: A Peer to Peer Approach. The ACM Cloud and Autonomic Computing Conference
(CAC'14), pp. 131-140, 2014.
Replication control for fault tolerance
• Multi-task jobs in presence of correlated failures
• Ensure that specified number of tasks complete
with certain probability
• Both
• #replicas
• placement
M. Sedaghat, E. Wadbro, J. Wilkes, S. De Luna, O. Seleznjev, and E. Elmroth. Die-Hard:
Reliable Scheduling to Survive Correlated Failures in Cloud Data Centers, IEEE/ACM Inter-
national Symposium on Cluster, Cloud and Grid Computing, CCGrid 2016, pp. 52-59, 2016.
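For intuition about the replica count, a sketch of the simplest possible case: replica failures that are independent. This deliberately ignores the correlated failures that the cited work addresses, so it is a baseline under a stated simplification rather than the Die-Hard method; the function names and parameters are hypothetical.

```python
from math import comb


def job_success_probability(n_tasks: int, k_required: int,
                            p_fail: float, replicas: int) -> float:
    """P(at least k_required of n_tasks complete), assuming independent replica failures."""
    p_task = 1.0 - p_fail ** replicas  # a task completes if any of its replicas survives
    return sum(comb(n_tasks, j) * p_task ** j * (1.0 - p_task) ** (n_tasks - j)
               for j in range(k_required, n_tasks + 1))


def min_replicas(n_tasks: int, k_required: int, p_fail: float, target: float,
                 max_replicas: int = 10) -> int:
    """Smallest replica count per task that meets the target probability."""
    for r in range(1, max_replicas + 1):
        if job_success_probability(n_tasks, k_required, p_fail, r) >= target:
            return r
    raise ValueError("target not reachable within max_replicas")


print(min_replicas(n_tasks=10, k_required=10, p_fail=0.1, target=0.99))
```

With correlated failures (for example, replicas sharing a rack or power domain), placement matters as much as the count, which is why the slide lists both.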
Live VM migration (without service interruption)
[Table: Pre-copy, Post-copy, and Hybrid migration compared on Continuous service, Resource usage, Robustness, Predictability, and Transparency.]
P. Svärd, S. Walsh, B. Hudzia, J. Tordsson, and E. Elmroth. Principles and Performance
Characteristics of Algorithms for Live VM Migration. ACM Operating Systems Review,
Vol. 49, No. 1, pp. 142-155, 2015.
P. Svärd, B. Hudzia, J. Tordsson, and E. Elmroth. Evaluation of Delta Compression
Techniques for Efficient Live Migration of Large Virtual Machines, ACM SIGPLAN Notices,
Vol. 46, No. 7, ACM New York, NY, USA, pp. 111-120, 2011.
Pre-copy migration Post-copy migration
Energy-efficient management
S. K. Tesfatsion, E. Wadbro, J. Tordsson, A Combined Frequency Scaling and Application Elasticity
Approach for Energy-Efficient Clouds, IEEE Transactions on Cloud Computing, 2014.
Z. Li, S. Tesfatsion, S. Bastani, A. Hassan, E. Elmroth, M. Kihl, and R. Ranjan, A Survey on Modeling
Energy Consumption of Cloud Applications: Deconstruction, State of the Art, and Trade-off Debates.
IEEE Transactions on Sustainable Computing, Accepted, 2017.
Energy-efficient management
Performance-power trade-off
[Figure 5: Achieved performance and power for four different policies (Frequency, VM, Core, Combined). (a) Performance: Throughput (fps) vs. Time (sec), with Target. (b) Power usage: Power (watt) vs. Time (sec).]
[Figure: Throughput and power for two settings, lower power savings (α-0.1, γ-0.9) and higher power savings (α-0.9, γ-0.1). (a) Performance: Throughput (fps) vs. Time (sec), with Target. (b) Power usage: Power (watt) vs. Time (sec).]
Dynamic Resource Rationing
Where to cut when resources are insufficient?
Two approaches
1. Strict QoS-level
adherence
2. Overall cost-benefit
with QoS-level weights
• Constrained optimization
• Substantial dependency
on KPI-type (e.g. response vs. throughput)
• System feedback on KPI and dimmer effect
• Ideally combined with brownout and self-driven capping
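The two approaches can be contrasted with a toy allocation sketch. The slide frames the second approach as a constrained optimization; the weighted sharing below is a much simpler stand-in, and the service names, weights, and demands are hypothetical.

```python
def strict_allocation(demands: dict, priority: list, capacity: float) -> dict:
    """Strict QoS-level adherence: serve services in priority order until capacity runs out."""
    alloc = {}
    for name in priority:
        alloc[name] = min(demands[name], capacity)
        capacity -= alloc[name]
    return alloc


def weighted_allocation(demands: dict, weights: dict, capacity: float) -> dict:
    """Weighted sharing: split capacity by QoS weight, never exceeding a service's demand."""
    alloc = {name: 0.0 for name in demands}
    active = set(demands)
    while active and capacity > 1e-12:
        total_w = sum(weights[n] for n in active)
        # Offer each active service its weighted share of the remaining capacity.
        share = {n: capacity * weights[n] / total_w for n in active}
        satisfied = {n for n in active if alloc[n] + share[n] >= demands[n]}
        if not satisfied:
            for n in active:
                alloc[n] += share[n]
            break
        # Cap satisfied services at their demand and redistribute what is left.
        for n in satisfied:
            capacity -= demands[n] - alloc[n]
            alloc[n] = demands[n]
        active -= satisfied
    return alloc


demands = {"gold": 4, "silver": 4, "bronze": 4}
print(strict_allocation(demands, ["gold", "silver", "bronze"], 6))
print(weighted_allocation(demands, {"gold": 3, "silver": 2, "bronze": 1}, 6))
```

Strict adherence starves the lowest level entirely under shortage, while weighted sharing degrades every level somewhat; which is preferable depends on the KPI type, as noted above.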
E.B. Lakew, C. Klein, F. Hernandez-Rodriguez and E. Elmroth. Performance-Based Service Differentiation in Clouds,
In Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid
2015), IEEE Computer Society, pp. 505-514, 2015.
L. Tomas, E.B. Lakew, and E. Elmroth. Service Level and Performance Aware Dynamic Resource Allocation in
Overbooked Data Centers, The 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
(CCGrid 2016), pp. 42-51, 2016.
A.V. Papadopoulos, J. Krzywda, E. Elmroth, and M. Maggio. Power-Aware Cloud Brownout: response time and power
consumption control, In Proceedings of the 56th IEEE Conference on Decision and Control, Accepted, 2017.
M. Shahrad, C. Klein, L. Zheng, M. Chiang, E. Elmroth, and David Wentzlaff. Incentivizing Self-Resource-Capping with
Graceful Degradation, in Proceedings of the ACM Symposium on Cloud Computing 2017 (SoCC '17), Accepted, 2017.
Addressing
mission critical
and
Internet of Things
applications
Single datacenters and edge clouds
W. Tärneberg, A. Mehta, E. Wadbro, J. Tordsson, J. Eker, M. Kihl, and E. Elmroth. Dynamic Application
Placement in the Telco-cloud, Future Generation Computer Systems, Elsevier, Vol. 70, pp. 163-177, 2017.
J. Krzywda, W. Tärneberg, P-O. Östberg, M. Kihl, and E. Elmroth. Telco Clouds: Modelling and Simulation,
Proceedings of the 5th International Conference on Cloud Computing and Services Science (CLOSER 2015),
SCITEPRESS, pp. 597-609, 2015.
A. Mehta, W. Tärneberg, C Klein, J. Tordsson, M. Kihl, E. Elmroth. How beneficial are intermediate layer Data
Centers in Mobile Edge Networks? In Foundations and Applications of Self* Systems (FAS* 2016), 2016.
A. Mehta, R. Baddour, H. Gustafsson, F. Svensson, and E. Elmroth. Calvin Constrained - A Framework for IoT
Applications in Heterogeneous Environments, The 37th IEEE International Conference on Distributed
Computing (ICDCS 2017), pp. 1063-1073, 2017.
Controlling end-user performance
and network load
Assume these are self-driving cars,
supported by on-line traffic control
Future Computer
Systems will be
defined in Software,
not in Hardware
Software-Defined Infrastructures
• Massive scale disaggregated hardware
• Dynamic definition (and redefinition) of virtual system
• Arbitrarily large “imbalance” between virtual systems’ CPU-
memory-network
• Fewer constraints in resource management optimization
• Higher density
• Greater flexibility
• Allows for easier programming models
G. Goumas, K. Nikas, E.B. Lakew, C. Kotselidis, A. Attwood, E. Elmroth, M. Flouris, N. Foutris, J.
Goodacre, D. Grohmann, V. Karakostas, P. Koutsourakis, M. Kersten, M. Lujàn, E. Rustad, J.
Thomson, L. Tomás, A. Vesterkjaer, J. Webber, Y. Zhang, and N. Koziris. ACTiCLOUD: Enabling the
Next Generation of Cloud Applications. The 37th IEEE International Conference on Distributed
Computing (ICDCS 2017), pp. 1836-1845, 2017.
Additional challenges for SDIs
• All “traditional” resource allocation problems still relevant
• Vertical scaling can be performed on much larger scale!
• Enhanced by non-uniform performance characteristics
• Additional resource management for applications’ virtual
systems (VSys) after resources are assigned
• Hide latencies, move compute to data or data to
compute, trade-offs for performance – consistency
• Feedback between VSys management and the outer SDI
resource allocation
Proposal for SDI management
This training material is part of the FogGuru project that has
received funding from the European Union’s Horizon 2020
research and innovation programme under the Marie
Skłodowska-Curie grant agreement No 765452. The
information and views set out in this material are those of the
author(s) and do not necessarily reflect the official opinion of
the European Union. Neither the European Union institutions
and bodies nor any person acting on their behalf may be held
responsible for the use which may be made of the information
contained therein.

Autonomic Computing by- Sandeep Jadhav
 
Rise of the machines -- Owasp israel -- June 2014 meetup
Rise of the machines -- Owasp israel -- June 2014 meetupRise of the machines -- Owasp israel -- June 2014 meetup
Rise of the machines -- Owasp israel -- June 2014 meetup
 
Brighttalk brining it all together - final
Brighttalk   brining it all together - finalBrighttalk   brining it all together - final
Brighttalk brining it all together - final
 
Meetup 10 here&now_megatriscomp_design_methodparti_v1
Meetup 10 here&now_megatriscomp_design_methodparti_v1Meetup 10 here&now_megatriscomp_design_methodparti_v1
Meetup 10 here&now_megatriscomp_design_methodparti_v1
 
Meetup 10 here&now: Megatris Comp design method (Part 1)
Meetup 10 here&now: Megatris Comp design method (Part 1)Meetup 10 here&now: Megatris Comp design method (Part 1)
Meetup 10 here&now: Megatris Comp design method (Part 1)
 
On the Application of AI for Failure Management: Problems, Solutions and Algo...
On the Application of AI for Failure Management: Problems, Solutions and Algo...On the Application of AI for Failure Management: Problems, Solutions and Algo...
On the Application of AI for Failure Management: Problems, Solutions and Algo...
 
computer Unit 8
computer Unit 8computer Unit 8
computer Unit 8
 
Distributed Systems in Data Engineering
Distributed Systems in Data EngineeringDistributed Systems in Data Engineering
Distributed Systems in Data Engineering
 
Machine learning
Machine learningMachine learning
Machine learning
 
04.project billing system
04.project billing system04.project billing system
04.project billing system
 
Artificial Intelligence: Agent Technology
Artificial Intelligence: Agent TechnologyArtificial Intelligence: Agent Technology
Artificial Intelligence: Agent Technology
 
Artificial Intelligence Primer
Artificial Intelligence PrimerArtificial Intelligence Primer
Artificial Intelligence Primer
 
INTERNAL Assign no 207( JAIPUR NATIONAL UNI)
INTERNAL Assign no   207( JAIPUR NATIONAL UNI)INTERNAL Assign no   207( JAIPUR NATIONAL UNI)
INTERNAL Assign no 207( JAIPUR NATIONAL UNI)
 
I learning lot
I learning lotI learning lot
I learning lot
 
Gov civilworkshop
Gov civilworkshopGov civilworkshop
Gov civilworkshop
 
MIS.pptx
MIS.pptxMIS.pptx
MIS.pptx
 
Ecm implementation planning_workshop_hospital_sample
Ecm implementation planning_workshop_hospital_sampleEcm implementation planning_workshop_hospital_sample
Ecm implementation planning_workshop_hospital_sample
 
CWIN17 Utrecht / cg u services - frank van der wal
CWIN17 Utrecht / cg u services - frank van der walCWIN17 Utrecht / cg u services - frank van der wal
CWIN17 Utrecht / cg u services - frank van der wal
 
5 Reasons Observability of your Mainframe and IBM i is Critical for IT
5 Reasons Observability of your Mainframe and IBM i is Critical for IT 5 Reasons Observability of your Mainframe and IBM i is Critical for IT
5 Reasons Observability of your Mainframe and IBM i is Critical for IT
 
Building Event Driven Systems
Building Event Driven SystemsBuilding Event Driven Systems
Building Event Driven Systems
 

More from FogGuru MSCA Project

The magical recipe for speaking in public
The magical recipe for speaking in publicThe magical recipe for speaking in public
The magical recipe for speaking in publicFogGuru MSCA Project
 
Introduction to the economics of innovation
Introduction to the economics of innovationIntroduction to the economics of innovation
Introduction to the economics of innovationFogGuru MSCA Project
 
Introduction to entrepreneurial finances
Introduction to entrepreneurial financesIntroduction to entrepreneurial finances
Introduction to entrepreneurial financesFogGuru MSCA Project
 
Financing Innovation and Intellectual property
Financing Innovation and Intellectual property Financing Innovation and Intellectual property
Financing Innovation and Intellectual property FogGuru MSCA Project
 
Creating Competitive Advantage: Resource and Capabilities
Creating Competitive Advantage: Resource and Capabilities Creating Competitive Advantage: Resource and Capabilities
Creating Competitive Advantage: Resource and Capabilities FogGuru MSCA Project
 
Business growth: material for exercises
Business growth: material for exercisesBusiness growth: material for exercises
Business growth: material for exercisesFogGuru MSCA Project
 
Business growth: material for discussions
Business growth: material for discussions  Business growth: material for discussions
Business growth: material for discussions FogGuru MSCA Project
 
Management, organization and leadership
Management, organization and leadershipManagement, organization and leadership
Management, organization and leadershipFogGuru MSCA Project
 
Writing code well: tools, tips and tricks
Writing code well: tools, tips and tricks Writing code well: tools, tips and tricks
Writing code well: tools, tips and tricks FogGuru MSCA Project
 
How to carry out bibliographic research
How to carry out bibliographic research How to carry out bibliographic research
How to carry out bibliographic research FogGuru MSCA Project
 
Guidelines for empirical evaluations
Guidelines for empirical evaluationsGuidelines for empirical evaluations
Guidelines for empirical evaluationsFogGuru MSCA Project
 
Business case 1: Soft mobility in Rennes Metropole
Business case 1: Soft mobility in Rennes Metropole Business case 1: Soft mobility in Rennes Metropole
Business case 1: Soft mobility in Rennes Metropole FogGuru MSCA Project
 

More from FogGuru MSCA Project (20)

Assignments
AssignmentsAssignments
Assignments
 
The magical recipe for speaking in public
The magical recipe for speaking in publicThe magical recipe for speaking in public
The magical recipe for speaking in public
 
Introduction to the economics of innovation
Introduction to the economics of innovationIntroduction to the economics of innovation
Introduction to the economics of innovation
 
Introduction to entrepreneurial finances
Introduction to entrepreneurial financesIntroduction to entrepreneurial finances
Introduction to entrepreneurial finances
 
Financing Innovation and Intellectual property
Financing Innovation and Intellectual property Financing Innovation and Intellectual property
Financing Innovation and Intellectual property
 
Creating Competitive Advantage: Resource and Capabilities
Creating Competitive Advantage: Resource and Capabilities Creating Competitive Advantage: Resource and Capabilities
Creating Competitive Advantage: Resource and Capabilities
 
Business growth: material for exercises
Business growth: material for exercisesBusiness growth: material for exercises
Business growth: material for exercises
 
Business growth: material for discussions
Business growth: material for discussions  Business growth: material for discussions
Business growth: material for discussions
 
Scale-ups and large companies
Scale-ups and large companiesScale-ups and large companies
Scale-ups and large companies
 
Management, organization and leadership
Management, organization and leadershipManagement, organization and leadership
Management, organization and leadership
 
Key strategies for growth
Key strategies for growthKey strategies for growth
Key strategies for growth
 
Financing growth
Financing growthFinancing growth
Financing growth
 
Machine Learning: exercises
Machine Learning: exercises Machine Learning: exercises
Machine Learning: exercises
 
Introduction to Machine Learning
Introduction to Machine Learning Introduction to Machine Learning
Introduction to Machine Learning
 
Control of computing systems
Control of computing systemsControl of computing systems
Control of computing systems
 
Writing code well: tools, tips and tricks
Writing code well: tools, tips and tricks Writing code well: tools, tips and tricks
Writing code well: tools, tips and tricks
 
How to make a presentation
How to make a presentationHow to make a presentation
How to make a presentation
 
How to carry out bibliographic research
How to carry out bibliographic research How to carry out bibliographic research
How to carry out bibliographic research
 
Guidelines for empirical evaluations
Guidelines for empirical evaluationsGuidelines for empirical evaluations
Guidelines for empirical evaluations
 
Business case 1: Soft mobility in Rennes Metropole
Business case 1: Soft mobility in Rennes Metropole Business case 1: Soft mobility in Rennes Metropole
Business case 1: Soft mobility in Rennes Metropole
 

Recently uploaded

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Intelligent Cloud Automation

  • 1. 1 cloud automation. revolutionized. Intelligent Cloud Automation - A Research perspective on (semi-)autonomous cloud management Erik Elmroth erik.elmroth@elastisys.com Unavailable or slow Internet services … Lost sale? Lost customer? Lost reputation? Probably like everyone else… • 82% of customers give up on a lost payment transaction* • 25% of users leave if load time > 4 s** • 1% reduced sale per 100 ms load time** • 0.5 s longer load time → 20% reduced income*** Insufficient capacity costs money for service owners! What do you do … • if your hotel search takes 5 secs for each hotel? • if the web session crashes during payment? * JupiterResearch ** Amazon ***Google
  • 2. 2 Stakeholders Resource management objectives (Energy) Efficiency Performance Reliability
  • 4. 4 Motivation: faults Question: what is the probability of a hard drive failure? In my laptop? Will happen every few years, hopefully not right now… In a large data center? More than 100k nodes Will happen during this talk! Motivation: personnel costs Question: How many servers can be handled by a system administrator? Very old question… Some numbers: 10 - very complex systems ~300 - standard large-scale organization Several 1000s – virtualized data center 26k (Facebook 2013) Higher-level management and better abstractions are needed Alternative: exponential increase in need for systems management
  • 6. 6 The autonomic approach • Autonomic computing – Named after autonomic nervous system – Systems manage themselves according to admin goals – Self-governing operation of entire system, not just parts of it – New components integrate effortlessly - as a new cell establishes itself in the body Autonomic Computing • IBM initiative in early 2000’s • Landmark paper published 2003 in IEEE Computer by Kephart and Chess @ IBM • Active research field since, during 2003-2013: – 200 conferences/workshops – 8000+ papers • Lots of funding – EC FP6, FP7, H2020 – WASP • Industry uptake – Many big IT vendors & startups • Key point – Self-management of IT systems
  • 7. 7 Self-management? • Four aspects of self-management – Self-configuration • Configure themselves automatically • High-level policies (what is desired, not how) – Self-optimization • Continually seek ways to improve their operation • Hundreds of tunable parameters – Self-healing • Handle faults and errors • Analyze information from logs and monitors – Self-protection • Malicious attacks • Cascading failures • Admin mistakes The MAPE loop • Fundamental architecture – Managed element(s) • Server, database, storage system, etc. – Autonomic manager • Responsible for: – Providing its service – Managing behavior according to goals – Interacting with other autonomic elements [Figure 2 (cropped paper excerpt): Structure of an autonomic element. An autonomic manager (Monitor, Analyze, Plan, Execute, with shared Knowledge) controls a managed element. Elements interact with other elements and with human programmers via their autonomic managers.]
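The Monitor, Analyze, Plan, Execute cycle around shared Knowledge can be sketched in a few lines. This is only an illustration under stated assumptions: the `ManagedElement` (a server pool with a load reading), the `AutonomicManager` class, and the per-server load target are hypothetical names and are not taken from the slides.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ManagedElement:
    """Hypothetical managed element: a server pool with a load sensor."""
    load: float          # offered load, e.g. requests per second
    servers: int = 1     # current pool size (the actuator changes this)


@dataclass
class AutonomicManager:
    """Hypothetical autonomic manager running one MAPE iteration per step()."""
    target_load_per_server: float
    knowledge: list = field(default_factory=list)  # shared history of observations

    def monitor(self, element):
        observation = {"load": element.load, "servers": element.servers}
        self.knowledge.append(observation)
        return observation

    def analyze(self, observation):
        # Servers needed so that no server exceeds the target load.
        needed = max(1, math.ceil(observation["load"] / self.target_load_per_server))
        return needed - observation["servers"]

    def plan(self, delta):
        return {"resize_by": delta} if delta != 0 else None

    def execute(self, element, action):
        if action is not None:
            element.servers += action["resize_by"]

    def step(self, element):
        observation = self.monitor(element)
        self.execute(element, self.plan(self.analyze(observation)))
```

Calling `step()` periodically gives a purely reactive manager; the goal (here a load target per server) stays separate from the mechanism that enforces it.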
  • 8. 8 Specifying goals (1/3) • Rules – Often simple condition-action pairs • If something happens, do this • If something else happens, do that • … – Can use more complex languages to express states, context, etc. – Explicit enumeration tedious – Very limited ability to express complex actions Specifying goals (2/3) • Utility functions – Mathematical expressions – Maps system state to scalar value – Represents high-level objectives – What parts of system state to include? – What should function look like?
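The contrast between condition-action rules and utility functions can be made concrete with a small sketch. All state fields, thresholds, and cost weights below are assumptions chosen for illustration; the slides do not prescribe any of them.

```python
def rule_based_action(state):
    """Condition-action rules: every case has to be enumerated explicitly."""
    if state["response_time_ms"] > 500:
        return "add_server"
    if state["cpu_utilization"] < 0.2 and state["servers"] > 1:
        return "remove_server"
    return "no_op"


def utility(state, price_per_server=1.0, penalty_per_ms=0.01, target_ms=500):
    """Utility function: maps a system state to one scalar (higher is better)."""
    lateness = max(0.0, state["response_time_ms"] - target_ms)
    return -(price_per_server * state["servers"]) - penalty_per_ms * lateness


def best_state(candidates):
    """Pick the candidate state with the highest utility."""
    return max(candidates, key=utility)
```

The rule set says what to do in each listed situation, whereas the utility function only ranks states and leaves the choice of action to a search or optimization step. Which parts of the state to include and how to weight them remains the open design question the slide raises.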
  • 9. 9 Specifying goals (3/3) • Policies – (higher-level) descriptions of goals and constraints for operation – How to map to lower-level behavior? – Composition of multiple policies – What high-level language to use? • Turing-complete? • No widely used languages available today • Human operators used to explicit steering – Not used to indirect goal specification Autonomic management techniques - requirements • Robustness – Keep things working – Minimize oscillations or behavioral changes • Scalability – Internet-scale: millions of servers and networks, even more autonomic agents (50 billion devices?) • Adaptive to changing workloads – Some methods reliable for certain load patterns, but unstable once the load or system dynamics change • Performance – Need to make decisions fast enough to react timely – Optimal solutions vs. approximations • Simplicity – Key to adoption – Complex models vs. model-free? – Learning phase required before deployment?
  • 10. 10 Gradual transition to autonomic? 1. Collect and aggregate information – Input to human administrators’ decision-making 2. Decision-support systems suggesting possible actions by humans 3. Autonomic systems entrusted with lower-level decisions 4. Over time, less frequent and more high-level decisions by operator – Carried out by numerous autonomic actions at lower level The nature of the challenge
  • 11. 11 Capacity Planning is Hard Capacity Planning is Hard
  • 12. 12 Extreme scale • Enormous buildings with servers, storage equipment, networking, cooling • A factory for IT services
  • 13. 13 Extreme load variations Wikipedia: Michael Jackson’s wiki page at the time of his death and funeral service [Plot: Requests (0 to 1e+06) by Date, 23 May 2009 to 12 Sep 2009; series: Load, Collateral] Challenges for today’s Clouds • Extreme scale • Extreme load variations • Low level of determinism and predictability • No hard performance guarantees • Data centers consume a lot of energy • Data centers have low utilization Need for better resource management
  • 14. 14 Resource management challenge • Robustness & performance • Cost- & energy efficiency Approach Autonomic resource management based on control, analytics, learning, and optimization [Figure: MAPE loop (Monitor, Analyze, Plan, Execute) around shared Knowledge, connected to the sensors and actuators of the managed object] How much and what type of resources to allocate and when and where to deploy them?
  • 16. 16 Anomalies vs. bottlenecks O. Ibidunmoye, F. Hernandez-Rodriguez, and E. Elmroth. Performance Anomaly Detection and Bottleneck Identification, ACM Computing Surveys, Vol. 48, No. 1, Article no. 4, 2015. O. Ibidunmoye, A. Rezaie, and E. Elmroth. Adaptive Anomaly Detection in Performance Metric Streams. IEEE Transactions on Network and Service Management, Accepted, 2017. O. Ibidunmoye, E.B. Lakew, and E. Elmroth. A Black-box Approach for Detecting Systems Anomalies in Virtualized Environments. The 2017 International Conference on Cloud and Autonomic Computing (ICCAC 2017), IEEE Computer Society, Accepted, 2017. Datacenter Landscape Graphs and Coloring O. Ibidunmoye, T. Metsch, V. Bayon-Molino, E. Elmroth. Performance Anomaly Detection using Datacenter Landscape Graphs, IWQoS, 2016. T. Metsch, O. Ibidunmoye, V. Bayon-Molino, J. Butler, F. Hernández-Rodriguez, and E. Elmroth. "Apex Lake: A Framework for Enabling Smart Orchestration." In Proceedings of the Industrial Track of the 16th International Middleware Conference, paper 1, ACM, 2015.
  • 17. 17 www.cloudresearch.org Capacity autoscaling-Aspects of the problem We need to understand the workloads!
  • 18. 18 Workload Decomposition [Figure: four panels of Requests by Day (days 0 to 30), showing the workload as the sum Trend + Seasonality + Residuals] Wikipedia, January 2013, daily seasonality Sample control theoretic model G/G/N queue with variable N (#VMs) Horizontal Capacity Autoscaling A. Ali-Eldin, M. Kihl, J. Tordsson, and E. Elmroth. Efficient Provisioning of Bursty Scientific Workloads on the Cloud Using Adaptive Elasticity Control, In Proceedings of the 3rd Workshop on Scientific Cloud Computing (ScienceCloud 2012), ACM New York, pp. 31-40, 2012. A. Ali-Eldin, J. Tordsson, and E. Elmroth. An Adaptive Hybrid Elasticity Controller for Cloud Infrastructures, The 13th IEEE/IFIP Network Operations and Management Symposium (NOMS 2012), IEEE, pp. 204-212, 2012.
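An additive workload decomposition into trend, seasonality, and residuals can be sketched as follows. This is a minimal moving-average version and not necessarily the method used for the Wikipedia trace on the slide; the function name `decompose` and the choice of `period` (for example 24 for hourly samples with daily seasonality) are assumptions, and `period` must not exceed the series length.

```python
def decompose(series, period):
    """Additive decomposition: series[t] = trend[t] + seasonal[t] + residual[t]."""
    n = len(series)
    half = period // 2

    # Trend: moving average over a window that is clipped at the series edges.
    trend = []
    for t in range(n):
        window = series[max(0, t - half): min(n, t + half + 1)]
        trend.append(sum(window) / len(window))

    detrended = [series[t] - trend[t] for t in range(n)]

    # Seasonality: mean detrended value at each phase of the period.
    phase_means = []
    for phase in range(period):
        values = detrended[phase::period]
        phase_means.append(sum(values) / len(values))
    seasonal = [phase_means[t % period] for t in range(n)]

    # Residual: whatever trend and seasonality do not explain.
    residual = [series[t] - trend[t] - seasonal[t] for t in range(n)]
    return trend, seasonal, residual
```

A predictor can then extrapolate the trend and repeat the seasonal component, while the size of the residuals indicates how bursty, and therefore how hard to predict, the workload is.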
  • 19. 19 Proactive scaling for bursty workload Proactive scaling for strong seasonality
  • 20. 20 Several Autoscaling Methods + Auto selection A. Ali-Eldin, J. Tordsson, M. Kihl, and E. Elmroth. WAC: A Workload Analysis and Classification Tool for On-line Selection of Cloud Auto-scaling Methods, submitted. Controlling Average Response Time through Vertical Scaling E.B. Lakew, A.V. Papadopoulos, M. Maggio, C. Klein, and E. Elmroth. KPI-agnostic Control for Fine-Grained Vertical Elasticity. In Proceedings of The 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2017), pp. 589-598, 2017. E.B. Lakew, C. Klein, F. Hernandez-Rodriguez and E. Elmroth. Towards Faster Response Time Models for Vertical Elasticity. In The 6th Cloud Control Workshop, part of the Proceedings of the 2014 IEEE Conference on Utility and Cloud Computing (UCC 2014), pp. 560-565, 2014. A couple of seconds control interval
  • 21. 21 Controlling Tail Response Time Response Time Controller: f(tail response time) -> average response time Then Capacity Controller: f(average response time) -> capacity E.g., ensuring 95% of requests meet target response time Unifying CPU and Memory Control S. Farokhi, P. Jamshidi, E.B. Lakew, I. Brandic, and E. Elmroth. A Hybrid Cloud Controller for Vertical Memory Elasticity: A Control-theoretic Approach. Future Generation Computer Systems, Elsevier, Vol. 65, pp. 57-72, 2016. S. Farokhi, E.B. Lakew, C. Klein, I. Brandic, and E. Elmroth. Coordinating CPU and Memory Elasticity Controllers to Meet Service Response Time Constraints, The 2015 International Conference on Cloud and Autonomic Computing (ICCAC), IEEE Computer Society, pp. 69-80, 2015.
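The two-stage structure for tail response time (a response time controller that sets an average-response-time target, followed by a capacity controller that tracks it) can be sketched as a pair of cascaded proportional updates. The update rules, the gains, and the function names are assumptions for illustration; the controllers in the cited papers are more elaborate.

```python
def response_time_controller(avg_target, tail_measured, tail_target, gain=0.5):
    """Outer loop: f(tail response time) -> target for the average response time.

    If the measured tail (e.g. the 95th percentile) is slower than its target,
    the average target is tightened; if it is faster, the target is relaxed.
    """
    error = (tail_target - tail_measured) / tail_target  # negative when too slow
    return max(0.001, avg_target * (1.0 + gain * error))


def capacity_controller(capacity, avg_measured, avg_target, gain=0.5, min_capacity=0.1):
    """Inner loop: f(average response time) -> capacity (e.g. CPU cores).

    Capacity grows when the measured average exceeds its target and shrinks
    when there is headroom.
    """
    error = (avg_measured - avg_target) / avg_target
    return max(min_capacity, capacity * (1.0 + gain * error))
```

Each control interval would first call `response_time_controller` and then feed its output to `capacity_controller` as `avg_target`, so the tail objective is met indirectly through the more stable average signal.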
  • 22. 22 Autoscaler subsystems Core subsystems (required but pluggable for replacement). Metronome: drives the execution through periodic resize iterations and sets the new desired size on the cloudpool endpoint. Monitoring subsystem: a metric streamer collecting data from a metric store (such as OpenTSDB) and a system historian capturing monitoring and performance data from the autoscaler itself in a (configurable) metric store. Prediction subsystem: predicts the machine pool size needed. Cloudpool proxy: local proxy for sending commands to a remote cloudpool endpoint over the cloudpool REST API. Alerter: notifies the outside world about interesting events that are raised on the autoscaler's event bus. Supports additional add-on subsystems, e.g., for accounting or high availability. www.cloudresearch.org
  • 23. 23 VM placement
    • Map VMs to resources
      – After admission
      – After scaling
      – To reconsolidate
    • Across datacenters (Geo-placement)
      – e.g., linear programming problem
    • Within datacenter
      – Load mixing
      – Multi-dimensional multi-knapsack problem
    VM Geo-Placement Modeling (Cost Goals)
    Minimize (total cost):
    \[ TIP = H \ast \sum_{j=1}^{l} \sum_{k=1}^{m} P_{jk} \Big( \sum_{i=1}^{n} x_{ijk} \Big) \]
    Subject to (capacity constraints):
    \[ TIC = H \ast \sum_{j=1}^{l} C_{j} \Big( \sum_{i=1}^{n} \sum_{k=1}^{m} x_{ijk} \Big) \]
    \[ \sum_{i=1}^{n} ( \cdot_{i} \ast \cdot_{i} ) > \mathit{Threshold} \qquad (1) \]
    (the two factors indexed by i in (1) are not legible in the source)
    \[ \forall i \in [1..n] : \sum_{j=1}^{l} \sum_{k=1}^{m} x_{ijk} = 1 \qquad (2) \]
    and (load balance constraints):
    \[ \forall k \in [1..m] : LOC_{min} \le \Big( \sum_{i=1}^{n} \sum_{j=1}^{l} x_{ijk} \Big) / n \le LOC_{max} \qquad (3) \]
    W. Li, J. Tordsson, E. Elmroth. Modelling for Dynamic Cloud Scheduling via Migration of Virtual Machines, 2011 Third IEEE International Conference on Cloud Computing Technology and Science (Cloudcom 2011), IEEE Computer Society, pp. 163-171, 2011.
    D. Espling, L. Larsson, W. Li, J. Tordsson, and E. Elmroth. Modeling and Placement of Structured Cloud Services, IEEE Transactions on Cloud Computing, Vol. 4, No. 4, pp. 429-439, 2016.
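To make the geo-placement model concrete, the sketch below evaluates TIP and TIC for a candidate assignment and searches all assignments exhaustively. It is not the method of the cited papers: the slide suggests a linear programming formulation, whereas brute force is swapped in here only because it fits in a few lines and works for toy sizes. Treating constraint (1) as "TIC must exceed a threshold" is an assumption, since part of that constraint is illegible in the source. Indices follow the model: VM i, instance type j, cloud k.

```python
from itertools import product


def tip(assign, price, hours):
    """Total infrastructure price; assign[i] = (j, k) places VM i as type j in cloud k."""
    return hours * sum(price[j][k] for j, k in assign)


def tic(assign, capacity, hours):
    """Total infrastructure capacity, using the per-type capacity C_j."""
    return hours * sum(capacity[j] for j, _ in assign)


def load_balanced(assign, n_clouds, loc_min, loc_max):
    """Constraint (3): every cloud hosts a share of the VMs within [loc_min, loc_max]."""
    n = len(assign)
    for k in range(n_clouds):
        share = sum(1 for _, cloud in assign if cloud == k) / n
        if not loc_min <= share <= loc_max:
            return False
    return True


def cheapest_placement(n_vms, price, capacity, hours, threshold, loc_min, loc_max):
    """Exhaustive search; returns (cost, assignment) or None if nothing is feasible."""
    n_types, n_clouds = len(price), len(price[0])
    options = list(product(range(n_types), range(n_clouds)))
    best = None
    # Choosing exactly one (j, k) per VM satisfies constraint (2) by construction.
    for assign in product(options, repeat=n_vms):
        if tic(assign, capacity, hours) <= threshold:
            continue  # assumed reading of constraint (1): TIC > Threshold
        if not load_balanced(assign, n_clouds, loc_min, loc_max):
            continue
        cost = tip(assign, price, hours)
        if best is None or cost < best[0]:
            best = (cost, assign)
    return best


if __name__ == "__main__":
    price = [[1.0, 2.0], [3.0, 2.5]]  # price[j][k], made-up numbers
    capacity = [1, 4]                 # capacity[j], made-up numbers
    print(cheapest_placement(2, price, capacity, 1, 4, 0.5, 0.5))
```

The search space grows as (l·m)^n, so this is only usable for a handful of VMs; an LP or ILP solver would be the natural replacement for realistic sizes.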
  • 24. 24 Intra Datacenter Placement
    • Workload mixing (time & space)
    • Multi-dimensional, multi-knapsack
    • Application Specific
    • Heterogeneous hardware
    W. Li, J. Tordsson, and E. Elmroth. Virtual Machine Placement for Predictable and Time-Constrained Peak Loads. In Proceedings of the 8th International Workshop on Economics of Grids, Clouds, Systems, and Services (GECON 2011), Lecture Notes in Computer Science, Springer-Verlag, Vol. 7150, pp. 120-134, 2012.
    Relaxed box model virtualization
    For enhanced workload mixing (space)
    P. Svärd, J. Tordsson, B. Hudzia, E. Elmroth. Hecatonchire: Towards Multi-Host Virtual Machines by Server Disaggregation. In Euro-Par 2014: Parallel Processing Workshops, Lecture Notes in Computer Science, Vol. 8806, pp. 519-529, 2014.
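The multi-dimensional, multi-knapsack view of intra-datacenter placement can be illustrated with a first-fit-decreasing heuristic over two resource dimensions (CPU and memory). This is a generic textbook heuristic chosen for brevity, not the approach of the cited papers; the function name, the two-dimensional demand tuples, and the sort key are assumptions.

```python
def first_fit_decreasing(vms, hosts):
    """Place VMs on hosts without exceeding any resource dimension.

    vms:   {vm_name: (cpu, mem)} demands
    hosts: {host_name: (cpu, mem)} capacities
    Returns {vm_name: host_name or None}; None means the VM did not fit anywhere.
    """
    free = {host: list(cap) for host, cap in hosts.items()}
    placement = {}
    # Consider the largest VMs first (by summed demand) so they are not
    # squeezed out by many small ones.
    for vm in sorted(vms, key=lambda name: sum(vms[name]), reverse=True):
        demand = vms[vm]
        placement[vm] = None
        for host, remaining in free.items():
            # The VM fits only if every dimension fits.
            if all(d <= r for d, r in zip(demand, remaining)):
                for dim, d in enumerate(demand):
                    remaining[dim] -= d
                placement[vm] = host
                break
    return placement


if __name__ == "__main__":
    vms = {"a": (4, 8), "b": (2, 2), "c": (2, 6)}
    hosts = {"h1": (4, 8), "h2": (4, 8)}
    print(first_fit_decreasing(vms, hosts))
```

Workload mixing in time, application-specific behaviour, and heterogeneous hardware, all listed on the slide, are outside what this static heuristic captures.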
  • 25. 25 Decentralized Placement
    M. Sedaghat, F. Hernandez-Rodriguez, E. Elmroth, and G. Sarunas. Divide the Task, Multiply the Outcome: Cooperative VM Consolidation, In Proceedings of The 6th IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2014), pp. 300-305, 2014.
    M. Sedaghat, F. Hernandez-Rodriguez, and E. Elmroth. Autonomic Resource Allocation for Cloud Data Centers: A Peer to Peer Approach. The ACM Cloud and Autonomic Computing Conference (CAC'14), pp. 131-140, 2014.
    Replication control for fault tolerance
    • Multi-task jobs in presence of correlated failures
    • Ensure that specified number of tasks complete with certain probability
    • Both
      – #replicas
      – placement
    M. Sedaghat, E. Wadbro, J. Wilkes, S. De Luna, O. Seleznjev, and E. Elmroth. Die-Hard: Reliable Scheduling to Survive Correlated Failures in Cloud Data Centers, IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2016, pp. 52-59, 2016.
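The goal "a specified number of tasks complete with a certain probability" can be illustrated with a much simpler model than the one the slide addresses. The sketch below assumes that replicas fail independently with the same probability, which is exactly what correlated failures violate, so it only shows the shape of the calculation and ignores placement entirely. The function names and the uniform replica count per task are assumptions.

```python
from math import comb


def p_at_least(k, n, p_task):
    """Probability that at least k of n independent tasks complete."""
    return sum(
        comb(n, s) * p_task**s * (1.0 - p_task) ** (n - s)
        for s in range(k, n + 1)
    )


def min_replicas(n_tasks, k_required, fail_prob, target, max_replicas=20):
    """Smallest replica count per task that meets the target probability.

    Independence assumption: a task is lost only if all its replicas fail,
    so it completes with probability 1 - fail_prob**r.
    Returns None if max_replicas is not enough.
    """
    for r in range(1, max_replicas + 1):
        p_task = 1.0 - fail_prob**r
        if p_at_least(k_required, n_tasks, p_task) >= target:
            return r
    return None


if __name__ == "__main__":
    # All 10 tasks must complete with probability 0.99, replicas fail with 0.1.
    print(min_replicas(10, 10, 0.1, 0.99))
    # Relaxing the requirement to 8 of 10 tasks needs fewer replicas.
    print(min_replicas(10, 8, 0.1, 0.99))
```

Under correlated failures, replicas placed in the same failure domain tend to fail together, so the replica count alone stops being sufficient and placement has to be decided jointly, which is the point the slide makes with "Both".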
  • 26. 26 Live VM migration (without service interruption)
    Table: Pre, Post, and Hybrid migration compared on continuous service, resource usage, robustness, predictability, and transparency (cell marks not preserved).
    P. Svärd, S. Walsh, B. Hudzia, J. Tordsson, and E. Elmroth. Principles and Performance Characteristics of Algorithms for Live VM Migration. ACM Operating Systems Review, Vol. 49, No. 1, pp. 142-155, 2015.
    P. Svärd, B. Hudzia, J. Tordsson, and E. Elmroth. Evaluation of Delta Compression Techniques for Efficient Live Migration of Large Virtual Machines, ACM SIGPLAN Notices, Vol. 46, No. 7, ACM New York, NY, USA, pp. 111-120, 2011.
    Pre-copy migration
    Post-copy migration
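The iterative nature of pre-copy migration can be sketched with a simple fluid model: memory is copied while the VM keeps running, pages dirtied during a round are re-sent in the next one, and the VM is paused only for the final round. The constant dirty rate, constant bandwidth, stop threshold, and round cap are all simplifying assumptions, and the function name is hypothetical.

```python
def precopy_migration(mem_mb, dirty_mb_s, bw_mb_s, stop_mb, max_rounds=30):
    """Simulate pre-copy rounds.

    Returns (rounds, total_time_s, downtime_s), where downtime is the time
    needed to send the last dirty set while the VM is paused.
    """
    to_send = float(mem_mb)
    elapsed = 0.0
    rounds = 0
    # Keep copying while the VM runs, until the dirty set is small enough
    # or the round limit is hit (the latter when dirtying outpaces the link).
    while to_send > stop_mb and rounds < max_rounds:
        round_time = to_send / bw_mb_s
        elapsed += round_time
        # Pages dirtied during this round must be sent again.
        to_send = min(float(mem_mb), dirty_mb_s * round_time)
        rounds += 1
    downtime = to_send / bw_mb_s  # stop-and-copy phase
    return rounds, elapsed + downtime, downtime


if __name__ == "__main__":
    # 1 GiB VM, 64 MB/s dirty rate, 128 MB/s link, stop at 64 MB remaining.
    print(precopy_migration(1024, 64, 128, 64))
```

When the dirty rate approaches or exceeds the bandwidth the loop no longer converges and only the round cap ends it, which is one way to read the robustness and predictability rows in the comparison above.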
  • 27. 27 Energy-efficient management
    S. K. Tesfatsion, E. Wadbro, J. Tordsson, A Combined Frequency Scaling and Application Elasticity Approach for Energy-Efficient Clouds, IEEE Transactions on Cloud Computing, 2014.
    Z. Li, S. Tesfatsion, S. Bastani, A. Hassan, E. Elmroth, M. Kihl, and R. Ranjan, A Survey on Modeling Energy Consumption of Cloud Applications: Deconstruction, State of the Art, and Trade-off Debates. IEEE Transactions on Sustainable Computing, Accepted, 2017.
    Energy-efficient management: Performance-power trade-off
    [Figure 5. Achieved performance and power for four different policies (Frequency, VM, Core, Combined), with the Target shown for reference. Panels: (a) Performance, throughput (fps) vs. time (sec); (b) Power usage, power (watt) vs. time (sec).]
    [Figure: lower power savings (α = 0.1, γ = 0.9) vs. higher power savings (α = 0.9, γ = 0.1), with the Target shown for reference. Panels: (a) Performance, throughput (fps) vs. time (sec); (b) Power usage, power (watt) vs. time (sec).]
  • 28. 28 Dynamic Resource Rationing
    Where to cut when resources are insufficient? Two approaches:
    1. Strict QoS-level adherence
    2. Overall cost-benefit with QoS-level weights
    • Constrained optimization
    • Substantial dependency on KPI type (e.g., response time vs. throughput)
    • System feedback on KPI and dimmer effect
    • Ideally combined with brownout and self-driven capping
    E.B. Lakew, C. Klein, F. Hernandez-Rodriguez and E. Elmroth. Performance-Based Service Differentiation in Clouds, In Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2015), IEEE Computer Society, pp. 505-514, 2015.
    L. Tomas, E.B. Lakew, and E. Elmroth. Service Level and Performance Aware Dynamic Resource Allocation in Overbooked Data Centers, The 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2016), pp. 42-51, 2016.
    A.V. Papadopoulos, J. Krzywda, E. Elmroth, and M. Maggio. Power-Aware Cloud Brownout: response time and power consumption control, In Proceedings of the 56th IEEE Conference on Decision and Control, Accepted, 2017.
    M. Shahrad, C. Klein, L. Zheng, M. Chiang, E. Elmroth, and D. Wentzlaff. Incentivizing Self-Resource-Capping with Graceful Degradation, in Proceedings of the ACM Symposium on Cloud Computing 2017 (SoCC '17), Accepted, 2017.
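The difference between the two rationing approaches can be shown on a single scarce resource. The sketch below contrasts strict QoS-level ordering with a weighted share that is capped at each service's demand. The weighted variant is a simple water-filling rule standing in for the constrained optimization mentioned on the slide, and the service names, levels, and weights are made up for illustration.

```python
def strict_priority(demand, level, capacity):
    """Serve services fully in QoS-level order (lower level = more important)."""
    alloc = {}
    remaining = float(capacity)
    for service in sorted(demand, key=lambda s: level[s]):
        alloc[service] = min(float(demand[service]), remaining)
        remaining -= alloc[service]
    return alloc


def weighted_share(demand, weight, capacity):
    """Split capacity in proportion to weights, never exceeding a demand."""
    alloc = {service: 0.0 for service in demand}
    active = set(demand)
    remaining = float(capacity)
    while active and remaining > 1e-12:
        total_weight = sum(weight[s] for s in active)
        offer = {s: remaining * weight[s] / total_weight for s in active}
        # Services whose offer covers their demand are satisfied and removed;
        # what they do not need is redistributed in the next pass.
        capped = {s for s in active if offer[s] >= demand[s]}
        if not capped:
            for s in active:
                alloc[s] = offer[s]
            break
        for s in capped:
            alloc[s] = float(demand[s])
            remaining -= demand[s]
        active -= capped
    return alloc


if __name__ == "__main__":
    demand = {"gold": 4, "silver": 4, "bronze": 4}
    level = {"gold": 1, "silver": 2, "bronze": 3}
    weight = {"gold": 3, "silver": 2, "bronze": 1}
    print(strict_priority(demand, level, 6))
    print(weighted_share(demand, weight, 6))
```

With six units for twelve units of demand, strict ordering starves the lowest level completely, while the weighted split degrades every service to some degree, which is where KPI feedback and brownout-style dimming become relevant.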
  • 29. 29 Addressing mission-critical and Internet of Things applications
    Single datacenters and edge clouds
  • 30. 30 Single datacenters and edge clouds
    W. Tärneberg, A. Mehta, E. Wadbro, J. Tordsson, J. Eker, M. Kihl, and E. Elmroth. Dynamic Application Placement in the Telco-cloud, Future Generation Computer Systems, Elsevier, Vol. 70, pp. 163-177, 2017.
    J. Krzywda, W. Tärneberg, P-O. Östberg, M. Kihl, and E. Elmroth. Telco Clouds: Modelling and Simulation, Proceedings of the 5th International Conference on Cloud Computing and Services Science (CLOSER 2015), SCITEPRESS, pp. 597-609, 2015.
    A. Mehta, W. Tärneberg, C. Klein, J. Tordsson, M. Kihl, E. Elmroth. How beneficial are intermediate layer Data Centers in Mobile Edge Networks? In Foundations and Applications of Self* Systems (FAS* 2016), 2016.
    A. Mehta, R. Baddour, H. Gustafsson, F. Svensson, and E. Elmroth. Calvin Constrained - A Framework for IoT Applications in Heterogeneous Environments, The 37th IEEE International Conference on Distributed Computing (ICDCS 2017), pp. 1063-1073, 2017.
    Controlling end-user performance and network load
    Assume these are self-driving cars, supported by on-line traffic control
  • 31. 31 Controlling end-user performance and network load
    Future Computer Systems will be defined in Software, not in Hardware
  • 32. 32 Software-Defined Infrastructures
    • Massive scale disaggregated hardware
    • Dynamic definition (and redefinition) of virtual system
    • Arbitrarily large “imbalance” between virtual systems’ CPU-memory-network
    • Fewer constraints in resource management optimization
    • Higher density
    • Greater flexibility
    • Allows for easier programming models
    G. Goumas, K. Nikas, E.B. Lakew, C. Kotselidis, A. Attwood, E. Elmroth, M. Flouris, N. Foutris, J. Goodacre, D. Grohmann, V. Karakostas, P. Koutsourakis, M. Kersten, M. Lujàn, E. Rustad, J. Thomson, L. Tomás, A. Vesterkjaer, J. Webber, Y. Zhang, and N. Koziris. ACTiCLOUD: Enabling the Next Generation of Cloud Applications. The 37th IEEE International Conference on Distributed Computing (ICDCS 2017), pp. 1836-1845, 2017.
    Additional challenges for SDIs
    • All “traditional” resource allocation problems are still relevant
    • Vertical scaling can be performed on a much larger scale!
    • Enhanced by non-uniform performance characteristics
    • Additional resource management for applications’ virtual systems (VSys) after resources are assigned
    • Hide latencies, move compute to data or data to compute, trade-offs between performance and consistency
    • Feedback between VSys management and the outer SDI resource allocation
  • 33. 33 Proposal for SDI management
  • 34. This training material is part of the FogGuru project that has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 765452. The information and views set out in this material are those of the author(s) and do not necessarily reflect the official opinion of the European Union. Neither the European Union institutions and bodies nor any person acting on their behalf may be held responsible for the use which may be made of the information contained therein.