Cloud resource management is becoming increasingly challenging with the advent of hyperscale computing and the proliferation of heterogeneous hardware. Meanwhile, resource utilisation remains low, resulting in high energy consumption per executed instruction. This presentation by Prof. John Morrison proposes a self-organised approach to resource management to address these challenges.
This presentation was given at the CloudLightning Conference, held in conjunction with NC4 2017 at Dublin City University on 11th April 2017.
3. Dissemination Level: Public
http://cloudlightning.eu/
@_cloudlightning
Traditional Resource Management Scope
• Resource management here is confined to identifying and selecting resources to host the next task
• Many proprietary schemes (Google, Amazon, Azure) are not disclosed
• Work is informed by strategies adopted in open-source projects such as OpenStack
Resource Management
• Resource Allocation: identifying and selecting a resource to host the next task
• Referred to in the literature as Resource Scheduling
• Other aspects of Resource Life-Cycle Management are not considered here
OpenStack Filtering Options
More than 20 filters are currently available. These can be combined to meet complex requirements. They are categorized as follows:
Resource-based filters:
Decisions are made according to available resources: memory, disk and CPU cores. Examples are CoreFilter, DiskFilter and RamFilter.
Image-based filters:
Decisions are made according to image properties. Hosts are selected by CPU architecture, hypervisor and VM mode.
Host-based filters:
Decisions are made according to grouping criteria: location, availability zone or designated use.
Net-based filters:
Decisions are made according to host IP or subnet.
Custom filters:
User-defined custom filters.
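The filter categories above can be pictured as a chain of predicates that a host must pass in full. The following is a minimal sketch of that idea, not the OpenStack implementation: the `Host` and `Request` types, the function names and the sample hosts are all hypothetical, chosen only to mirror the resource-based, image-based and host-based categories on the slide.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Host:
    name: str
    free_vcpus: int
    free_ram_mb: int
    free_disk_gb: int
    arch: str  # CPU architecture offered by the host
    zone: str  # availability zone the host belongs to


@dataclass
class Request:
    vcpus: int
    ram_mb: int
    disk_gb: int
    arch: str
    zone: str


# A filter is any predicate over (host, request); hypothetical signature.
HostFilter = Callable[[Host, Request], bool]


def core_filter(host: Host, req: Request) -> bool:
    return host.free_vcpus >= req.vcpus        # resource-based


def ram_filter(host: Host, req: Request) -> bool:
    return host.free_ram_mb >= req.ram_mb      # resource-based


def disk_filter(host: Host, req: Request) -> bool:
    return host.free_disk_gb >= req.disk_gb    # resource-based


def arch_filter(host: Host, req: Request) -> bool:
    return host.arch == req.arch               # image-based


def zone_filter(host: Host, req: Request) -> bool:
    return host.zone == req.zone               # host-based


def filter_hosts(hosts: List[Host], req: Request,
                 filters: List[HostFilter]) -> List[Host]:
    """Keep only the hosts that pass every filter in the chain."""
    return [h for h in hosts if all(f(h, req) for f in filters)]


hosts = [
    Host("node-1", 8, 16384, 200, "x86_64", "zone-a"),
    Host("node-2", 2, 4096, 500, "x86_64", "zone-a"),
    Host("node-3", 16, 32768, 100, "aarch64", "zone-b"),
]
request = Request(vcpus=4, ram_mb=8192, disk_gb=50, arch="x86_64", zone="zone-a")
filters = [core_filter, ram_filter, disk_filter, arch_filter, zone_filter]
survivors = filter_hosts(hosts, request, filters)
print([h.name for h in survivors])
```

Every filter is evaluated against every host, so the work in this sketch grows with both the number of filters and the number of hosts.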
OpenStack Weighting
The LeastCost algorithm schedules instance creation on available hosts in the order of the weight assigned to each host.
The input is a set of objective functions, called 'cost functions'. These cost functions usually consider four parameters: CPU, RAM, disk storage and network bandwidth.
For each host, a combined weight is calculated over all cost functions:
∑(weight of cost function × score returned by that function for the host)
The host with the least cost is selected for provisioning.
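The weighted sum above can be illustrated with a short sketch. It assumes hypothetical cost functions that return the used fraction of a resource (so a lower score is better) and covers only two of the four parameters named on the slide; the `Host` type, the function names and the weights are illustrative and are not taken from OpenStack.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Host:
    name: str
    ram_used: float   # fraction of RAM in use, 0.0 to 1.0 (assumed metric)
    disk_used: float  # fraction of disk in use, 0.0 to 1.0 (assumed metric)


CostFunction = Callable[[Host], float]


def ram_cost(host: Host) -> float:
    return host.ram_used


def disk_cost(host: Host) -> float:
    return host.disk_used


def combined_cost(host: Host,
                  weighted_fns: List[Tuple[float, CostFunction]]) -> float:
    """Sum of (weight of cost function * score of that function for the host)."""
    return sum(weight * fn(host) for weight, fn in weighted_fns)


def least_cost_host(hosts: List[Host],
                    weighted_fns: List[Tuple[float, CostFunction]]) -> Host:
    """Select the host whose combined cost is smallest."""
    return min(hosts, key=lambda h: combined_cost(h, weighted_fns))


hosts = [
    Host("A", ram_used=0.5, disk_used=0.2),
    Host("B", ram_used=0.25, disk_used=0.8),
    Host("C", ram_used=0.75, disk_used=0.1),
]
# Hypothetical weights: RAM pressure counts twice as much as disk pressure.
weighted_fns = [(1.0, ram_cost), (0.5, disk_cost)]
chosen = least_cost_host(hosts, weighted_fns)
print(chosen.name)
```

With these sample values the combined costs are 0.6 for A, 0.65 for B and 0.8 for C, so host A is selected.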
Weighting
Each time the scheduler selects a host, it records the consumed resources, and subsequent selections are adjusted accordingly.
This is important because the weight is recomputed for each requested instance.
All weights are normalized before being summed, and the host with the least weight is given the highest priority.
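The interplay of normalization and recorded consumption can be sketched as follows. This is an illustration under stated assumptions rather than the scheduler's actual code: min-max normalization, the two metrics (free RAM and free vCPUs), the default weights and the dictionary layout of a host are all hypothetical.

```python
from typing import Dict, List


def normalize(values: List[float]) -> List[float]:
    """Min-max normalize to [0, 1]; all-equal inputs map to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def schedule(hosts: List[Dict], ram_mb: int, vcpus: int, count: int,
             ram_weight: float = 1.0, cpu_weight: float = 0.5) -> List[str]:
    """Place `count` identical instances, recomputing weights for each one."""
    placements = []
    for _ in range(count):
        candidates = [h for h in hosts
                      if h["free_ram_mb"] >= ram_mb and h["free_vcpus"] >= vcpus]
        if not candidates:
            break  # no host can take another instance
        # Negate free capacity so that more free capacity means lower cost.
        ram_costs = normalize([-h["free_ram_mb"] for h in candidates])
        cpu_costs = normalize([-h["free_vcpus"] for h in candidates])
        totals = [ram_weight * r + cpu_weight * c
                  for r, c in zip(ram_costs, cpu_costs)]
        best = candidates[totals.index(min(totals))]
        # Record the consumed resources so the next selection sees them.
        best["free_ram_mb"] -= ram_mb
        best["free_vcpus"] -= vcpus
        placements.append(best["name"])
    return placements


hosts = [
    {"name": "a", "free_ram_mb": 4096, "free_vcpus": 2},
    {"name": "b", "free_ram_mb": 3072, "free_vcpus": 8},
]
placements = schedule(hosts, ram_mb=2048, vcpus=1, count=3)
print(placements)
```

Because consumption is recorded after every selection, the three instances in this example are not all placed on the host that looked best at the start.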
Summary Characteristics of the OpenStack Filter Scheduling Strategy
• Sophisticated strategies
• Scheduling time increases with filter complexity
• Scheduling time increases with host population size
• Filtering options are determined by the application
• Weights are preconfigured into the scheduler, thus statically determining its behaviour
The Cloud isn’t what it used to be
It’s becoming a victim of its own success.
• Complexity and costs are rising
• Low server utilization
• Overprovisioning
• High energy costs
• More diverse and complex applications
• Heterogeneous resources
• Control theory tells us that centralized management is ineffective at scale
Perceived Objectives
Customer level objectives
• Make cloud computing more accessible
• Make cloud computing more efficient
• Move towards “ease of everything”
Provider level objectives
• Re-establish control over their IaaS offerings
• Facilitate better power management
• Enable fast resource provisioning for quicker service initiation
• Enable seamless exploitation of heterogeneous hardware
• Exploit faster and cheaper service delivery offered by hardware accelerators
• Employ different heterogeneous hardware types for different services or for different invocations of the same service
Resources
In a heterogeneous cloud, there will be many different types of compute resources.
These resources may be available individually or they may be bundled into subsystems.
Individual resources and subsystems may have pre-installed software stacks.
They may be physically located on interconnects with different characteristics.
CL-Resource is a generic term used to refer to any of the above.
CL-Resources can thus be bare metal, virtual machines, containers, networked commodity or specialized hardware, servers with accelerators such as GPUs, MICs and FPGAs, or pre-built HPC environments.
In response to a service request, the CL system identifies specific CL-Resources to be used for the delivery of that service.
Workflows of services may be implemented on a mixture of CL-Resources
– one resource type per service
Towards a Dynamic Filtering Mechanism
• Facilitate scalability with increasing host population size and filtering complexity
Hierarchically structure the host space to reflect host properties
– Hardware type, co-location on low-latency networks, proximity to storage servers, etc.
Decentralize Resource Management decisions
CloudLightning Layers
• pSwitches can transfer management of vRMs
• vRMs can transfer management of physical resources
• pRouters typically distinguish between different hardware types and, in that instance, cannot transfer management of underlying pSwitches
Entities in the same layer self-manage.
Under certain constraints, they are able to transfer management of lower-level components, and thus self-organize.
Re-organizing strategies are designed to maximize the functional requirements of the applications and the non-functional requirements of the system.
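The layering and the transfer rules above can be modelled with a small data-structure sketch. It is illustrative only: the class and function names are hypothetical, and the rule that pSwitches may move only between pRouters of the same hardware type is an assumed reading of the pRouter constraint on the slide, not a documented CloudLightning interface.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VRackManager:  # vRM: manages physical resources
    name: str
    resources: List[str] = field(default_factory=list)


@dataclass
class PSwitch:  # manages a group of vRMs
    name: str
    vrms: List[VRackManager] = field(default_factory=list)


@dataclass
class PRouter:  # assumed to be typed by hardware
    name: str
    hardware_type: str
    pswitches: List[PSwitch] = field(default_factory=list)


def transfer_resource(src: VRackManager, dst: VRackManager, resource: str) -> None:
    """A vRM hands management of one physical resource to a peer vRM."""
    src.resources.remove(resource)
    dst.resources.append(resource)


def transfer_vrm(src: PSwitch, dst: PSwitch, vrm: VRackManager) -> None:
    """A pSwitch hands management of one vRM to a peer pSwitch."""
    src.vrms.remove(vrm)
    dst.vrms.append(vrm)


def transfer_pswitch(src: PRouter, dst: PRouter, pswitch: PSwitch) -> bool:
    """Refuse the transfer when the pRouters stand for different hardware types."""
    if src.hardware_type != dst.hardware_type:
        return False
    src.pswitches.remove(pswitch)
    dst.pswitches.append(pswitch)
    return True


vrm1 = VRackManager("vrm-1", ["server-1", "server-2"])
vrm2 = VRackManager("vrm-2", ["server-3"])
sw1 = PSwitch("pSwitch-1", [vrm1, vrm2])
sw2 = PSwitch("pSwitch-2")
gpu_router = PRouter("pRouter-gpu", "gpu", [sw1, sw2])
fpga_router = PRouter("pRouter-fpga", "fpga")

transfer_resource(vrm1, vrm2, "server-2")   # vRM-level reorganisation
transfer_vrm(sw1, sw2, vrm2)                # pSwitch-level reorganisation
moved = transfer_pswitch(gpu_router, fpga_router, sw1)
print(moved, [v.name for v in sw2.vrms], vrm2.resources)
```

Each transfer touches only the two peers involved, which is what allows entities in one layer to reorganise without a central coordinator.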
vRack Manager Types and Groups
vRack Managers are typed to:
• reflect differences in the CL-Resources under their control
• constrain how vRack Manager Groups are formed and self-organized
• leverage resource-specific optimization opportunities resulting from grouping vRack Managers together
Towards a Dynamic Filtering Mechanism
• Allow weights to change dynamically to reflect evolving conditions and concerns
• Facilitate scalability with increasing host population size and filtering complexity
Hierarchically structure the host space to reflect host properties
– Hardware type, co-location on low-latency networks, proximity to storage servers, etc.
Decentralize Resource Management decisions
Dynamically calculate a fitness view (Perception) per locale and propagate upwards
Dynamically calculate a driving force (Impetus) per locale and propagate downwards
Define a Suitability Index per locale by combining local Perception and Impetus
Maximize the Suitability Index everywhere
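One way to picture the upward and downward propagation is the sketch below. The slides do not define how Perception is aggregated, how Impetus is distributed or how the two are combined, so everything here is an assumption made for illustration: Perception is averaged over child locales, the same Impetus vector is passed unchanged to every descendant, and the Suitability Index is their dot product. The `Locale` type and the metric names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Vector = Dict[str, float]


@dataclass
class Locale:
    name: str
    metrics: Optional[Vector] = None  # leaf only: local assessment scores
    children: List["Locale"] = field(default_factory=list)
    perception: Vector = field(default_factory=dict)
    impetus: Vector = field(default_factory=dict)


def propagate_perception(node: Locale) -> Vector:
    """Bottom-up pass: a leaf reports its metrics, a parent averages its children."""
    if not node.children:
        node.perception = dict(node.metrics or {})
    else:
        views = [propagate_perception(child) for child in node.children]
        node.perception = {
            key: sum(view[key] for view in views) / len(views)
            for key in views[0]
        }
    return node.perception


def propagate_impetus(node: Locale, impetus: Vector) -> None:
    """Top-down pass: hand the same driving-force weights to every locale."""
    node.impetus = dict(impetus)
    for child in node.children:
        propagate_impetus(child, impetus)


def suitability_index(node: Locale) -> float:
    """Assumed combination: dot product of local Impetus and Perception."""
    return sum(w * node.perception.get(key, 0.0)
               for key, w in node.impetus.items())


vrm1 = Locale("vrm-1", metrics={"energy_efficiency": 0.75, "performance": 0.5})
vrm2 = Locale("vrm-2", metrics={"energy_efficiency": 0.25, "performance": 1.0})
pswitch = Locale("pSwitch-1", children=[vrm1, vrm2])

propagate_perception(pswitch)
propagate_impetus(pswitch, {"energy_efficiency": 0.75, "performance": 0.25})
for locale in (pswitch, vrm1, vrm2):
    print(locale.name, suitability_index(locale))
```

Changing the Impetus vector at the top changes every locale's Suitability Index without touching the locally computed Perception. The sketch only computes the index and does not attempt the final step of maximizing it everywhere.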
[Figure: layered routing diagram. Levels, from top to bottom: Cell Manager; pRouter level, pSwitch level and vRM level (each routing by resource info and SI); Heterogeneous Resource Fabric (hardware). Diagram labels: Resource Request (with Functional and Non-Functional Requirements), Resource Aggregation, Perception (based on Weighted Assessment Functions), Impetus, "What’s available?", "What’s most suitable?", Resource Allocation (locally decided).]
• Impetus is a vehicle for Directed Evolution
• SI embodies non-functional requirements
– Energy efficiency, management overhead, performance, etc.
• Routing maximizes both the functional requirements of the application and the non-functional requirements of the system
• Routing allows for resource selection without reservation
– a common feature of decentralised selection algorithms
• Routing does not guarantee that the required resource is available immediately, but it is likely to be available after reorganisation.
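Routing by resource information and SI can be sketched as a greedy descent through the hierarchy. This is a hedged illustration, not the CloudLightning algorithm: the `Node` type, the precomputed `si` values and the rule "follow the child with the highest SI that offers the requested resource type" are assumptions. Nothing is reserved along the way, so the returned path names a candidate rather than a guaranteed allocation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    si: float                            # Suitability Index, assumed precomputed
    resource_type: Optional[str] = None  # set on leaves (vRMs) only
    children: List["Node"] = field(default_factory=list)


def offers(node: Node, rtype: str) -> bool:
    """Resource info: does this subtree contain the requested resource type?"""
    if not node.children:
        return node.resource_type == rtype
    return any(offers(child, rtype) for child in node.children)


def route(root: Node, rtype: str) -> Optional[List[str]]:
    """Descend level by level, taking the eligible child with the highest SI."""
    if not offers(root, rtype):
        return None
    node, path = root, [root.name]
    while node.children:
        candidates = [c for c in node.children if offers(c, rtype)]
        node = max(candidates, key=lambda c: c.si)
        path.append(node.name)
    return path  # selection only: no state is changed, nothing is reserved


cell = Node("cell-manager", 1.0, children=[
    Node("pRouter-gpu", 0.6, children=[
        Node("pSwitch-1", 0.5, children=[
            Node("vrm-a", 0.4, resource_type="gpu"),
            Node("vrm-b", 0.9, resource_type="gpu"),
        ]),
    ]),
    Node("pRouter-cpu", 0.8, children=[
        Node("pSwitch-2", 0.7, children=[
            Node("vrm-c", 0.7, resource_type="cpu"),
        ]),
    ]),
])
print(route(cell, "gpu"))
print(route(cell, "fpga"))
```

Each level decides using only its own children, so no component needs a global view of the host population.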
In Conclusion
We contend that Self-Organization facilitates:
• Separation of the concerns of the IaaS consumer and the CSP
• Creation of a service oriented architecture for the emerging heterogeneous cloud
• Control over energy consumption by improved IaaS management
• Improved service delivery
• Leveraging of heterogeneity
• Bringing of HPC to the cloud
• Resource management in hyper-scale cloud deployments
CloudLightning Approach
• CloudLightning proposes a novel architecture for provisioning heterogeneous cloud resources to deliver services, specified by the user, using a bespoke service description language.
01 Complexity: CloudLightning uses self-organisation and self-management to manage complexity effectively.
02 Heterogeneous Resources: CloudLightning was designed specifically for heterogeneous hardware.
03 IaaS Access: Clear service interface through separation of concerns between consumer and provider.
04 Energy Efficiency: Achieved through heterogeneous resources, reducing overprovisioning, maximising VM/server density and turning off idle servers.
05 Resource Utilisation: CloudLightning uses dynamic workload and resource management to increase the efficiency of resource utilisation.
06 Service Deployment: The CloudLightning deployment mechanism simplifies the operational overhead for non-technical users.