My presentation for Annual Award at Mathematical Institute of the Serbian
Academy of Sciences and Arts in the field of computing for PhD. students.
Cloud computing is facing some serious latency issues due to huge volumes of data that need to be transferred from the place where data is generated to the cloud. For some types of applications, this is not acceptable.
One of the possible solutions to this problem is the idea to bring cloud services closer to the edge of the network, where data origi- nates. This idea is called edge computing, and it is advertised that it dramatically reduces the network latency as a bridge that links the users and the clouds, and as such, it makes the foundation for future interconnected applications.
Edge computing is a relatively new area of research and still faces many challenges like geo-organization and a clear separation of concerns, but also remote configuration, well defined native applications model, and limited node capacity. Because of these issues, edge computing is hard to be offered as a service for future real-time user-centric applications.
This thesis presents the dynamic organization of geo-distributed edge nodes into micro data-centers and forming micro-clouds to cover any arbitrary area and expand capacity, availability, and reliability. We use a cloud organization as an influence with adaptations for a different environment with a clear separation of concerns, and native applications model that can leverage the newly formed system.
We argue that the presented model can be integrated into existing solutions or used as a base for the development of future systems.
Furthermore, we give a clear separation of concerns for the proposed model. With the separation of concerns setup, edge-native applications model, and a unified node organization, we are moving towards the idea of edge computing as a service, like any other utility in cloud computing.
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Dynamic formation of distributed micro clouds
1. Dynamic formation
of the distributed µ clouds
Miloš Simić
— Ph. D. Thesis —
University of Novi Sad
Faculty of Technical Sciences
2. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Table of contents
I Research questions
I Cloud computing infrastructure and its problems
I Distributed clouds and micro clouds model
I Conclusion
1 / 26
3. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
List of publications
I Papers in the journals from the SCI list directly related to the thesis:
1. Simić M., Prokić I., Dedeić J., G. Sladić and Milosavljević B., ”Towards Edge
Computing as a Service: Dynamic Formation of the Micro Data-Centers,” in IEEE
Access, vol. 9, pp. 114468-114484, 2021, doi: 10.1109/ACCESS.2021.3104475
– 8 citations as per Google Scholar
2. Simić, M.; Sladić, G.; Zarić, M.; Markoski, B. Infrastructure as Software in Micro
Clouds at the Edge. Sensors 2021, 21, 7001 https://doi.org/10.3390/s21217001
– 1 citation as per Google Scholar
2 / 26
4. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Background
I The entire research in the domain of three main components:
1. Distributed systems and its applications:
I Cloud computing
I Big Data
I Service-oriented architectures
2. Infrastructure as software and its applications:
I Automation
I Infrastructure abstraction
3. Containers and their possibilities in software systems
I They are crucial for understanding the concept of distributed clouds – µ clouds
3 / 26
5. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Cloud computing
I Vogels et al. describe cloud computing as an aggregation of computing
resources as a utility, and software as a service
I Clouds provide services, clients use them to avoid huge investments
– creating and maintaining big data centers (expensive)
I Clients only pay for their usage time – pay as you go model
I Centralization of resources – economy of scale, lower the administration costs
I BUT data must to be moved to the cloud – introduces high latency
I Infrastructure elements: nodes, clusters, regions, availability zones
I Services are built on top, to fully utilize infrastructure – scalability and resilience
4 / 26
6. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Cloud computing problems
I Possible application requirements: real-time processing, sensitive data restrictions,
functional dependencies on cloud availability
– pushed to its limits, centralization brings more harm than good
5 / 26
7. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Distributed clouds
I The cloud is usually far away from end
devices – data centers are built on
specific locations
I Sparse deployment lead to high
latency
I Latency-sensitive applications will
have a hard time
I Relax the cloud, and move some tasks
from the cloud
I Computation and storage
in proximity to data sources
I Organize nodes locally – extend
resources beyond the single node
or group of nodes
6 / 26
8. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Research Questions
1. Can geo-distributed nodes be organized in a similar way to the cloud,
but adopted for the different environment, with clear separation of concerns
and familiar applications model for users forming the micro cloud model?
2. Can these organized nodes, micro clouds, be offered as a service
based on the pay as you go model, to the developers and researchers
so that they can develop new human-centered applications?
3. Can a model be created in such a way that it is formally correct,
easy to extend, understand and reason about?
7 / 26
9. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Related work
I Platform models:
I Kubernetes – an (updated) open-source variant of Google orchestrator Borg
I Nebula
I OpenStack
I Nodes organization:
I Zone-based organization
I Micro data centers – serve nearby population (theory)
I Nano data centers – serve single household (using SDN)
I Drop computing – ad hoc formation
I Processing (Big Data):
I Data locality – moving computation to the data, instead of vice versa (cloud)
I Lambda architectures
I Infrastructure definition as code
8 / 26
10. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
State of the art problems
I The current state of the art lack dynamic organization and (re)configuration
support for geo-distributed pools of resources serving local requests first
I The existing orchestrator engines operate on a cluster level
– ”treating servers as cattle, not pets” (i.e. numerous servers built using
automated tools designed for failure, where no server is irreplaceable)
I Kubernetes allows multi-cluster deployments, handling these clusters as disposable
– ”treating clusters as cattle, not pets” (i.e., numerous clusters built using
automated tools designed for failure, where no cluster is irreplaceable)
9 / 26
11. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Proposal
I Rely on a similar proven strategy – adapt the existing model for different use-cases
I The cloud architecture is separated into building blocks
making the system easier to understand, maintain and operate – use that
I Let’s observe micro clouds as geo-distributed systems with dispersed users
I Geo-distribution means in proximity to some large populations,
where micro clouds serve user requests locally first
I Forming micro clouds depends on local population usage and needs – adaptive
I Reduce traditional data centers fixed costs with agility – grow and shrink
resources as needed (dynamically)
10 / 26
12. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Organization of the nodes
I A group of nodes virtually and/or geographically separated,
working together to provide the same service to clients, forms a cluster
I A concept used to describe a set of clusters (potentially) scattered
over an arbitrary geographic region, is a region
I Highest logical concept that is composed of a minimum of one region
and could span over multiple regions is a topology
I To lower the latency, vast distances between clusters, regions and topologies
should be strongly avoided in normal circumstances!
Edge centric computing Cloud computing
Topology (logical) Cloud provider (logical)
Region (logical) Region (physical)
Cluster (physical) Zone (physical)
11 / 26
13. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Separation of concerns
Cloud
Cluster 1 Cluster n
Region 1
***
Topology 1
Cluster 1 Cluster n
Region n
***
Cluster 1 Cluster n
Region 1
Topology n
Cluster 1 Cluster n
Region n
***
***
Core
ECC
Clients
Services
Devices
Resources
*** *** ***
***
(Edge-centric computing as a service architecture with separation of concerns.)
12 / 26
14. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Micro clouds
I Ephemeral cloud-like structures
serving local requests first, before
reaching to the traditional cloud
I Size and existence are defined by the
local population’s needs
I Clients have the illusion to
communicate with the clouds –
hidden complexity
I Designed for failure using automated
tools
I Collaborate with traditional clouds –
allows biologically inspired computing
Tier 1 - clients
Tier 2 - micro clouds
Tier 3 - clouds
More
Less
Slow
Fast
Resources
Response
time 13 / 26
15. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Who can join the system?
I To be a part of the system, a node must satisfy four simple rules:
1. Run an operating system with a file system
2. Be able to run some application isolation tool (e.g. container or unikernel engine)
3. Have available resources for utilization
4. Have internet connection
I Simple yet powerful rules are here to help – increased demand for resources
that the currently available infrastructure cannot support
I Inclusion of volunteer nodes into the system can be allowed
to depreciate load for an indefinite period of time
14 / 26
16. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
But what is a node?
I Nodes can be heterogeneous by their nature
I Formally, every node in the system could be described as a tuple
si = (L, R, A, I)
I L is a set of ordered key-value pairs representing node labels
or server-specific features (e.g., os:linux, arch:arm, isolation:containers, etc.)
I R is representing node resources (e.g. kind, total, free, used, etc.)
I A is representing running applications (e.g. labels, resources, configrations, etc.)
I I represents a set of general node informations (e.g. name, location, IP address, id,
etc.)
15 / 26
17. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Formation of micro clouds and the protocols
I Nodes are organized into micro clouds dynamically, by abstracting
infrastructure to the level of software – infrastructure as software
I The proposed system relies on four protocols:
1. healthcheck protocol informs the system about the state of every node –
background operation (message order not important)
2. cluster formation protocol forms new clusters – mutate operation (atomic,
immutable, and idempotent)
3. Idempotency check protocol prevent mutate operation if infrastructure already
exists
4. list detail protocol shows the current state of the system to the user – list operation
(dashboards and tunable level of details)
16 / 26
18. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
How to prove this?
If there is a bug in a distributed algorithm,
no matter how improbable it may seem,
it’s not a question of whether it will appear,
it’s a question of when it will appear.
(Leslie Lamport, Heidelberg Laureate Forum 2021,
https://www.youtube.com/watch?v=KVs3YFKqclU)
(Leslie Lamport, Turing award amongst
others)
17 / 26
19. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Formally specifying our protocols
I Formal analysis
I Multiparty asynchronous session types
– too rigorous
I Explicit connection actions in
multiparty session types
I We were searching for:
I a formalism that is proven correct
I and expressive enough
I but also easy to follow
(twitter.com/heidiann360/status/1332711011451867139)
I Joint work with colleagues from Chair of Mathematics FTN
18 / 26
20. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Proof of concept implementation – enter constellations
I We see two possible scenarios for
implementation:
I a stand-alone implementation –
using open source tools
I integration within existing tools,
as a node organizer and register (e.g.
Kubernetes node organizator)
I Do not lock users in specific
provider ecosystem
Gateway
Queue
State service
events
Nodes service
events
Command push
service
rpc
Authentication &
Authorization Service
rpc rpc
rpc
rpc
rpc
rpc
rpc
Log service
log events
rpc
http/s
log events
CLI
(Proof of concept implementation – constellations
project (c12s))
19 / 26
21. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Remote formation is challenging
I Infrastructure deployment will not happen overnight – key is to simplify
remote micro clouds management
I The infrastructure needs to be constantly deployed and maintained –
prevent configuration drift (the systems become different over time)
I Abstract infrastructure to the software level – available tools, principles, and
techniques
I Micro clouds are adaptable infrastructure model
I Such elasticity requires software level infrastructure abstraction – infrastructure
programming
I Infrastructure definition is versioned, automated, applied repeatedly, and
consistently every time – minimize configuration drift
20 / 26
22. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
The reconciler pattern
I System is dealing with changes using
the reconciler pattern
I Track resources using two simple
states:
1. expected state – desired state
2. current state – actual state
I Reconciliation loop ensures
that the desired state is maintained
I Every node provides current state –
healthcheck protocol
I Extension is simple
Cloud
Services
State
Reconcile loop
DevOps/SREs
Interacts
Free nodes
Micro clouds
Clients
Interacts
health-check
registration
health-check
commands
state
coordination
21 / 26
23. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Challenge the state of the art
I This research goes one step further, allowing the creation of numerous micro
clouds as disposable, using automated tools where no micro cloud is irreplaceable
— ”treating micro clouds as cattle, not pets” (i.e. numerous micro clouds built
using automated tools designed for failure, where no micro cloud is irreplaceable)
I Allowing the dynamic and descriptive organization pools of resources
in form of disposable geo-distributed micro clouds in proximity to users
I More dimensions for operation and optimization of infrastructure and services
I Allowing IaaS, CaaS, PaaS, SaaS models closer to the data sources
I Users get a unique ability to dynamically and selectively control the
information sent to the cloud
22 / 26
24. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Use cases
Cluster C
Region A [COVID 19 affected]
Cluster A
Cluster B
Cluster D
Topology A
Vehicle tracking, routing and
Patient
services
recognition frontend services
monitoring
alerting
prediction
frontend
Region B [Not affected]
Cluster A
Utility
frontend services
General purpose
frontend services
Cluster C
Power grid
frontend services
Power grid
frontend services
Cluster B
Utility
frontend services
Cluster D
Utility
frontend services
Unutilized
Nodes
Volunteer
Nodes
Backend
Services
Sync important
transfer patient
Micro cloud scope
Traditional cloud scope
depersonalized
Target, prioritize,
and route
[local]
[global]
data in real-time
data
data
Medical personal
Research
Patients
alerting, predictions
real-time data
and warnings
Patients
historical data
depersonalized
data extraction
Ingraft
and transfer to
typical cloud apps
23 / 26
25. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Concluding remarks
The dynamic organization of geo-distributed nodes into micro clouds, forming
distributed clouds
I Nodes are organized into micro clouds dynamically – inspired by the cloud
architecture
I Creation of numerous micro clouds designed for failure using automated tools
where no micro cloud is irreplaceable – ”treating micro clouds as cattle, not
pets.”
I Serving local requests first, and from optimal location
I Abstracted infrastructure to the level of software – infrastructure as software
I Moving computation to the data, instead of vice versa
”Containers are great, let’s run them everywhere.” – Brian Dorsey, Google Cloud Platform.
24 / 26
26. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Work recognition
I Invited speaker courtesy of professor Nobuko Yoshida University of Oxford
(Imperial Collage at the presentation time) – present usage of multiparty
asynchronous session types in new research area
I Invited speaker courtesy of professor Michel Kinsy Arizona State University –
present adaptive infrastrucutres
I All thesis results will be directly applied to Horizon Europe TaRDIS Project
I TaRDIS focuses on supporting the correct and efficient development of applications
for swarms and decentralised distributed systems, by combining a novel programming
paradigm with a toolbox for supporting the development and execution of
applications.
25 / 26
27. Intro Thesis background Distributed clouds model Distributed clouds as software Conclusion
Questions?
Thank you for your attention!
26 / 26