TRAINING THE NEXT GENERATION OF EUROPEAN FOG COMPUTING EXPERTS
Container orchestration in geo-distributed cloud computing platforms
Keynote at HotCloudPerf
April 20th 2021
Mulugeta Ayalew Tamiru, Guillaume Pierre, Johan Tordsson and Erik Elmroth
Elastisys AB & Université de Rennes 1
Geo-distributed cloud platforms
▪ Fault tolerance
▪ Proximity
▪ Resource aggregation
▪ Regulatory compliance
Goal: reliably deploy software across the full platform
▪ Containers everywhere
• To abstract away the heterogeneity of the host hardware and hypervisors
▪ Deploy potentially large numbers of containers
• If necessary: burst to a public cloud
▪ Control container placements
• Manually
• Semi-automatically: “as close as possible to X”
• Automatically: load-balanced across all locations
Kubernetes Federation (KubeFed)
▪ Resource management and application deployment on multiple Kubernetes clusters (member clusters) from a single control plane (host cluster)
▪ BUT: KubeFed was not specifically designed for worldwide geo-distribution
Experimental setup
▪ 1 host cluster and 5 member clusters with Kubernetes 1.14
▪ Each cluster with a master and five worker nodes
▪ Host cluster nodes: 4 vCPUs, 16 GB RAM
▪ Member cluster nodes: 4 vCPUs, 4 GB RAM
▪ Simple nginx web server app
Problem -- Instability
[Figure: stability]
Impact of network configuration on stability
Table: Average number of timeout errors per minute (N) and stability (υ) of the uncontrolled system for the three evaluation scenarios.
Figure annotations: network delay / packet loss rate increased; cluster failure; network delay / packet loss rate restored; cluster restored.
KubeFed configuration parameters
Parameter                                 Default
Cluster Available Delay                   20s
Cluster Unavailable Delay                 60s
Leader Elect Lease Duration               15s
Leader Elect Renew Deadline               10s
Leader Elect Retry Period                 5s
Cluster Health Check Timeout              3s
Cluster Health Check Period               10s
Cluster Health Check Failure Threshold    3
Stability vs. failure detection delay
Solution -- Controller to adjust the Cluster Health Check Timeout (CHCT) at run-time
Results -- Stationary scenario
Results -- Network variability scenario
Figure annotations: network delay / packet loss rate increased; network delay / packet loss rate restored.
Results -- Cluster failure scenario
Figure annotations: cluster failure; cluster restored.
(Temporary) conclusion
▪ We observe significant instability in KubeFed-based geo-distributed fog platforms due to:
• poor network conditions
• default / static configuration parameters
▪ We designed a proportional controller to adjust CHCT at run-time
• Improves the system stability from 83–92% with no controller to 99.5–100% using the controller
Mulugeta Tamiru, Guillaume Pierre, Johan Tordsson, Erik Elmroth. Instability in Geo-Distributed Kubernetes Federation:
Causes and Mitigation. In Proceedings of IEEE MASCOTS, Nov 2020.
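A proportional controller of this kind can be sketched as follows. This is a minimal illustration, not the controller from the paper: the error signal (observed timeout errors per minute), the setpoint, the gain and the upper bound are all assumptions chosen for the example. Only the 3 s starting value comes from the KubeFed defaults listed earlier.

```python
from dataclasses import dataclass


@dataclass
class ProportionalCHCTController:
    """Illustrative P-controller for the Cluster Health Check Timeout (seconds).

    Every numeric value except the 3 s default is an assumption of this sketch.
    """

    target_errors_per_min: float = 0.0  # assumed setpoint: no timeout errors
    gain: float = 0.5                   # assumed gain, seconds per (error/min)
    min_timeout: float = 3.0            # KubeFed default CHCT
    max_timeout: float = 30.0           # assumed upper bound

    def next_timeout(self, observed_errors_per_min: float) -> float:
        # Proportional term: the correction grows linearly with the error.
        error = observed_errors_per_min - self.target_errors_per_min
        timeout = self.min_timeout + self.gain * error
        # Clamp so the timeout never drops below the default or grows unbounded.
        return max(self.min_timeout, min(self.max_timeout, timeout))


controller = ProportionalCHCTController()
print(controller.next_timeout(observed_errors_per_min=4.0))
```

A larger timeout tolerates slow links at the cost of slower failure detection, which is the trade-off shown in the "Stability vs. failure detection delay" slide.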
Now that we fixed the instability problem, is KubeFed ready
to manage large-scale geo-distributed platforms?
Not quite: in KubeFed, any deployment request is pushed to the requested cluster regardless of the resource availability in that cluster.
Let’s replay 1 hour of the Google cluster trace, assigning each job to one of 5 clusters according to a binomial distribution:
▪ 3 overloaded clusters
▪ 2 mostly idle clusters
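One way to obtain such a skewed assignment is sketched below. The slides do not give the distribution's parameters, so the choice of Binomial(n=4, p=0.5) over cluster indices 0 to 4, the seed, and the function names are assumptions for illustration.

```python
import random
from collections import Counter

NUM_CLUSTERS = 5


def pick_cluster(rng: random.Random, p: float = 0.5) -> int:
    """Draw a cluster index in [0, 4] from Binomial(n=4, p)."""
    # A binomial draw is the number of successes in n independent trials.
    return sum(rng.random() < p for _ in range(NUM_CLUSTERS - 1))


def distribute(num_jobs: int, seed: int = 0, p: float = 0.5) -> Counter:
    """Count how many jobs land on each cluster index."""
    rng = random.Random(seed)
    return Counter(pick_cluster(rng, p) for _ in range(num_jobs))


counts = distribute(10_000)
for cluster in range(NUM_CLUSTERS):
    print(cluster, counts[cluster])
```

With p = 0.5 the indices 1, 2 and 3 receive 14/16 of the jobs in expectation, while indices 0 and 4 receive 1/16 each, which yields three heavily loaded clusters and two lightly loaded ones.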
Problems to address
▪ Make sure applications are not deployed in overloaded clusters
• Even if this requires choosing another cluster automatically…
▪ Support application autoscaling in multi-cluster environments
• Vary the number of replicas within a single cluster…
• … or across multiple clusters
▪ Allow the system to burst out to a public cloud in case of resource overload
• And retract public-cloud resources as early as possible
▪ Seamlessly integrate in existing KubeFed platforms
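The first requirement above could be sketched as follows. The function name, the data layout (free CPU cores per cluster) and the fallback rule (pick the cluster with the most free CPU) are hypothetical and are not taken from mck8s.

```python
from typing import Dict, Optional


def choose_cluster(requested_cpu: float,
                   preferred: str,
                   free_cpu: Dict[str, float]) -> Optional[str]:
    """Pick a cluster that can host the request.

    Returns the preferred cluster if it has enough free CPU, otherwise the
    fitting cluster with the most free CPU. None means no cluster fits, which
    is the point where bursting to a public cloud would be considered.
    """
    if free_cpu.get(preferred, 0.0) >= requested_cpu:
        return preferred
    candidates = {name: cpu for name, cpu in free_cpu.items()
                  if cpu >= requested_cpu}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical cluster names and capacities.
free = {"cluster-a": 0.5, "cluster-b": 3.0, "cluster-c": 2.0}
print(choose_cluster(1.0, "cluster-a", free))
```

Contrast this with the KubeFed behaviour described earlier, where the request would be pushed to the requested cluster even though it lacks capacity.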
Deploy mcd-app-1 across the two clusters that receive the most network traffic
Make sure end-user requests are distributed across both clusters
Autoscale the application deployment
to maintain reasonable CPU usage
Dynamically provision more resources
from the public cloud if necessary
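For a single cluster, the standard Kubernetes Horizontal Pod Autoscaler computes the desired replica count as ceil(currentReplicas × currentMetric / desiredMetric). The sketch below applies that rule, ignoring the autoscaler's tolerance band and stabilisation windows, and then splits the replicas between a local cluster and a public cloud when local capacity runs out. The capacity model and the function names are assumptions for illustration, not the mck8s implementation.

```python
import math
from typing import Tuple


def desired_replicas(current_replicas: int,
                     current_cpu_percent: float,
                     target_cpu_percent: float) -> int:
    """Replica count that brings average CPU usage back to the target."""
    ratio = current_cpu_percent / target_cpu_percent
    return max(1, math.ceil(current_replicas * ratio))


def split_with_burst(replicas: int, local_capacity: int) -> Tuple[int, int]:
    """Return (local replicas, public-cloud replicas).

    local_capacity is a hypothetical count of replicas the local cluster
    can still host; anything beyond it bursts to the public cloud.
    """
    local = min(replicas, local_capacity)
    return local, replicas - local


wanted = desired_replicas(current_replicas=4,
                          current_cpu_percent=90,
                          target_cpu_percent=60)
print(wanted, split_with_burst(wanted, local_capacity=4))
```

When the load drops and the desired count falls back within local capacity, the second element returns to zero, which corresponds to retracting public-cloud resources as early as possible.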
Conclusion
Geo-distributed Kubernetes federations are now:
▪ Stable
▪ Resource availability aware
▪ Network traffic and network latency aware
▪ Burstable between available clusters, and to the public cloud
mck8s is available: https://github.com/moule3053/mck8s
Mulugeta Tamiru, Guillaume Pierre, Johan Tordsson, Erik Elmroth. mck8s: an orchestration platform for geo-distributed
multi-cluster environments. In Proceedings of ICCCN, Jul 2021.
The FogGuru project has received funding from the European Union’s
Horizon 2020 research and innovation programme under the Marie
Skłodowska-Curie grant 765452.
www.fogguru.eu
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
 
The architecture of Generative AI for enterprises.pdf
The architecture of Generative AI for enterprises.pdfThe architecture of Generative AI for enterprises.pdf
The architecture of Generative AI for enterprises.pdf
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
 
Introduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG EvaluationIntroduction to Open Source RAG and RAG Evaluation
Introduction to Open Source RAG and RAG Evaluation
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi IbrahimzadeFree and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
 
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Agentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdfAgentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdf
 

Container orchestration in geo-distributed cloud computing platforms

  • 1. TRAINING THE NEXT GENERATION OF EUROPEAN FOG COMPUTING EXPERTS
       Container orchestration in geo-distributed cloud computing platforms
       Keynote at HotCloudPerf, April 20th 2021
       Mulugeta Ayalew Tamiru, Guillaume Pierre, Johan Tordsson and Erik Elmroth
       Elastisys AB & Université de Rennes 1
  • 2. Geo-distributed cloud platforms
       ▪ Fault tolerance
       ▪ Proximity
       ▪ Resource aggregation
       ▪ Regulatory compliance
  • 3. Goal: reliably deploy software across the full platform
       ▪ Containers everywhere
         • To abstract away the heterogeneity of the host hardware and hypervisors
       ▪ Deploy potentially large numbers of containers
         • If necessary: burst to a public cloud
       ▪ Control container placements
         • Manually
         • Semi-automatically: “as close as possible to X”
         • Automatically: load-balanced across all locations
  • 4. Kubernetes Federation (KubeFed)
       ▪ Resource management and application deployment on multiple Kubernetes clusters (member clusters) from a single control plane (host cluster)
       ▪ BUT: KubeFed was not specifically designed for worldwide geo-distribution
  • 5. Experimental setup
       ▪ 1 host cluster and 5 member clusters with Kubernetes 1.14
       ▪ Each cluster with a master and five worker nodes
       ▪ Host cluster nodes: 4 vCPUs, 16 GB RAM
       ▪ Member cluster nodes: 4 vCPUs, 4 GB RAM
       ▪ Simple nginx web server app
  • 7. Impact of network configuration on stability
       [Table: average number of timeout errors per minute (N) and stability (υ) of the uncontrolled system for the three evaluation scenarios]
       [Figure annotations: network delay / packet loss rate increased; cluster failure; network delay / packet loss rate restored; cluster restored]
  • 8. KubeFed configuration parameters
       Parameter                                 Default
       Cluster Available Delay                   20s
       Cluster Unavailable Delay                 60s
       Leader Elect Lease Duration               15s
       Leader Elect Renew Deadline               10s
       Leader Elect Retry Period                 5s
       Cluster Health Check Timeout              3s
       Cluster Health Check Period               10s
       Cluster Health Check Failure Threshold    3
  • 9. Stability vs. failure detection delay
  • 10. Solution -- Controller to adjust the Cluster Health Check Timeout (CHCT) at run-time
  • 11. Results -- Stationary scenario
  • 12. Results -- Network variability scenario
       [Figure annotations: network delay / packet loss rate increased; network delay / packet loss rate restored]
  • 13. Results -- Cluster failure scenario
       [Figure annotations: cluster failure; cluster restored]
  • 14. (Temporary) conclusion
       ▪ We observe significant instability in KubeFed-based geo-distributed fog platforms due to:
         • poor network conditions
         • default / static configuration parameters
       ▪ We designed a proportional controller to adjust CHCT at run-time
         • Improves the system stability from 83–92% with no controller to 99.5–100% using the controller
       Mulugeta Tamiru, Guillaume Pierre, Johan Tordsson, Erik Elmroth. Instability in Geo-Distributed Kubernetes Federation: Causes and Mitigation. In Proceedings of IEEE MASCOTS, Nov 2020.
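The proportional controller named on slide 14 could look roughly like the sketch below. It is a textbook proportional law (output = bias + gain × error), not the authors' implementation: the measured signal (health-check timeout errors per minute), the setpoint, the gain and the upper bound are all assumptions chosen for illustration. Only the 3s floor comes from the KubeFed default on slide 8.

```python
from dataclasses import dataclass


@dataclass
class ChctController:
    """Hypothetical proportional controller for the Cluster Health Check Timeout."""

    setpoint: float = 0.0    # target timeout errors per minute (assumed)
    gain: float = 0.5        # seconds of CHCT added per unit of error (assumed)
    base_chct: float = 3.0   # KubeFed default CHCT in seconds, used as bias and floor
    max_chct: float = 30.0   # cap so failure detection does not become too slow (assumed)

    def next_chct(self, timeouts_per_min: float) -> float:
        """Return the CHCT (seconds) to apply for the next control period."""
        error = timeouts_per_min - self.setpoint
        proposed = self.base_chct + self.gain * error
        # Clamp to [base_chct, max_chct] so the output stays in a sane range.
        return max(self.base_chct, min(self.max_chct, proposed))
```

The clamp reflects the trade-off shown on slide 9: a longer timeout reduces spurious failures but delays the detection of real ones.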
  • 15. Now that we fixed the instability problem, is KubeFed ready to manage large-scale geo-distributed platforms?
       Not quite: in KubeFed, any deployment request is pushed to the requested cluster regardless of the resource availability in this cluster.
       Let’s replay 1 hour of a Google cluster trace and distribute jobs to one out of 5 clusters according to a binomial distribution:
       ▪ 3 overloaded clusters
       ▪ 2 mostly idle clusters
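One way to reproduce the skewed assignment on slide 15 is to draw each job's cluster index from Binomial(4, p). The talk does not give the distribution parameters, so p = 0.5 and the fixed seed are assumptions. With p = 0.5 the five indices have probabilities 1/16, 4/16, 6/16, 4/16 and 1/16, so the three middle clusters receive 14/16 of the jobs and the two outer ones stay mostly idle.

```python
import random
from collections import Counter


def pick_cluster(rng: random.Random, n_clusters: int = 5, p: float = 0.5) -> int:
    """Draw a cluster index in [0, n_clusters - 1] from Binomial(n_clusters - 1, p)."""
    # A binomial draw is the number of successes in n independent Bernoulli trials.
    return sum(rng.random() < p for _ in range(n_clusters - 1))


def distribute(n_jobs: int, seed: int = 0) -> Counter:
    """Assign n_jobs to clusters and return the number of jobs per cluster index."""
    rng = random.Random(seed)  # fixed seed (assumed) so a replay is repeatable
    return Counter(pick_cluster(rng) for _ in range(n_jobs))
```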
  • 16. Problems to address
       ▪ Make sure applications are not deployed in overloaded clusters
         • Even if this requires choosing another cluster automatically…
       ▪ Support application autoscaling in multi-cluster environments
         • Vary the number of replicas within a single cluster…
         • … or across multiple clusters
       ▪ Allow the system to burst out to a public cloud in case of resource overload
         • And retract public-cloud resources as early as possible
       ▪ Seamlessly integrate in existing KubeFed platforms
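The first requirement on slide 16 (never deploy into an overloaded cluster, and pick another one automatically if needed) can be sketched as a simple resource check with a fallback. The `Cluster` type, its fields and the "most free CPU" tie-break are hypothetical. The talk does not describe the actual placement policy at this level of detail.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cluster:
    name: str
    free_cpu: float  # free CPU cores (hypothetical metric)
    free_mem: float  # free memory in GiB (hypothetical metric)


def place(preferred: str, clusters: List[Cluster], cpu: float, mem: float) -> Optional[str]:
    """Return the cluster that should host the workload, or None if nothing fits."""

    def fits(c: Cluster) -> bool:
        return c.free_cpu >= cpu and c.free_mem >= mem

    by_name = {c.name: c for c in clusters}
    # Honour the requested cluster when it has enough free resources.
    if preferred in by_name and fits(by_name[preferred]):
        return preferred
    # Otherwise fall back to the fitting cluster with the most headroom.
    candidates = [c for c in clusters if fits(c)]
    if not candidates:
        return None  # the caller may decide to burst to a public cloud
    return max(candidates, key=lambda c: (c.free_cpu, c.free_mem)).name
```

Returning `None` rather than raising leaves the bursting decision to the caller, which matches the third requirement on the slide.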
  • 17. Deploy mcd-app-1 across two clusters which receive most network traffic
       Make sure end-user requests are distributed across both clusters
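Selecting the two clusters that receive the most traffic, as on slide 17, amounts to a top-k choice over a per-cluster traffic measurement. The sketch below assumes such a measurement is already available as a plain mapping. The cluster names and the unit of traffic are hypothetical.

```python
import heapq
from typing import Dict, List


def top_traffic_clusters(traffic: Dict[str, float], k: int = 2) -> List[str]:
    """Return the k cluster names with the highest traffic, busiest first."""
    # Iterating a dict yields its keys; traffic.get ranks each key by its value.
    return heapq.nlargest(k, traffic, key=traffic.get)
```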
  • 18. Autoscale the application deployment to maintain reasonable CPU usage
       Dynamically provision more resources from the public cloud if necessary
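The behaviour on slide 18 can be illustrated in two steps: compute a replica count that brings CPU usage back to a target, then fill the available clusters and treat any remainder as the amount to provision in the public cloud. The scaling rule below is the one documented for the Kubernetes Horizontal Pod Autoscaler. Whether the multi-cluster autoscaler in the talk uses exactly this rule is an assumption, and the per-cluster capacity map is hypothetical.

```python
import math
from typing import Dict, Tuple


def desired_replicas(current: int, cpu_now: float, cpu_target: float) -> int:
    """HPA-style rule: ceil(current * currentMetric / targetMetric), at least 1."""
    return max(1, math.ceil(current * cpu_now / cpu_target))


def spread(replicas: int, capacity: Dict[str, int]) -> Tuple[Dict[str, int], int]:
    """Fill clusters in the given order; return (placement, overflow).

    The overflow is the number of replicas that do not fit in any cluster
    and would have to run on public-cloud resources.
    """
    placement: Dict[str, int] = {}
    remaining = replicas
    for name, slots in capacity.items():
        take = min(slots, remaining)
        if take:
            placement[name] = take
        remaining -= take
    return placement, remaining
```

When the overflow returns to zero, the cloud resources can be released again, in line with the goal of retracting them as early as possible.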
  • 20. Conclusion
       Geo-distributed Kubernetes federations are now:
       ▪ Stable
       ▪ Resource availability aware
       ▪ Network traffic and network latency aware
       ▪ Burstable between available clusters, and to the public cloud
       mck8s is available: https://github.com/moule3053/mck8s
       Mulugeta Tamiru, Guillaume Pierre, Johan Tordsson, Erik Elmroth. mck8s: an orchestration platform for geo-distributed multi-cluster environments. In Proceedings of ICCCN, Jul 2021.
  • 21. The FogGuru project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant 765452.
       www.fogguru.eu