SlideShare a Scribd company logo
1 of 32
Download to read offline
Self scaling
multi cloud
nomad workloads
Cloud Engineer @ Seaplane.io
@attachmentgenie
Bram
Vogelaar
We have all been there…
What would it take
to keep the
platform running
(no matter what)?
And more importantly….
Vertical vs Horizontal Scaling
Absorb additional pressure and pay for
what you need
The aim becomes 100% 99.999% uptime
Task Scheduler
Spec Sheet
Open-source tool for dynamic workload
scheduling
Batch, containerized, and non-containerized
applications.
Jobs written in (H)ashiCorp (C)onfiguration
(L)anguage
https://www.nomadproject.io/
CODE EDITOR
Lets build awesome product
job "lorem-ipsum" {
group ”frontend" {
network {
port "http" { to = ”3000” }
}
service {
name = ”lorem"
port = ”http"
}
task "server" {
driver = "docker"
config {
image = ”cicero/lorem-ipsum:v1.0.0"
ports = ["http"]
}
}
CODE EDITOR
Expose it to the world
## traefik.yml
entryPoints:
web:
address: :80
providers:
file:
directory: /etc/traefik.d/dynamic.yml
watch: true
CODE EDITOR
Expose it to the world
## dynamic.yml
http:
routers:
lorem-ipsum:
rule: "Host(`lorem-ipsum.io`)"
service: lorem-ipsum
services:
lorem-ipsum:
loadBalancer:
servers:
- url: http://localhost:26489
CODE EDITOR
Sleep++ step1
job "lorem-ipsum" {
group ”frontend" {
count = 2
constraint {
operator = "distinct_hosts"
value = "true"
}
Your scheduler
should not be an
expensive systemd
Service Discovery + Service Mesh + Scheduler
Service Discovery + Service Mesh
Spec Sheet
Open-Source Service Discovery Tool
Build-in KV store
Service Mesh tool
Uses watchers to have near instant feedback
loops
https://www.consul.io/
CODE EDITOR
Exposing as a service
job "lorem-ipsum" {
group ”frontend" {
service {
name = "www"
tags = ["metrics",”traefik.enable=true”]
port = "http"
check {
name = "alive"
type = "http"
path = "/health"
interval = "10s"
timeout = "2s"
}
}
CODE EDITOR
Expose it to the world
## dynamic.yml
http:
routers:
lorem-ipsum:
rule: "Host(`lorem-ipsum.io`)"
service: lorem-ipsum
services:
lorem-ipsum:
loadBalancer:
servers:
- url: http://lorem-ipsum.service.consul:26489
CODE EDITOR
Sleep++ lazy++
## traefik.yml
entryPoints:
web:
address: :80
providers:
consulCatalog:
defaultRule: "Host(`{{ .Name }}.lorem-ipsum.io`)"
endpoint:
address: 'http://localhost:8500'
exposedByDefault: false
CODE EDITOR
Dynamic Metric Scraping
# prometheus.yml
- job_name: nomad_workload
scrape_interval: 60s
consul_sd_configs:
- server: localhost:8500
tags:
- metrics
CODE EDITOR
Sleep++ Autoscaling
job "lorem-ipsum" {
group ”frontend" {
scaling {
enabled = false
min = 1
max = 20
policy {
cooldown = "20s"
check "avg_sessions" {
source = "prometheus"
query = "sum(traefik_entrypoint_open_connections{entrypoint="lorem-ipsum"}"
strategy "target-value" {
target = 5
}
}
}
}
Data where your users are
CODE EDITOR
Nomad Federation
server {
enabled = true
bootstrap_expect = 3
server_join {
authoritative_region = “mgmt”
retry_join = [
"1.1.1.1", “1.1.1.2”, “1.1.1.3”,
“2.2.2.1”,”2.2.2.2”,”2.2.2.3”
]
retry_max = 3
retry_interval = "15s"
}
}
CODE EDITOR
Sleep++ Separate mgmt from workload
server {
enabled = true
bootstrap_expect = 3
server_join {
authoritative_region = “mgmt”
retry_join = [ "1.1.1.1", “1.1.1.2”, “1.1.1.3”,
“2.2.2.1”,”2.2.2.2”,”2.2.2.3” ]
retry_max = 3
retry_interval = "15s"
}
}
CODE EDITOR
Sleep++ Ingress
resource "ns1_record" "www" {
zone = “lorem-ipsum.io”
domain = "www.lorem-ipsum.io"
type = "CNAME"
ttl = 60
regions {
name = "usa"
meta = {
country = "US"
}
}
answers {
answer = "usa.lorem-ipsum.io"
region = "usa"
}
}
CODE EDITOR
Identify workload region
external_labels:
datacenter: "%{::trusted.extensions.pp_datacenter}"
region: "%{::trusted.extensions.pp_region}" # ${IATA_CODES}
env: "%{::trusted.extensions.pp_environment}"
What would it take
to keep it running
while losing an
entire region
Limit the blast radius
Spread your risks!
US
AWS
GCP
DO
AZ
EU
AWS
GCP
DO
AZ
Not all vendors are
created equally
VPCs vs Network mesh, Base Images etc,etc
CODE EDITOR
Lazy++ Redirect to other regions
## traefik.yml
entryPoints:
web:
address: :80
providers:
consul:
endpoints:
address: 'http://localhost:8500'
consul kv put traefik/http/services/www/loadbalancer/servers/0/url "-X PUT
-d 'http://usa.lorem-ipsum.io'"
CODE EDITOR
Sleep++ Redirect to other regions
curl http://127.0.0.1:8500/v1/query 
--request POST 
--data @- << EOF
{
"Name": "www",
"Service": {
"Service": "www",
"Failover": {
"Datacenters": ["us", "eu"]
}
}
}
EOF
CODE EDITOR
Sleep++ Redirect to other regions
curl http://127.0.0.1:8500/v1/query 
--request POST 
--data @- << EOF
{
"Name": "www",
"Service": {
"Service": "www",
"Failover": {
"NearestN": 2,
"Datacenters": ["us", "eu"]
}
}
}
EOF
Follow the sun?
Follow the moon?
Discover nearest?
Sleep++ Consul watchers
Thank You
bram@attachmentgenie.com | @attachmentgenie | https://www.slideshare.net/attachmentgenie

More Related Content

Similar to Self scaling Multi cloud nomad workloads

Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski buildacloud
 
Tools for Solving Performance Issues
Tools for Solving Performance IssuesTools for Solving Performance Issues
Tools for Solving Performance IssuesOdoo
 
Taming event-driven software via formal verification
Taming event-driven software via formal verificationTaming event-driven software via formal verification
Taming event-driven software via formal verificationAdaCore
 
Incrementalism: An Industrial Strategy For Adopting Modern Automation
Incrementalism: An Industrial Strategy For Adopting Modern AutomationIncrementalism: An Industrial Strategy For Adopting Modern Automation
Incrementalism: An Industrial Strategy For Adopting Modern AutomationSean Chittenden
 
Apache MXNet Distributed Training Explained In Depth by Viacheslav Kovalevsky...
Apache MXNet Distributed Training Explained In Depth by Viacheslav Kovalevsky...Apache MXNet Distributed Training Explained In Depth by Viacheslav Kovalevsky...
Apache MXNet Distributed Training Explained In Depth by Viacheslav Kovalevsky...Big Data Spain
 
A Node.js Developer's Guide to Bluemix
A Node.js Developer's Guide to BluemixA Node.js Developer's Guide to Bluemix
A Node.js Developer's Guide to Bluemixibmwebspheresoftware
 
Micro app-framework - NodeLive Boston
Micro app-framework - NodeLive BostonMicro app-framework - NodeLive Boston
Micro app-framework - NodeLive BostonMichael Dawson
 
A framework for self-healing applications – the path to enable auto-remediation
A framework for self-healing applications – the path to enable auto-remediationA framework for self-healing applications – the path to enable auto-remediation
A framework for self-healing applications – the path to enable auto-remediationJürgen Etzlstorfer
 
Monitoring InfluxEnterprise
Monitoring InfluxEnterpriseMonitoring InfluxEnterprise
Monitoring InfluxEnterpriseInfluxData
 
fog or: How I Learned to Stop Worrying and Love the Cloud
fog or: How I Learned to Stop Worrying and Love the Cloudfog or: How I Learned to Stop Worrying and Love the Cloud
fog or: How I Learned to Stop Worrying and Love the CloudWesley Beary
 
using Mithril.js + postgREST to build and consume API's
using Mithril.js + postgREST to build and consume API'susing Mithril.js + postgREST to build and consume API's
using Mithril.js + postgREST to build and consume API'sAntônio Roberto Silva
 
Learn How to Use a Time Series Platform to Monitor All Aspects of Your Kubern...
Learn How to Use a Time Series Platform to Monitor All Aspects of Your Kubern...Learn How to Use a Time Series Platform to Monitor All Aspects of Your Kubern...
Learn How to Use a Time Series Platform to Monitor All Aspects of Your Kubern...DevOps.com
 
Easy Cloud Native Transformation using HashiCorp Nomad
Easy Cloud Native Transformation using HashiCorp NomadEasy Cloud Native Transformation using HashiCorp Nomad
Easy Cloud Native Transformation using HashiCorp NomadBram Vogelaar
 
A miało być tak... bez wycieków
A miało być tak... bez wyciekówA miało być tak... bez wycieków
A miało być tak... bez wyciekówKonrad Kokosa
 
Evolving your Data Access with MongoDB Stitch
Evolving your Data Access with MongoDB StitchEvolving your Data Access with MongoDB Stitch
Evolving your Data Access with MongoDB StitchMongoDB
 
Mastering Spring Boot's Actuator with Madhura Bhave
Mastering Spring Boot's Actuator with Madhura BhaveMastering Spring Boot's Actuator with Madhura Bhave
Mastering Spring Boot's Actuator with Madhura BhaveVMware Tanzu
 
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)Wesley Beary
 

Similar to Self scaling Multi cloud nomad workloads (20)

Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
Troubleshooting Strategies for CloudStack Installations by Kirk Kosinski
 
Tools for Solving Performance Issues
Tools for Solving Performance IssuesTools for Solving Performance Issues
Tools for Solving Performance Issues
 
Taming event-driven software via formal verification
Taming event-driven software via formal verificationTaming event-driven software via formal verification
Taming event-driven software via formal verification
 
Incrementalism: An Industrial Strategy For Adopting Modern Automation
Incrementalism: An Industrial Strategy For Adopting Modern AutomationIncrementalism: An Industrial Strategy For Adopting Modern Automation
Incrementalism: An Industrial Strategy For Adopting Modern Automation
 
Apache MXNet Distributed Training Explained In Depth by Viacheslav Kovalevsky...
Apache MXNet Distributed Training Explained In Depth by Viacheslav Kovalevsky...Apache MXNet Distributed Training Explained In Depth by Viacheslav Kovalevsky...
Apache MXNet Distributed Training Explained In Depth by Viacheslav Kovalevsky...
 
A Node.js Developer's Guide to Bluemix
A Node.js Developer's Guide to BluemixA Node.js Developer's Guide to Bluemix
A Node.js Developer's Guide to Bluemix
 
Micro app-framework - NodeLive Boston
Micro app-framework - NodeLive BostonMicro app-framework - NodeLive Boston
Micro app-framework - NodeLive Boston
 
Micro app-framework
Micro app-frameworkMicro app-framework
Micro app-framework
 
Zone IDA Proc
Zone IDA ProcZone IDA Proc
Zone IDA Proc
 
A framework for self-healing applications – the path to enable auto-remediation
A framework for self-healing applications – the path to enable auto-remediationA framework for self-healing applications – the path to enable auto-remediation
A framework for self-healing applications – the path to enable auto-remediation
 
Monitoring InfluxEnterprise
Monitoring InfluxEnterpriseMonitoring InfluxEnterprise
Monitoring InfluxEnterprise
 
fog or: How I Learned to Stop Worrying and Love the Cloud
fog or: How I Learned to Stop Worrying and Love the Cloudfog or: How I Learned to Stop Worrying and Love the Cloud
fog or: How I Learned to Stop Worrying and Love the Cloud
 
using Mithril.js + postgREST to build and consume API's
using Mithril.js + postgREST to build and consume API'susing Mithril.js + postgREST to build and consume API's
using Mithril.js + postgREST to build and consume API's
 
Learn How to Use a Time Series Platform to Monitor All Aspects of Your Kubern...
Learn How to Use a Time Series Platform to Monitor All Aspects of Your Kubern...Learn How to Use a Time Series Platform to Monitor All Aspects of Your Kubern...
Learn How to Use a Time Series Platform to Monitor All Aspects of Your Kubern...
 
OneTeam Media Server
OneTeam Media ServerOneTeam Media Server
OneTeam Media Server
 
Easy Cloud Native Transformation using HashiCorp Nomad
Easy Cloud Native Transformation using HashiCorp NomadEasy Cloud Native Transformation using HashiCorp Nomad
Easy Cloud Native Transformation using HashiCorp Nomad
 
A miało być tak... bez wycieków
A miało być tak... bez wyciekówA miało być tak... bez wycieków
A miało być tak... bez wycieków
 
Evolving your Data Access with MongoDB Stitch
Evolving your Data Access with MongoDB StitchEvolving your Data Access with MongoDB Stitch
Evolving your Data Access with MongoDB Stitch
 
Mastering Spring Boot's Actuator with Madhura Bhave
Mastering Spring Boot's Actuator with Madhura BhaveMastering Spring Boot's Actuator with Madhura Bhave
Mastering Spring Boot's Actuator with Madhura Bhave
 
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)
 

More from Bram Vogelaar

Cost reconciliation in a post CMDB world
Cost reconciliation in a post CMDB worldCost reconciliation in a post CMDB world
Cost reconciliation in a post CMDB worldBram Vogelaar
 
Scraping metrics for fun and profit
Scraping metrics for fun and profitScraping metrics for fun and profit
Scraping metrics for fun and profitBram Vogelaar
 
10 things i learned building nomad-packs
10 things i learned building nomad-packs10 things i learned building nomad-packs
10 things i learned building nomad-packsBram Vogelaar
 
10 things I learned building Nomad packs
10 things I learned building Nomad packs10 things I learned building Nomad packs
10 things I learned building Nomad packsBram Vogelaar
 
Easy Cloud Native Transformation with Nomad
Easy Cloud Native Transformation with NomadEasy Cloud Native Transformation with Nomad
Easy Cloud Native Transformation with NomadBram Vogelaar
 
Observability; a gentle introduction
Observability; a gentle introductionObservability; a gentle introduction
Observability; a gentle introductionBram Vogelaar
 
Running Trusted Payload with Nomad and Waypoint
Running Trusted Payload with Nomad and WaypointRunning Trusted Payload with Nomad and Waypoint
Running Trusted Payload with Nomad and WaypointBram Vogelaar
 
Securing Prometheus exporters using HashiCorp Vault
Securing Prometheus exporters using HashiCorp VaultSecuring Prometheus exporters using HashiCorp Vault
Securing Prometheus exporters using HashiCorp VaultBram Vogelaar
 
CICD using jenkins and Nomad
CICD using jenkins and NomadCICD using jenkins and Nomad
CICD using jenkins and NomadBram Vogelaar
 
Bootstrapping multidc observability stack
Bootstrapping multidc observability stackBootstrapping multidc observability stack
Bootstrapping multidc observability stackBram Vogelaar
 
Running trusted payloads with Nomad and Waypoint
Running trusted payloads with Nomad and WaypointRunning trusted payloads with Nomad and Waypoint
Running trusted payloads with Nomad and WaypointBram Vogelaar
 
Gamification of Chaos Testing
Gamification of Chaos TestingGamification of Chaos Testing
Gamification of Chaos TestingBram Vogelaar
 
Puppet and the HashiStack
Puppet and the HashiStackPuppet and the HashiStack
Puppet and the HashiStackBram Vogelaar
 
Bootstrapping multidc observability stack
Bootstrapping multidc observability stackBootstrapping multidc observability stack
Bootstrapping multidc observability stackBram Vogelaar
 
Creating Reusable Puppet Profiles
Creating Reusable Puppet ProfilesCreating Reusable Puppet Profiles
Creating Reusable Puppet ProfilesBram Vogelaar
 
Gamification of Chaos Testing
Gamification of Chaos TestingGamification of Chaos Testing
Gamification of Chaos TestingBram Vogelaar
 
Autoscaling with hashi_corp_nomad
Autoscaling with hashi_corp_nomadAutoscaling with hashi_corp_nomad
Autoscaling with hashi_corp_nomadBram Vogelaar
 
Observability with Consul Connect
Observability with Consul ConnectObservability with Consul Connect
Observability with Consul ConnectBram Vogelaar
 
Testing your infrastructure with litmus
Testing your infrastructure with litmusTesting your infrastructure with litmus
Testing your infrastructure with litmusBram Vogelaar
 

More from Bram Vogelaar (20)

Cost reconciliation in a post CMDB world
Cost reconciliation in a post CMDB worldCost reconciliation in a post CMDB world
Cost reconciliation in a post CMDB world
 
Scraping metrics for fun and profit
Scraping metrics for fun and profitScraping metrics for fun and profit
Scraping metrics for fun and profit
 
10 things i learned building nomad-packs
10 things i learned building nomad-packs10 things i learned building nomad-packs
10 things i learned building nomad-packs
 
10 things I learned building Nomad packs
10 things I learned building Nomad packs10 things I learned building Nomad packs
10 things I learned building Nomad packs
 
Easy Cloud Native Transformation with Nomad
Easy Cloud Native Transformation with NomadEasy Cloud Native Transformation with Nomad
Easy Cloud Native Transformation with Nomad
 
Uncomplicated Nomad
Uncomplicated NomadUncomplicated Nomad
Uncomplicated Nomad
 
Observability; a gentle introduction
Observability; a gentle introductionObservability; a gentle introduction
Observability; a gentle introduction
 
Running Trusted Payload with Nomad and Waypoint
Running Trusted Payload with Nomad and WaypointRunning Trusted Payload with Nomad and Waypoint
Running Trusted Payload with Nomad and Waypoint
 
Securing Prometheus exporters using HashiCorp Vault
Securing Prometheus exporters using HashiCorp VaultSecuring Prometheus exporters using HashiCorp Vault
Securing Prometheus exporters using HashiCorp Vault
 
CICD using jenkins and Nomad
CICD using jenkins and NomadCICD using jenkins and Nomad
CICD using jenkins and Nomad
 
Bootstrapping multidc observability stack
Bootstrapping multidc observability stackBootstrapping multidc observability stack
Bootstrapping multidc observability stack
 
Running trusted payloads with Nomad and Waypoint
Running trusted payloads with Nomad and WaypointRunning trusted payloads with Nomad and Waypoint
Running trusted payloads with Nomad and Waypoint
 
Gamification of Chaos Testing
Gamification of Chaos TestingGamification of Chaos Testing
Gamification of Chaos Testing
 
Puppet and the HashiStack
Puppet and the HashiStackPuppet and the HashiStack
Puppet and the HashiStack
 
Bootstrapping multidc observability stack
Bootstrapping multidc observability stackBootstrapping multidc observability stack
Bootstrapping multidc observability stack
 
Creating Reusable Puppet Profiles
Creating Reusable Puppet ProfilesCreating Reusable Puppet Profiles
Creating Reusable Puppet Profiles
 
Gamification of Chaos Testing
Gamification of Chaos TestingGamification of Chaos Testing
Gamification of Chaos Testing
 
Autoscaling with hashi_corp_nomad
Autoscaling with hashi_corp_nomadAutoscaling with hashi_corp_nomad
Autoscaling with hashi_corp_nomad
 
Observability with Consul Connect
Observability with Consul ConnectObservability with Consul Connect
Observability with Consul Connect
 
Testing your infrastructure with litmus
Testing your infrastructure with litmusTesting your infrastructure with litmus
Testing your infrastructure with litmus
 

Recently uploaded

Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfYashikaSharma391629
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 

Recently uploaded (20)

Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 

Self scaling Multi cloud nomad workloads

  • 2. Cloud Engineer @ Seaplane.io @attachmentgenie Bram Vogelaar
  • 3. We have all been there…
  • 4. What would it take to keep the platform running (no matter what)? And more importantly….
  • 6. Absorb additional pressure and pay for what you need
  • 7. The aim becomes 100% 99.999% uptime
  • 8. Task Scheduler Spec Sheet Open-source tool for dynamic workload scheduling Batch, containerized, and non-containerized applications. Jobs written in (H)ashiCorp (C)onfiguration (L)anguage https://www.nomadproject.io/
  • 9. CODE EDITOR Lets build awesome product job "lorem-ipsum" { group ”frontend" { network { port "http" { to = ”3000” } } service { name = ”lorem" port = ”http" } task "server" { driver = "docker" config { image = ”cicero/lorem-ipsum:v1.0.0" ports = ["http"] } }
  • 10. CODE EDITOR Expose it to the world ## traefik.yml entryPoints: web: address: :80 providers: file: directory: /etc/traefik.d/dynamic.yml watch: true
  • 11. CODE EDITOR Expose it to the world ## dynamic.yml http: routers: lorem-ipsum: rule: "Host(`lorem-ipsum.io`)" service: lorem-ipsum services: lorem-ipsum: loadBalancer: servers: - url: http://localhost:26489
  • 12. CODE EDITOR Sleep++ step1 job "lorem-ipsum" { group ”frontend" { count = 2 constraint { operator = "distinct_hosts" value = "true" }
  • 13. Your scheduler should not be an expensive systemd Service Discovery + Service Mesh + Scheduler
  • 14. Service Discovery + Service Mesh Spec Sheet Open-Source Service Discovery Tool Build-in KV store Service Mesh tool Uses watchers to have near instant feedback loops https://www.consul.io/
  • 15. CODE EDITOR Exposing as a service job "lorem-ipsum" { group ”frontend" { service { name = "www" tags = ["metrics",”traefik.enable=true”] port = "http" check { name = "alive" type = "http" path = "/health" interval = "10s" timeout = "2s" } }
  • 16. CODE EDITOR Expose it to the world ## dynamic.yml http: routers: lorem-ipsum: rule: "Host(`lorem-ipsum.io`)" service: lorem-ipsum services: lorem-ipsum: loadBalancer: servers: - url: http://lorem-ipsum.service.consul:26489
  • 17. CODE EDITOR Sleep++ lazy++ ## traefik.yml entryPoints: web: address: :80 providers: consulCatalog: defaultRule: "Host(`{{ .Name }}.lorem-ipsum.io`)" endpoint: address: 'http://localhost:8500' exposedByDefault: false
  • 18. CODE EDITOR Dynamic Metric Scraping # prometheus.yml - job_name: nomad_workload scrape_interval: 60s consul_sd_configs: - server: localhost:8500 tags: - metrics
  • 19. CODE EDITOR Sleep++ Autoscaling job "lorem-ipsum" { group ”frontend" { scaling { enabled = false min = 1 max = 20 policy { cooldown = "20s" check "avg_sessions" { source = "prometheus" query = "sum(traefik_entrypoint_open_connections{entrypoint="lorem-ipsum"}" strategy "target-value" { target = 5 } } } }
  • 20. Data where your users are
  • 21. CODE EDITOR Nomad Federation server { enabled = true bootstrap_expect = 3 server_join { authoritative_region = “mgmt” retry_join = [ "1.1.1.1", “1.1.1.2”, “1.1.1.3”, “2.2.2.1”,”2.2.2.2”,”2.2.2.3” ] retry_max = 3 retry_interval = "15s" } }
  • 22. CODE EDITOR Sleep++ Separate mgmt from workload server { enabled = true bootstrap_expect = 3 server_join { authoritative_region = “mgmt” retry_join = [ "1.1.1.1", “1.1.1.2”, “1.1.1.3”, “2.2.2.1”,”2.2.2.2”,”2.2.2.3” ] retry_max = 3 retry_interval = "15s" } }
  • 23. CODE EDITOR Sleep++ Ingress resource "ns1_record" "www" { zone = “lorem-ipsum.io” domain = "www.lorem-ipsum.io" type = "CNAME" ttl = 60 regions { name = "usa" meta = { country = "US" } } answers { answer = "usa.lorem-ipsum.io" region = "usa" } }
  • 24. CODE EDITOR Identify workload region external_labels: datacenter: "%{::trusted.extensions.pp_datacenter}" region: "%{::trusted.extensions.pp_region}" # ${IATA_CODES} env: "%{::trusted.extensions.pp_environment}"
  • 25. What would it take to keep it running while losing an entire region Limit the blast radius
  • 27. Not all vendors are created equally VPCs vs Network mesh, Base Images etc,etc
  • 28. CODE EDITOR Lazy++ Redirect to other regions ## traefik.yml entryPoints: web: address: :80 providers: consul: endpoints: address: 'http://localhost:8500' consul kv put traefik/http/services/www/loadbalancer/servers/0/url "-X PUT -d 'http://usa.lorem-ipsum.io'"
  • 29. CODE EDITOR Sleep++ Redirect to other regions curl http://127.0.0.1:8500/v1/query --request POST --data @- << EOF { "Name": "www", "Service": { "Service": "www", "Failover": { "Datacenters": ["us", "eu"] } } } EOF
  • 30. CODE EDITOR Sleep++ Redirect to other regions curl http://127.0.0.1:8500/v1/query --request POST --data @- << EOF { "Name": "www", "Service": { "Service": "www", "Failover": { "NearestN": 2, "Datacenters": ["us", "eu"] } } } EOF
  • 31. Follow the sun? Follow the moon? Discover nearest? Sleep++ Consul watchers
  • 32. Thank You bram@attachmentgenie.com | @attachmentgenie | https://www.slideshare.net/attachmentgenie