Why and How to Run Your
Own Gitlab Runner Fleet
Casey Zednick
Principal Software Engineer – DE-Tools
©2022 F5
Agenda
1. Learning briefly about Gitlab.
2. Exploring the benefits of running your own Gitlab runners.
3. Understanding costs.
4. Designing an auto-scale runner fleet.
5. Creating and meeting service level objectives (SLOs).
6. Listing of resources.
What is Gitlab?
• Gitlab is a complete development platform that
supports git repositories, issue tracking, code
reviews, and CI/CD
• Comes in four main offerings
• Community Edition - Open-Source Software (MIT license)
What are Gitlab Runners?
• Run the actual jobs of Gitlab’s continuous integration / continuous delivery (CI/CD) pipelines.
• Two main parts:
  • runners - talk to the Gitlab API and to executors.
  • executors - do the actual CI/CD work.
• For example, runners pick up tasks, such as building, testing, and deploying the code, and give them to executors, which run the actual commands.
Why Run Your Own Runners?
Availability
• Gitlab’s SaaS runners are in GCP. If your resources are in AWS, Azure, etc., emergent cross-cloud issues impact availability
• Ability to monitor, diagnose, and mitigate emergent conditions without support ticket response times
Security
• No chance of your pipeline variables being leaked to other tenants due to runner bugs or misconfiguration
• Control of the runner and executor hosts to enable increased supply chain security
Features
• Enable right-sizing VMs to job workloads instead of using Gitlab’s single shared type
• Run jobs on non-Intel VMs
Cost
• Run jobs for less than Gitlab’s $10 for 1,000 minutes ($0.60 per hour)
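The per-hour figure in the cost row is a unit conversion from the per-minute block price. A minimal sketch of that arithmetic follows; the function name is illustrative, and the self-hosted VM price is a made-up placeholder, not a quoted cloud rate.

```python
def price_per_hour(price_per_block: float, minutes_per_block: int) -> float:
    """Convert a price for a block of CI minutes into a price per job-hour."""
    return price_per_block / minutes_per_block * 60


# The slide's figure: $10 buys 1,000 minutes.
saas_hourly = price_per_hour(10.0, 1000)

# Hypothetical self-hosted executor VM price, for comparison only.
example_vm_hourly = 0.10

print(f"SaaS price per job-hour: ${saas_hourly:.2f}")
print(f"Example VM cheaper per busy hour: {example_vm_hourly < saas_hourly}")
```

A raw VM price below this threshold is necessary but not sufficient, since fixed and idle costs still have to be covered.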
Understanding Costs of Running a Fleet
Understanding how much it costs to run your own runner fleet isn’t easy.
You might think you can look at how many hours of job time you need and multiply that by your VM costs.
In practice, it’s not this simple. Nevertheless, I’ll show you how to forecast your expected total costs so you can make an informed decision.
Fixed costs:
• Personnel time (omitted in this model)
• Compute and storage for runners
• Compute and storage for central metrics/logs
Dynamic costs:
• Compute for executors (the VMs running the
actual jobs)
• Storage for job artifacts
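To see why job hours times VM price understates the bill, the fixed and dynamic costs listed above can be put into a small model. This is only a sketch under stated assumptions: it is not the Excel cost model from the talk, the class and field names are invented for illustration, every dollar amount is a placeholder, and idle executor hours are assumed to be billed at the same rate as busy ones. Personnel time is omitted, as in the slide.

```python
from dataclasses import dataclass


@dataclass
class FleetCostModel:
    """Monthly cost sketch; all values are placeholders for your own data."""
    runner_hosting_monthly: float    # fixed: compute and storage for runners
    observability_monthly: float     # fixed: central metrics/logs
    executor_hourly_rate: float      # dynamic: price of one executor VM-hour
    artifact_storage_monthly: float  # dynamic: storage for job artifacts

    def monthly_total(self, job_hours: float, idle_hours: float) -> float:
        """Fixed costs plus executor time, counting idle VM-hours as billable."""
        fixed = self.runner_hosting_monthly + self.observability_monthly
        executors = (job_hours + idle_hours) * self.executor_hourly_rate
        return fixed + executors + self.artifact_storage_monthly

    def effective_hourly_rate(self, job_hours: float, idle_hours: float) -> float:
        """Total monthly cost divided by the hours that ran real jobs."""
        return self.monthly_total(job_hours, idle_hours) / job_hours


# Hypothetical month: 5,000 job hours plus 2,000 hours of idle pool time.
model = FleetCostModel(200.0, 100.0, 0.10, 50.0)
naive_estimate = 5000 * 0.10
total = model.monthly_total(job_hours=5000, idle_hours=2000)
rate = model.effective_hourly_rate(job_hours=5000, idle_hours=2000)

print(f"naive: ${naive_estimate:.2f}  modeled: ${total:.2f}  per job-hour: ${rate:.3f}")
```

With these made-up inputs the modeled total is roughly double the naive estimate, which is the gap the forecast is meant to expose.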
Forecasting Fleet Costs
Optimizing Fleet Compute Costs
config.toml
[[runners.machine.autoscaling]]
  Periods = ["* * * * * * *"] # Default: no idle runners
  IdleCount = 0
  IdleTime = 600
  Timezone = "America/Los_Angeles"

[[runners.machine.autoscaling]]
  Periods = ["* * 8-17 * * mon-fri *"] # Pacific business hours
  IdleCount = 40
  IdleCountMin = 5
  # Current number of idle machines will be 1.5 * in-use machines
  IdleScaleFactor = 1.5
  IdleTime = 1200
  Timezone = "America/Los_Angeles"

[[runners.machine.autoscaling]]
  Periods = ["* * 10-15 * * mon-fri *"] # Pacific peak development
  IdleCount = 60
  IdleCountMin = 10
  # Current number of idle machines will be 1.5 * in-use machines
  IdleScaleFactor = 1.5
  IdleTime = 1200
  Timezone = "America/Los_Angeles"
• Scale idle count by time of day
• Right-size runners: our least expensive executor is 8 times cheaper than our most expensive
• For the most expensive runners, keep the idle count at zero, as idle costs add up quickly
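The way IdleCount, IdleCountMin, and IdleScaleFactor combine can be sketched as a clamp. This reflects my reading of the documented behavior (IdleCount as the upper limit, IdleCountMin as the floor when a scale factor is set); the function name is hypothetical and rounding up is an assumption, so treat it as an approximation rather than the runner's actual code.

```python
import math


def idle_target(in_use: int, idle_count: int, idle_count_min: int,
                idle_scale_factor: float) -> int:
    """Idle machines wanted for a period that sets IdleScaleFactor."""
    # Rounding up is an assumption; the runner may round differently.
    scaled = math.ceil(in_use * idle_scale_factor)
    # IdleCountMin is the floor, IdleCount the ceiling.
    return min(idle_count, max(idle_count_min, scaled))


# Business-hours period from the config above: 40 / 5 / 1.5.
for busy in (0, 10, 100):
    print(busy, "in use ->", idle_target(busy, 40, 5, 1.5), "idle wanted")
```

Multiplying the idle target by the executor's hourly price for each period gives a quick estimate of what the idle pool alone costs.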
Understanding the Key Parts of Fleet Design
• Gitlab Instance (On-Prem / SaaS)
  • Hosts source code and job artifacts
  • API for runners to get pipeline jobs
• Runners
  • Retrieve jobs from the Gitlab API and give them to executors
  • Control how many executors are available for jobs
• Executors
  • Run actual workloads - think go build in a Docker container
Hosting the Runner (Control)
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-autoscale-uswest3-small-runner
  namespace: nginx-autoscale-uswest3-small-runner
  labels:
    app: nginx-autoscale-uswest3-small-runner
spec:
  strategy:
    type: Recreate # stop the old runner pod before starting its replacement
  replicas: 1 # a single runner process per deployment
  selector:
    matchLabels:
      app: nginx-autoscale-uswest3-small-runner
  template:
    metadata:
      labels:
        app: nginx-autoscale-uswest3-small-runner
    spec:
      containers:
        - name: nginx-autoscale-uswest3-small-runner
          image: gitlab/gitlab-runner:v14.5.2
• Long-running daemon process
• Doesn’t have to share the same network or cloud as the docker+machine executors
• One runner can control many executors
• We use one runner per executor and deploy it in its own k8s namespace to allow for more control in config updates
Understanding Autoscaling Executors (Compute)
• Kubernetes
  • Quick job starts
  • Scales using k8s horizontal pod autoscaling
  • Problematic for Docker-in-Docker (DIND) use
• docker+machine (what I’m covering)
  • Quickish job starts with tuned idle pool
  • Orchestrates cloud VMs
  • High isolation and support for DIND
docker+machine VM lifecycle
1. VM provisioned via cloud APIs
2. VM OS updated and configured via SSH
3. VM’s Docker pulls the job’s Docker image
4. Setup scripts run in container
5. Source code checked out
6. Pipeline job script run
7. Container exits
8. VM destroyed or returned to idle pool
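The lifecycle also suggests why a tuned idle pool makes job starts quicker. The sketch below assumes that a VM taken from the idle pool has already completed steps 1 and 2, which the slides imply but do not state outright. The stage durations are hypothetical placeholders, not measurements, and all names are illustrative.

```python
# Stages that happen before the pipeline job script starts (steps 1-5).
# Durations in seconds are made-up placeholders.
STAGES_BEFORE_SCRIPT = [
    ("provision VM via cloud API", 60),
    ("update and configure OS over SSH", 45),
    ("pull job image", 20),
    ("run setup scripts", 5),
    ("check out source", 10),
]

# Assumption: an idle VM is already provisioned and configured.
STAGES_SKIPPED_BY_IDLE_VM = 2


def seconds_until_job_script(from_idle_pool: bool) -> int:
    """Sum the stage durations a job waits through before its script runs."""
    if from_idle_pool:
        stages = STAGES_BEFORE_SCRIPT[STAGES_SKIPPED_BY_IDLE_VM:]
    else:
        stages = STAGES_BEFORE_SCRIPT
    return sum(seconds for _, seconds in stages)


cold_start = seconds_until_job_script(from_idle_pool=False)
warm_start = seconds_until_job_script(from_idle_pool=True)
print(f"cold start: {cold_start}s, warm start: {warm_start}s")
```

Replacing the placeholder durations with your own timings shows how much start latency the idle pool buys for its cost.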
Hey, my pipeline job…
Ensuring your fleet meets your needs by using site reliability engineering (SRE)
Establishing Service Level Objectives (SLOs)
Volume
• How many jobs must you support per hour? 50, 100s, 1,000?
• Our volume SLO is 45% of our compute capacity 99% of the time
Availability
• How long can a job wait before you consider the system down?
• Our availability SLO is 99% of jobs start in under 300 seconds
Latency
• How quickly do your developers need to have their jobs serviced?
• Our latency SLO is 95% of our jobs start within 20 seconds
Errors
• How many errors are normal?
• Our error SLO is fewer than 12 failed jobs per minute. That might sound high, but in our experience, everything is good until it isn’t
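The availability and latency objectives above are both statements about the share of jobs that start within a time limit, so they can be checked with one helper. This is a minimal sketch: the function names are illustrative, the sample wait times are invented, and treating "in 20 seconds" as a strict less-than is an assumption.

```python
def fraction_started_under(wait_seconds: list[float], limit: float) -> float:
    """Share of jobs whose wait before starting was below the limit."""
    if not wait_seconds:
        return 1.0  # no jobs means nothing violated the objective
    on_time = sum(1 for wait in wait_seconds if wait < limit)
    return on_time / len(wait_seconds)


def check_start_slos(wait_seconds: list[float]) -> dict[str, bool]:
    """Evaluate the two start-time objectives quoted on the slide."""
    return {
        "availability": fraction_started_under(wait_seconds, 300) >= 0.99,
        "latency": fraction_started_under(wait_seconds, 20) >= 0.95,
    }


# Invented sample: 96 fast starts, 3 slow starts, 1 very slow start.
sample_waits = [5.0] * 96 + [60.0] * 3 + [400.0]
print(check_start_slos(sample_waits))
```

In practice the wait times would come from whatever per-job queue duration you collect, evaluated over a rolling window.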
Gathering Service Level Indicators (SLIs)
• Gitlab’s API
  • Job queued_duration
  • Job status: success, failed, etc.
• Gitlab Runner Logs
  • Cloud API errors
  • General orchestration info
• Gitlab Runners Prometheus Endpoint
  • http://localhost:9252/metrics
  • Good scaling information
  • http:/
Note: many low-level metrics, like docker pull times or times at various executor stages, aren’t exposed in an easy-to-use fashion. :/
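A Prometheus endpoint returns plain text, so a first look at the scaling numbers needs very little code. The sketch below parses the simplest form of the exposition format. The metric name in the sample is a made-up placeholder, not a real runner metric, and the parser ignores timestamps and other less common syntax.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse 'series value' lines, skipping comments and blank lines."""
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical scrape body; the metric name is illustrative only.
SAMPLE_SCRAPE = """\
# HELP example_machines Placeholder gauge of machines by state.
# TYPE example_machines gauge
example_machines{state="idle"} 12
example_machines{state="used"} 30
"""

metrics = parse_metrics(SAMPLE_SCRAPE)
idle = metrics['example_machines{state="idle"}']
used = metrics['example_machines{state="used"}']
print(f"idle={idle:.0f} used={used:.0f} idle/used={idle / used:.2f}")
```

Against a live runner, the text would come from an HTTP GET of the metrics URL listed above, and a Prometheus server would normally do the scraping for you.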
Visualizing Service Level Indicators (SLIs)
Resources
• Gitlab Runner Overview
• Gitlab Fleet Scaling
• Excel runner cost model