AWS techniques and lessons from writing a minimal cost GitLab runner
February 2023
● Principal Engineer for Digio
● Focus on platform engineering
● Background in development
● 10 years AWS experience
● Worked ~2 years each in Azure and GCP
● 12 years Infrastructure as Code
● Passion for automating things
● 4 years Terraform experience
● Terraform Associate certified
● Previously AWS Associate certified, but now I’m lazy
Overview of Digio & Mantel Group
Digio and Mantel Group
We’re an Australian-owned, principle-based, technology-led consulting business founded in Melbourne.
Digio is Australia’s premier digital services provider from concept to production, continually evolving alongside technologies and method.
We are a dynamic business established in November 2017 and have grown to a team of over 200 across Australia and New Zealand.
We are part of the broader Mantel Group, currently comprising 9 technology brands and a total team size of over 800. As a group we have been recognised in the AFR’s 2020 fastest growing companies, achieved #1 Best Place to Work for 2021 and 2022 in the Great Place to Work Survey, and been awarded AWS 2022 Services Partner and Migration Partner of the year.
Locations: Melbourne, Sydney, Brisbane, Auckland, Queenstown, Magnetic Island, Perth, Adelaide, Hobart
Mantel Group Brands
Working with Mantel Group not only enables access to expertise within Digio, but across all current and future brands.
A broad end-to-end capability that is vendor agnostic, yet has deep specialisations…
Capabilities across the brands:
● Software Engineering (API, QA, .NET, Web, Mobile, Android, iOS)
● Platform Enablement
● Application Modernisation
● Application Transformation
● Cloud Native
● Cloud Computing
● Migration
● Automation & DevOps
● Security
● Security & Identity
● Managed Services
● Digital Workplace
● Collaboration & Productivity
● MarTech
● Data & Analytics
● Data Strategy
● Data Engineering
● Platform Agnostic Data Engineering
● Data Architecture
● Analytics & BI
● Advanced Analytics
● Analytics & Machine Learning
● ML Engineering
● Technology Strategy & Advisory
● Native Mobile Technology Strategy
● Native Mobile Product / Design Strategy
● Pursuit Model
● Discovery Sprints
● Rapid Prototyping
● Service Design
● UX/UI Design
● Delivery & Method
● Training & Certification
● Coaching & Training
The problem we faced
Our Solution
https://github.com/cmdlabs/terraform-aws-gitlab-runner-scale
https://registry.terraform.io/modules/cmdlabs/gitlab-runner-scale/aws
Function URL vs EventBridge with polling
The webhook is:
● Faster to respond to events as it runs ~instantly
● Zero AWS cost to enable
● Cheaper if the repository / runner activity is low
● Could be abused by third parties executing the function without security permissions
EventBridge is:
● More predictable in terms of AWS spend
● 14 million free invocations
● Slow to respond
● Lower cost if the GitLab project activity is high
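The two trigger options compared above could be wired up roughly as follows. This is a minimal sketch and not the module's actual code: the two input variables, the resource names and the one-minute schedule are assumptions, and an EventBridge rule is used for polling (EventBridge Scheduler would be an alternative).

```hcl
# Assumed inputs: an existing Lambda function that checks GitLab for jobs.
variable "lambda_function_name" {
  description = "Name of the existing scaling Lambda (hypothetical input)."
  type        = string
}

variable "lambda_function_arn" {
  description = "ARN of the existing scaling Lambda (hypothetical input)."
  type        = string
}

# Option 1: webhook. GitLab calls the function URL when a pipeline event occurs.
resource "aws_lambda_function_url" "webhook" {
  function_name = var.lambda_function_name

  # NONE makes the URL public, which is the abuse risk noted above.
  # The function would need to check a shared secret itself.
  authorization_type = "NONE"
}

# Option 2: polling. EventBridge invokes the function on a fixed schedule.
resource "aws_cloudwatch_event_rule" "poll" {
  name                = "gitlab-runner-poll"
  schedule_expression = "rate(1 minute)"
}

resource "aws_cloudwatch_event_target" "poll" {
  rule = aws_cloudwatch_event_rule.poll.name
  arn  = var.lambda_function_arn
}

# EventBridge needs explicit permission to invoke the function.
resource "aws_lambda_permission" "allow_eventbridge" {
  statement_id  = "AllowEventBridgeInvoke"
  action        = "lambda:InvokeFunction"
  function_name = var.lambda_function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.poll.arn
}
```

In practice only one of the two options would normally be enabled, depending on how active the GitLab project is.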
Scaling Out
● Make use of:
○ CloudWatch metrics and CloudWatch alarms
○ Triggers on the Auto Scaling group
○ Scaling policies to determine how many instances to add
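A scale-out setup along these lines might look like the sketch below. It assumes the Lambda publishes a custom CloudWatch metric with the number of pending jobs; the namespace, metric name, step sizes and resource names are hypothetical and not taken from the module.

```hcl
variable "autoscaling_group_name" {
  description = "Name of the existing runner Auto Scaling group (hypothetical input)."
  type        = string
}

# Step scaling decides how many instances to add for a given backlog.
resource "aws_autoscaling_policy" "scale_out" {
  name                   = "gitlab-runner-scale-out"
  autoscaling_group_name = var.autoscaling_group_name
  policy_type            = "StepScaling"
  adjustment_type        = "ChangeInCapacity"

  # Bounds are offsets from the alarm threshold (0 pending jobs).
  step_adjustment {
    metric_interval_lower_bound = 0
    metric_interval_upper_bound = 5
    scaling_adjustment          = 1
  }

  step_adjustment {
    metric_interval_lower_bound = 5
    scaling_adjustment          = 2
  }
}

# The alarm fires as soon as one period shows pending jobs.
resource "aws_cloudwatch_metric_alarm" "jobs_pending" {
  alarm_name          = "gitlab-runner-jobs-pending"
  namespace           = "GitLabRunner" # hypothetical custom namespace
  metric_name         = "PendingJobs"  # hypothetical custom metric
  statistic           = "Maximum"
  period              = 60
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 0
  alarm_actions       = [aws_autoscaling_policy.scale_out.arn]
}
```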
Scaling In
● Requires multiple inputs and considerations
○ Avoid churn of runners
● Scale down based on load (number of active runners and jobs in the queue)
● Make use of CloudWatch's handling of premature state transitions (see "Avoiding premature transitions to alarm state" in the CloudWatch documentation)
○ AWS alarms include logic to try to avoid false alarms
○ CloudWatch waits the full N periods before alarming
○ Any time the metric is above the threshold, the alarm "timer" is effectively reset
● The trade-off is longer idle time, with additional cost
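The "wait the full N periods" behaviour can be expressed with an alarm that only fires after every evaluation period is at or below the threshold. The sketch below is an illustration under assumptions: the custom metric (active plus queued jobs), its namespace, the 10-minute window and the resource names are all hypothetical.

```hcl
variable "autoscaling_group_name" {
  description = "Name of the existing runner Auto Scaling group (hypothetical input)."
  type        = string
}

# Remove one runner at a time to avoid churn.
resource "aws_autoscaling_policy" "scale_in" {
  name                   = "gitlab-runner-scale-in"
  autoscaling_group_name = var.autoscaling_group_name
  policy_type            = "SimpleScaling"
  adjustment_type        = "ChangeInCapacity"
  scaling_adjustment     = -1
  cooldown               = 300
}

# All 10 one-minute datapoints must show no work before the alarm fires.
# A single datapoint above the threshold restarts the wait, which is the
# "timer reset" described above. A longer window means fewer premature
# terminations but more paid idle time.
resource "aws_cloudwatch_metric_alarm" "runners_idle" {
  alarm_name          = "gitlab-runner-idle"
  namespace           = "GitLabRunner"        # hypothetical custom namespace
  metric_name         = "ActiveAndQueuedJobs" # hypothetical custom metric
  statistic           = "Maximum"
  period              = 60
  evaluation_periods  = 10
  datapoints_to_alarm = 10
  comparison_operator = "LessThanOrEqualToThreshold"
  threshold           = 0
  alarm_actions       = [aws_autoscaling_policy.scale_in.arn]
}
```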
Cost Estimation
Lambda
● Running the Lambda in the ap-southeast-2 region with:
○ x86 architecture
○ 1 request per minute
○ 2000 ms duration
○ 128 MB memory allocated
○ 512 MB ephemeral storage (default)
● With the free tier: $0.00 a month
● Without the free tier: $0.19 USD a month (43,800 invocations)
Runner (EC2)
● t3.medium spot instance(s) for 5 hours over the month, at the average price of $0.0158 per hour, is $0.079 a month
● t3.medium on-demand instance(s) for 5 hours over the month, at the average price of $0.0528 per hour, is $0.264 a month
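The figures above can be reproduced with the following arithmetic. The unit prices used for Lambda ($0.20 per million requests and $0.0000166667 per GB-second on x86) and the 730-hour month are assumptions that are not stated on the slide; check current regional pricing before relying on them.

```latex
\begin{align*}
\text{invocations} &= 60 \times 730 = 43{,}800 \\
\text{compute} &= 43{,}800 \times 2\,\text{s} \times 0.125\,\text{GB} = 10{,}950\ \text{GB-s} \\
\text{compute cost} &= 10{,}950 \times \$0.0000166667 \approx \$0.1825 \\
\text{request cost} &= 43{,}800 \times \frac{\$0.20}{1{,}000{,}000} \approx \$0.0088 \\
\text{Lambda total} &\approx \$0.1825 + \$0.0088 \approx \$0.19 \\
\text{EC2 spot} &= 5 \times \$0.0158 = \$0.079 \\
\text{EC2 on demand} &= 5 \times \$0.0528 = \$0.264
\end{align*}
```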
Cost Estimation vs Docker Machine
● Trade-off: slower to respond due to runner startup
● Likely not ideal for high activity pipelines
● Suits small pipelines that trigger after hours

● Install and register GitLab Runner for autoscaling with Docker Machine
○ ~$10 a month for a pilot instance running 24/7
● Patching and maintenance
● Verification
● Troubleshooting
● Internally we had issues with SSH access
● Overall cost becomes a lot higher
● Nice to just have it work
Terraform tips and tricks
● Diagrams and pictures
● Working examples
● Explain why, not what
Auto generated Terraform docs - https://terraform-docs.io/
● Variable validation
● Ensure we pass in valid data
● Can never be sure what users will pass in
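As an illustration of variable validation, the sketch below checks a URL with a regex and checks several related values together by grouping them in one object. The variable names and rules are hypothetical and not taken from the module.

```hcl
variable "gitlab_url" {
  description = "Base URL of the GitLab instance (hypothetical input)."
  type        = string

  validation {
    condition     = can(regex("^https://", var.gitlab_url))
    error_message = "The gitlab_url value must start with https://."
  }
}

# Grouping related values in one object lets a single rule compare them.
variable "scaling" {
  description = "Runner Auto Scaling group sizes (hypothetical input)."
  type = object({
    min     = number
    max     = number
    desired = number
  })

  validation {
    condition     = var.scaling.min <= var.scaling.desired && var.scaling.desired <= var.scaling.max
    error_message = "Sizes must satisfy min <= desired <= max."
  }
}
```

With rules like these, a bad value is rejected when Terraform evaluates the variables, before any resource is created.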
● Sort attributes alphabetically
○ Reduces cognitive load
● Order resources logically
○ If the same resource, alphabetically
● Multiple tf files
● Split via high level resource type
● Reduces cognitive load
● Reduces visual complexity
● Reduce duplication with locals
● Move complex operations into locals
● Magic strings
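A small sketch of how locals can hold a repeated name and a merged tag map, so that literal strings are written once. All names and values here are assumptions for illustration.

```hcl
variable "name" {
  description = "Short name for this deployment (hypothetical input)."
  type        = string
}

variable "tags" {
  description = "Extra tags to apply to all resources (hypothetical input)."
  type        = map(string)
  default     = {}
}

locals {
  # Defined once instead of repeating the literal in every resource.
  name_prefix = "${var.name}-gitlab-runner"

  # The merge happens here, so resources only reference local.tags.
  tags = merge(var.tags, {
    Name      = local.name_prefix
    ManagedBy = "terraform"
  })
}

resource "aws_cloudwatch_log_group" "lambda" {
  name              = "/aws/lambda/${local.name_prefix}"
  retention_in_days = 14
  tags              = local.tags
}
```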
● Infer data where possible
● Reduces input requirements
● Reduces possible mistakes
○ VPC and subnets not aligned
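One way to infer data rather than ask for it is to look up the VPC from the subnets the caller already supplies, so the two cannot disagree. This is a sketch under assumptions: the variable and resource names are hypothetical, and all subnets are assumed to belong to the same VPC.

```hcl
variable "subnet_ids" {
  description = "Subnets to launch runners into (hypothetical input)."
  type        = list(string)
}

# Look up the first subnet to discover which VPC it lives in.
data "aws_subnet" "first" {
  id = var.subnet_ids[0]
}

resource "aws_security_group" "runner" {
  name_prefix = "gitlab-runner-"

  # Derived from the subnet, so no separate vpc_id input is needed.
  vpc_id = data.aws_subnet.first.vpc_id
}
```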
Demo
Thank you

AWS GitLab runner scaling with Terraform


Editor's Notes

  1. Hi, I’m Anthony Scata and I’m going to talk about some of my experience, lessons, coding tips and tricks from my journey to write a module for deploying GitLab runners in a cost-effective manner. We will see how things go, and I may even show some live demos.
  2. Start by saying Happy Valentine's Day. Hopefully by saying this I can gain some good karma from my wife, who is likely sitting at home, angrily watching TV and wondering where I am. I did ask her to join us but she wasn’t keen.
  3. As a good consultant, I cannot start a presentation without talking about where I work.
  4. Working in a consultancy we often have internal projects, some of which are hosted in AWS. They are not business critical: they may be a small application used by a few people, an internal project, or a solution accelerator that we showcase to clients to demonstrate the latest technologies. The issue is that we don’t make money from these; as a professional services consultancy we have our team members billable to clients. This means most people are very busy working on client projects and can be taken off internal work for higher value work. It also means people working internally are busy just getting things done, which typically puts automation and infrastructure as code on the back burner. It is quite ironic that a company that works so much in the CI/CD space has very little maturity internally but, as mentioned, this isn’t how we make money. As time is tough to come by and client projects can pop up, consistency for internal projects is often an issue and a lot of projects become orphaned with little to no support.
  5. Engineers come onto these projects and implement something simple, rarely with time to make it better or easier, just doing what they know. Over time this leads to a large mess of reinvented solutions, especially infrastructure as code, all bandaged together. Automation is usually not people's expertise and, as most of us know, it is overlooked until the next person comes along to see the dumpster fire of a setup, continuing the cycle.
  6. If people do look at automation it often gets expensive, both in time to set up and then to maintain. Any system left on needs to be patched, verified, validated and monitored. We have found this to be a large sink of money, especially for projects that are rarely touched. We often float the idea of centrally managed runners, but then we have issues with ownership, cost allocation, debugging and general usage; it becomes very painful.
  7. The solution needed to be low cost, easy to build and maintain, and reproducible, so that it could be deployed into any AWS account or region. It also needed to be high quality so it can be reused on another project without falling over every few months. Serverless technologies provide a great approach: there is no running system to patch or upgrade, they are usually low cost (or at least lower), and they provide little attack surface for malicious actors. The idea is that it also works well for small projects and can scale to something larger: you don’t have to spend a lot of money on CI/CD runners, and if the project grows you are not required to set up a whole new process or implementation.
  8. So this is where it led to a Terraform module, automating the process of building GitLab runners.
  9. EC2 costs ~$15 a month plus extra costs
  10. EC2 costs ~$15 a month plus extra costs
  11. Now for some of the more technical tips and tricks that I learnt along the way. These help the next person picking up the code. Again, one issue is people picking up the code. Build and document as if you were the one looking at this for the first time, and think about what would really help.
  12. As with anything, an architecture diagram, or documentation as a whole, is important. Nothing says "this is a well maintained piece of code" like documentation which is factual and thorough. Diagrams can really help paint a picture of what will be deployed. Again, why have consumers extract information by looking at code when they can see it from a high level? Coupling this with examples of working code makes it easier for people to try. You want to lower the barrier to entry for any piece of software, and you may need the examples yourself as they provide a good guide. And lastly, explain in the documentation why decisions were made. At times we focus on what was done but not the motivation or limitation as to why. The how can mostly be seen from the code, and we can reason about it; the why is more abstract and less obvious. "We use this CloudWatch setting because", "this is set to a negative value so that". This helps the next person who thinks, why was this done, let me change it to something else that makes more sense to me, only to find themselves in the same situation and rabbit hole you did. Be kind to your future self and to other engineers.
  13. I want my code to be well documented, and for those interested in it to look at the code only if necessary, key word being necessary. terraform-docs provides the ability to automatically generate resource, input variable, provider and other docs based on the code. This means less looking at the code if you are new to the module, and it provides a better snapshot. Now I can see if this works with the AWS provider version I need for another module, whether it uses a resource type that my organisation does not allow, but more importantly which input variables I need to supply, why and how. With the validation mentioned earlier and the docs, a consumer shouldn't need to view the code to see how a variable will be used, making the module easier to use for less experienced engineers.
  14. As of Terraform 0.13.0 you have the ability to use variable validation, to check that the contents of a variable are, for example, within a certain number range, match a regex or are a valid JSON string. The idea is that sometimes a plan does not catch these incompatibilities due to the provider, and we only find them when running the apply, which is likely too late. Let's do this before the plan to ensure we have a consistent and working environment. One advantage of the variable map and optionals, as mentioned before, is that we can check multiple variable values together, for example that the min is less than the max and the desired is somewhere in between. If the variables are defined separately this cannot be done.
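      A rough sketch of what that cross-value check can look like. The variable name runner_scaling, its attributes and its defaults are made up for illustration and are not taken from the module; optional() with defaults assumes Terraform 1.3 or later.

      ```hcl
      # Hypothetical input: one object variable so the values can be validated together.
      variable "runner_scaling" {
        description = "Sizing for the runner autoscaling group."
        type = object({
          desired = optional(number, 0)
          max     = optional(number, 1)
          min     = optional(number, 0)
        })
        default = {}

        # Each condition may only reference this variable, which is why the
        # values live in a single object rather than three separate variables.
        validation {
          condition     = var.runner_scaling.min <= var.runner_scaling.max
          error_message = "The min value must be less than or equal to the max value."
        }

        validation {
          condition = (
            var.runner_scaling.desired >= var.runner_scaling.min &&
            var.runner_scaling.desired <= var.runner_scaling.max
          )
          error_message = "The desired value must be between min and max."
        }
      }
      ```

      A consumer passing something like min = 3 with max = 1 should then be stopped with the error message at plan time instead of failing part way through an apply.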
  15. This may seem minor but it really helps others who are viewing or changing your code. Some resources may use 10 or 20+ attributes and it may be hard to comprehend what is being used. Sorting the attributes alphabetically makes it easy for others to see where an attribute is placed and then how it is used. It reduces the cognitive load of making changes and decisions such as "where does this go?" or "should I put this here?". This includes the ordering of the resources themselves. Although Terraform does not run in a sequential order, it helps us humans to comprehend change and find resources.
  16. We have all seen code with hundreds or thousands of lines and thought, oh god, not this file again. This adds extra stress and cognitive load to changes. You are much better off splitting the files by higher level resource type, for example autoscaling or CloudWatch, and then adding the locals specific to that set of resources into the same file. Keeping the resources somewhat contained helps to facilitate change. This may sound contradictory to the earlier point about logical ordering, and it does depend on how many resources you are creating, but anything more than 10 resources per file starts to get unwieldy.
  17. The use of locals makes it easier to reuse strings or data without having to hard code them in multiple places. Sometimes people make these variables with defaults, which can be messy as it gives consumers the ability to change them. Utilise locals where possible to reduce duplication of magic strings: once you see a value once, twice, three times, extract it into a local. Locals can also be used to move the complexity of how a value is computed into a separate file. Resource definitions can be large enough, let alone when you add in a join, compact, concat, split, tostring or try. Move this out, make it a local and reference it when needed.
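      As an illustration only, here is a small sketch of both uses of locals. All of the names (environment, tags, name_prefix and so on) are hypothetical and not taken from the module.

      ```hcl
      # Hypothetical inputs the consumer is allowed to change.
      variable "environment" {
        description = "Environment name used in resource names."
        type        = string
      }

      variable "tags" {
        description = "Extra tags to apply to resources."
        type        = map(string)
        default     = {}
      }

      locals {
        # Magic strings defined once; consumers cannot override these.
        log_group_name = "/gitlab-runner/${var.environment}"
        name_prefix    = "gitlab-runner-${var.environment}"

        # The computation lives here rather than inside the resource block.
        tags = merge(var.tags, { Name = local.name_prefix })
      }

      resource "aws_cloudwatch_log_group" "runner" {
        name              = local.log_group_name
        retention_in_days = 14
        tags              = local.tags
      }
      ```

      The resource block stays short and readable, and a change to the naming scheme happens in one place.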
  18. Try to infer values where possible rather than having the consumer pass them in. For example, for the caller account there is no need to pass in an account ID; we are deploying into this account, so just grab the ID with a data source. The same goes for the region. This reduces the duplication of variables and removes the possibility that the consumer changes region and forgets to update the variable.
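      A minimal sketch of inferring both values with the AWS provider's data sources. The local names and the ARN prefix are only an example of where the values might be used.

      ```hcl
      # Look up the account and region the provider is already configured for.
      data "aws_caller_identity" "current" {}

      data "aws_region" "current" {}

      locals {
        account_id = data.aws_caller_identity.current.account_id

        # Newer AWS provider releases may prefer a different attribute than
        # `name`; check the docs for the provider version you pin.
        region = data.aws_region.current.name

        # Example use: build an ARN prefix without any account or region inputs.
        # The "aws" partition is hard coded here for simplicity.
        log_group_arn_prefix = "arn:aws:logs:${local.region}:${local.account_id}:log-group"
      }
      ```

      With this in place the module has no account or region variables for the consumer to get out of sync.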
  19. Thank you for listening to my presentation. I hope that you gain something useful for your Terraform and Infrastructure as Code journey.