TERRAFORM IN ACTION
The road to IAC
WHO AM I
➤ Damien Pacaud
➤ Director of infrastructure @ teads
➤ Dev & Ops
➤ In love with automation
➤ twitter.com/serty2
INFRA AS
CODE ?
Blueprints to your infra
OUR INFRASTRUCTURE
➤ 2 AWS Regions
➤ EU-WEST-1
➤ US-EAST-1
➤ Highly elastic platform
➤ 6M RPM average traffic
➤ Peak around 8.5 M
➤ 77% Europe
➤ 23% US
OUR NEEDS
➤ Operate a 3rd region
➤ Reverse engineer existing regions
➤ Build a staging environment
➤ Better support turnover
➤ Track infra changes and revert them easily
ONE SOLUTION
➤ Infrastructure as code
➤ Templates describing your infra
➤ Documentation is in the code
➤ Easier to create a staging env
➤ Code is versioned via Git
OUR CHOICE
➤ Terraform
➤ Support for many providers
➤ Cloud IaaS : AWS / GCP / Azure
➤ Virtualization : vSphere / vCloud Director
➤ Monitoring : Datadog / Grafana / StatusCake
➤ Alerting : PagerDuty
➤ Open source & well maintained by HashiCorp
➤ Highly declarative and easily readable
TERRAFORM
Hello world and beyond
provider "aws" {
  region  = "eu-west-1"
  profile = "perso"
}

resource "aws_vpc" "vpc_perso" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
  instance_tenancy     = "default"
  enable_classiclink   = false

  tags {
    Creator = "Terraform"
  }
}

resource "aws_subnet" "subnet_public" {
  vpc_id                  = "${aws_vpc.vpc_perso.id}"
  cidr_block              = "10.0.4.0/22"
  availability_zone       = "eu-west-1a"
  map_public_ip_on_launch = true

  tags {
    Creator = "Terraform"
  }
}
HELLO WORLD
PLAN
APPLY
STATE
RESULT
WHAT ABOUT TEAMWORK ?
TEAMWORK :: BACKENDS
➤ Store your state file(s) remotely using a Terraform backend
➤ Many different backends available (azure, gcs, consul, s3, http…)
➤ S3 is a great choice for this use case
➤ Enable encryption
➤ Enable versioning
terraform {
  backend "s3" {
    bucket  = "terraform"
    key     = "myProd.tfstate"
    region  = "eu-west-1"
    profile = "perso"
  }
}
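The two recommendations above (encryption and versioning) could look like the sketch below. It assumes the state bucket is itself created by Terraform in a separate bootstrap configuration, which the slides do not state; the resource name `terraform_state` is hypothetical. The syntax follows the 0.9-era style used in this deck, and newer AWS provider releases move bucket versioning into a separate resource.

```hcl
# --- Bootstrap configuration (hypothetical, applied once) ---
# The bucket that holds the state files.
resource "aws_s3_bucket" "terraform_state" {
  bucket = "terraform"
  acl    = "private"

  # Keep every revision of the state so an earlier one can be restored.
  versioning {
    enabled = true
  }
}

# --- Any configuration that stores its state in that bucket ---
terraform {
  backend "s3" {
    bucket  = "terraform"
    key     = "myProd.tfstate"
    region  = "eu-west-1"
    profile = "perso"
    encrypt = true # ask S3 to encrypt the state object at rest
  }
}
```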
TEAMWORK :: STATE LOCKING
➤ Locking is pretty new
➤ introduced in 0.9.0
➤ Only works with S3, Consul and Local backends
➤ S3 locking involves DynamoDB
➤ Seems pretty straightforward (haven’t tested it)
terraform {
  backend "s3" {
    bucket     = "terraform"
    key        = "myProd.tfstate"
    region     = "eu-west-1"
    profile    = "perso"
    lock_table = "terraform_lock"
  }
}
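The DynamoDB side of S3 locking is not shown on the slide. A minimal sketch of the table, assuming it is managed by Terraform alongside the state bucket: the S3 backend expects a string primary key named `LockID`; the capacity values are placeholders, not figures from the talk. Later Terraform releases renamed the backend's `lock_table` setting to `dynamodb_table`.

```hcl
# Lock table referenced by lock_table = "terraform_lock" in the backend block.
resource "aws_dynamodb_table" "terraform_lock" {
  name           = "terraform_lock"
  read_capacity  = 1 # placeholder value
  write_capacity = 1 # placeholder value

  # The S3 backend requires the primary key to be named exactly "LockID".
  hash_key = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```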
TEAMWORK :: REMOTE APPLY (CI)
➤ Mutual agreement from team
➤ No one should apply from their own machine
➤ Only Jenkins applies
➤ Job concurrency == 1
➤ Needs discipline but works well
➤ Enforces the use of Pull-Requests
MODULES
Because DRY
WHAT ARE MODULES ?
➤ A module
➤ is just a folder containing Terraform templates
➤ defines a reusable component
➤ is composed of multiple resources
➤ can and should be versioned, tagged
➤ By convention
➤ main.tf : contains resource declarations
➤ variables.tf : contains input variable declarations (with default values)
➤ outputs.tf : contains output variable names and values
MODULE DECLARATION :: MAIN.TF
# VPC
resource "aws_vpc" "vpc" {
  cidr_block           = "${var.vpc_cidr}"
  enable_dns_hostnames = true
  enable_dns_support   = true
  instance_tenancy     = "default"
  enable_classiclink   = false
}

# DHCP options
# This is important to populate the search section in /etc/resolv.conf
resource "aws_vpc_dhcp_options" "vpc_dhcp_options" {
  domain_name         = "${var.domain_name}.${var.env} ${var.aws_region}.compute.internal"
  domain_name_servers = ["AmazonProvidedDNS"]
}

# DHCP association
# The options set needs to be associated with the VPC
resource "aws_vpc_dhcp_options_association" "vpc_dhcp_options_association" {
  vpc_id          = "${aws_vpc.vpc.id}"
  dhcp_options_id = "${aws_vpc_dhcp_options.vpc_dhcp_options.id}"
}

# Internet Gateway, required so that instances can reach, and be reached from, the Internet
resource "aws_internet_gateway" "internet_gateway" {
  vpc_id = "${aws_vpc.vpc.id}"
}

# S3 VPC endpoint, required so that instances with private IPs can access S3
resource "aws_vpc_endpoint" "s3_endpoint" {
  vpc_id       = "${aws_vpc.vpc.id}"
  service_name = "com.amazonaws.${var.aws_region}.s3"
}
MODULE DECLARATION :: OUTPUTS.TF
output "vpc_id" {
  value = "${aws_vpc.vpc.id}"
}

output "main_route_id" {
  value = "${aws_vpc.vpc.main_route_table_id}"
}

output "cidr_block" {
  value = "${aws_vpc.vpc.cidr_block}"
}

output "igw_id" {
  value = "${aws_internet_gateway.internet_gateway.id}"
}

output "s3_endpoint_id" {
  value = "${aws_vpc_endpoint.s3_endpoint.id}"
}
MODULE DECLARATION :: VARIABLES.TF
variable "vpc_cidr" {}
variable "env" {}
variable "aws_region" {}
variable "domain_name" {}
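The convention slide mentions default values, but the declarations above have none, so every variable is required. A sketch of what a documented variable with a default could look like; the descriptions and the default value are illustrative assumptions, not taken from the talk.

```hcl
# Required input: no default, so the caller must set it.
variable "vpc_cidr" {
  description = "CIDR block of the VPC, e.g. 10.0.0.0/16"
}

# Optional input: falls back to the default when the caller omits it.
variable "aws_region" {
  description = "AWS region the VPC is created in"
  default     = "eu-west-1" # assumed default, for illustration only
}
```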
USING MODULES :: MAIN.TF
module "vpc_staging" {
  source      = "git::ssh://git@github.com/myorg/mymodule.git//vpc?ref=0.1"
  aws_region  = "eu-west-1"
  env         = "staging"
  vpc_cidr    = "10.100.0.0/16"
  domain_name = "teads"
}

module "vpc_prod" {
  source      = "git::ssh://git@github.com/myorg/mymodule.git//vpc?ref=0.1"
  aws_region  = "eu-west-1"
  env         = "prod"
  vpc_cidr    = "10.0.0.0/16"
  domain_name = "teads"
}
USING MODULES :: OUTPUTS.TF
output "vpc_staging_id" {
  value = "${module.vpc_staging.vpc_id}"
}

output "vpc_prod_id" {
  value = "${module.vpc_prod.vpc_id}"
}

output "vpc_staging_igw_id" {
  value = "${module.vpc_staging.igw_id}"
}

output "vpc_staging_main_route_id" {
  value = "${module.vpc_staging.main_route_id}"
}

output "vpc_staging_cidr_block" {
  value = "${module.vpc_staging.cidr_block}"
}

output "vpc_staging_s3_endpoint_id" {
  value = "${module.vpc_staging.s3_endpoint_id}"
}
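# Note: the module's outputs.tf shown earlier does not declare main_vpn_gateway_id;
# this output only resolves if the module exports it.
output "vpc_staging_main_vpn_gateway_id" {
  value = "${module.vpc_staging.main_vpn_gateway_id}"
}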
USING MODULES
➤ Modules let you reuse the same code in different environments
➤ The same module can be used with different input variables in the staging and production environments
➤ The same module can be sourced multiple times, even in the same file
➤ Modules should be sourced from git tags / branches
➤ This lets you update a module without breaking the ability to apply
➤ Run terraform get -update to fetch the module before planning / applying
OUR PATH
WITH
TERRAFORM
what we’ve learnt
A FEW RULES :: SOURCE CONTROL
➤ Jenkins and Jenkins only will apply
➤ Work on Feature Branch, plan on Feature Branch (through Jenkins)
➤ Pull-Request before merging to master
➤ Only master gets applied
➤ Always Plan before Apply and then Plan again
➤ No silver bullet
➤ Pretty strict rules
➤ Master can be broken
A FEW RULES :: ENVIRONMENTS
➤ No unit tests available
➤ Use a staging environment
➤ Always test your code / module in staging before prod
➤ Even to change the name of a Security Group
A FEW RULES :: ISOLATION
➤ Large state files are impractical
➤ Changing something may lead to risking everything
➤ The smaller the component, the smaller the risk
➤ Each component has its own state
➤ Reference state from one component in another one
data "terraform_remote_state" "vpc" {
  backend = "s3"

  config {
    bucket  = "terraform"
    key     = "vpc.tfstate"
    region  = "us-east-1"
    profile = "perso"
  }
}
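Once the remote state is declared, another component can read its outputs. A sketch under two assumptions: the VPC component exports a root-level output named `vpc_id`, and the security group `app` is a hypothetical consumer. The attribute path is the 0.9-era form used in this deck; Terraform 0.12 and later read the same value through `.outputs.vpc_id`.

```hcl
# Hypothetical resource living in a different component (and state file)
# from the VPC, but attached to that VPC through its exported output.
resource "aws_security_group" "app" {
  name        = "app"
  description = "Example group created in the VPC owned by the vpc component"
  vpc_id      = "${data.terraform_remote_state.vpc.vpc_id}"
}
```

Only values the other component declares as outputs are visible this way, which keeps the coupling between components explicit.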
A FEW RULES :: DIRECTORY STRUCTURE
➤ Define directory level variables
➤ e.g. environment.tf
➤ contains env and profile variables
➤ Directories are duplicated between staging and production
➤ Directories are duplicated between regions
➤ This is the granularity that we need
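A directory-level environment.tf along these lines could look like the sketch below. The variable names `aws_profile` and `aws_region` and all default values are assumptions for illustration; the slides only say the file holds env and profile variables.

```hcl
# environment.tf (hypothetical contents for a staging / eu-west-1 directory)
variable "env" {
  default = "staging"
}

variable "aws_profile" {
  default = "perso"
}

variable "aws_region" {
  default = "eu-west-1"
}

# Every component in this directory picks up the same provider settings.
provider "aws" {
  region  = "${var.aws_region}"
  profile = "${var.aws_profile}"
}
```

Duplicating the directory for production or another region then means changing these defaults rather than editing each component.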
ISSUES
Terraform’s Dark Side
STILL NOT 1.0
➤ Development is very active
➤ New releases will break compatibility
➤ Read changelog before updating
➤ Secret management out-of-the-box is scary
➤ Apply will fail
➤ Even when plan is ok
➤ Example : Wrong CIDR in a subnet attached to a VPC
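The wrong-CIDR example can be reproduced with a sketch like this one (resource names are hypothetical). The plan lists both resources as additions, because at plan time Terraform does not check that the subnet range fits inside the VPC range; the mismatch only surfaces when the AWS API rejects the subnet during apply.

```hcl
resource "aws_vpc" "vpc" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "bad_subnet" {
  vpc_id = "${aws_vpc.vpc.id}"

  # 10.1.0.0/22 lies outside 10.0.0.0/16: the plan passes, the apply fails.
  cidr_block = "10.1.0.0/22"
}
```

Terraform does not roll back on failure, so the VPC stays created and recorded in the state while the subnet is missing, which is one reason to plan again after every apply.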
STILL NOT 1.0
➤ RTFM
➤ and read it carefully
➤ e.g. Security Group name / description
➤ Declarative, Declarative, Declarative
➤ Stay away from those loops and arrays
➤ Depends on providers so YMMV
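On the Security Group bullet: the slides do not spell out the gotcha, but the AWS provider documents both `name` and `description` as forcing a new resource, so editing either one destroys and recreates the group. One possible mitigation, sketched here with hypothetical names, is to use `name_prefix` together with `create_before_destroy` so the replacement exists before the old group is removed.

```hcl
variable "vpc_id" {}

resource "aws_security_group" "web" {
  # Terraform appends a unique suffix, so old and new groups can coexist.
  name_prefix = "web-"

  # Changing this value still replaces the group: treat it as immutable.
  description = "Web tier"
  vpc_id      = "${var.vpc_id}"

  lifecycle {
    create_before_destroy = true
  }
}
```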
ONE MORE
THING
WE’RE HIRING
➤ Many positions open
➤ We have great arguments
➤ Talk to your friends
➤ https://teads.tv/teads-jobs/
QUESTIONS ?
