Test driven infrastructure development
Tomas Doran
bobtfish@bobtfish.net
@bobtfish
•High availability!
•Automated testing of all
infrastructure changes
•Entirely repeatable application
environments
•High confidence in changes
•Continuous integration and
deployment for infrastructure
Today, I’m going to talk about the promised land!
And by ‘repeatable’, I mean I need to be able to spin up an arbitrary set of servers for any
environment I want, whenever I want - so _all_ the configuration of all the instances has to be
dynamic!
So who the hell am I?
Dev
Infrastructure automation nut!
Ex-backend web developer, Ex-security, currently fixing puppet at Yelp!
Dev / Ops
•Developer viewpoint
•Grass IS greener
•Think of your infra as an
agile software project...
•What workflow do I want?
The state of repeatability and testing in infrastructures is generally shocking.
This leads to systems/operations teams being averse to change and conservative, which slows the
business down!
Why isn’t your infrastructure an agile software project?
The state of the art
Going to talk about how I think the generally accepted way of doing some things is
fundamentally broken!
But let’s start with a simple description of the issues I’m worrying about.
CM = state machine
Each change puppet makes (or attempts to make) is a state transition. Each circle represents
the configuration state of the server (files on disc, services running, etc.).
Non deterministic
This is the key observation here - you don’t know which way puppet’s gonna jump :)
In this case - it doesn’t matter, as the two operations are orthogonal.
Convergent!
Convergence is when each run of puppet takes you nearer to 0 changes, but the next run
still makes additional changes.
The classic way to screw this up is to miss a dependency in your code.
Convergent!
Of course, this doesn’t happen - the first step goes BANG, then mysql gets installed,
which creates /etc/mysql.
The second puppet run _then_ sets the config up.
err: /Stage[main]//File[/etc/mysql/my.cnf]/ensure: change from absent to file failed: Could not set 'file on ensure: No such file or directory - /etc/mysql/my.cnf.puppettmp_3706 at /home/tdoran/test.pp:4
Aaand in your puppet logs, you get.
Purple text of rage!
Convergent!
(Shamelessly stolen from https://www.usenix.org/legacy/publications/library/proceedings/lisa02/tech/full_papers/traugott/traugott.pdf)
Aaand your machine is convergent - i.e. it gets towards the desired state in a number of
steps.
Fixable!
•before
•require
•subscribe
•notify
As I noted, this all happens because you missed a dependency. This is the easy case, where puppet
can detect that and tell you! It’s also entirely possible for the failure to be totally silent.
It is, though, totally possible to write your puppet code well enough to need EXACTLY 1 puppet
run to fully provision a server!
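To make the "two runs versus exactly one run" point concrete, here is a toy Ruby simulation. It is not Puppet and makes no claim about how Puppet orders resources internally. The Resource struct, the helper names and the Package[mysql-server] label are all made up for illustration. Each resource simply fails for the current run if the thing it needs is not on the machine yet.

```ruby
# Toy simulation (not Puppet itself): each resource may need something that
# must already exist on the machine before it can be applied.
Resource = Struct.new(:name, :needs, :provides)

# Apply resources in the given order. A resource whose precondition is not
# yet met fails for this run (like the missing /etc/mysql directory).
def run_once(resources, state)
  changes = 0
  resources.each do |r|
    next if state.include?(r.provides)                   # already in desired state
    next unless r.needs.nil? || state.include?(r.needs)  # fails this run
    state << r.provides
    changes += 1
  end
  changes
end

# Keep running until a run makes no changes; count the runs that changed something.
def runs_to_converge(resources)
  state = []
  runs = 0
  runs += 1 while run_once(resources, state) > 0
  runs
end

# Hypothetical resources modelled on the mysql example.
PACKAGE = Resource.new('Package[mysql-server]', nil, :mysql_dir)
CONFIG  = Resource.new('File[/etc/mysql/my.cnf]', :mysql_dir, :my_cnf)

puts runs_to_converge([CONFIG, PACKAGE]) # config attempted before the package
puts runs_to_converge([PACKAGE, CONFIG]) # ordered the way a `require` would force
```

With the config attempted first the simulation needs two runs; with the package first it needs one. Declaring the relationship (before / require / subscribe / notify) is what buys you the second ordering every time.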
What about an entire infrastructure?
The $64,000 question is....
A whole stack
Let’s start simple, but semi-realistic.
Gonna ignore databases.
Gonna ignore monitoring.
Gonna ignore the n[eo]twork.
Exported resources
• Inter machine dependencies
• Unidirectional!
• Known graph - webs, proxies, lbs
• Puppetroll (github.com/youdevise/puppetroll)
Each layer of systems can publish data to the systems which depend on it (i.e. webs register,
proxies find the webs + register themselves, lbs then find the proxy).
Given you know the dependencies, you can get consistent runs by ordering them.
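Working out the run order from a known graph is a topological sort. A minimal sketch using Ruby's standard TSort module follows; the RoleGraph class and the role names are assumptions for illustration, and this is not how Puppetroll is implemented.

```ruby
require 'tsort'

# Hypothetical dependency graph: each role lists the roles whose exported
# data it consumes.
class RoleGraph
  include TSort

  def initialize(deps)
    @deps = deps
  end

  # TSort hook: iterate over every node in the graph.
  def tsort_each_node(&block)
    @deps.each_key(&block)
  end

  # TSort hook: iterate over the nodes that `node` depends on.
  def tsort_each_child(node, &block)
    @deps.fetch(node, []).each(&block)
  end
end

DEPS = {
  'lb'    => ['proxy'],
  'proxy' => ['web'],
  'web'   => []
}.freeze

# tsort returns dependencies before the things that depend on them.
RUN_ORDER = RoleGraph.new(DEPS).tsort
puts RUN_ORDER.join(' -> ')
```

This prints web -> proxy -> lb, i.e. the order puppet has to run in. If two roles depend on each other, tsort raises TSort::Cyclic, which is a reminder that a strict ordering only exists while the graph stays unidirectional.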
Exported resources
(Shameless ripoff of http://xkcd.com/1171/ )
Ordering dependent. Hard to test (in isolation). Slooow (have to run in order)
Co-dependence
And if we really are talking about entire infrastructures...
Then maybe we need some of these.
Co-dependence
:(
You _know_ that if everything is dynamically configured, you’re gonna have to do
multiple puppet runs per server...
Do we _really_ want to keep running puppet till it stops changing things?
The solution - an
external model
• Represent system as a set of ruby classes
• DSL for describing environments
• Dependencies
• Domain knowledge
Use your software model to generate a set of machines for an environment.
Then generate config for puppet to apply to each system to configure it.
Add super secret special sauce (lots and lots of mcollective!)
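As a rough idea of what "a set of ruby classes plus a DSL" can look like, here is a small sketch. Every name in it (Environment, Machine, service, the hostname pattern) is invented for illustration; it is not the stackbuilder API.

```ruby
# Hypothetical model classes; names are illustrative only.
Machine = Struct.new(:hostname, :role, :depends_on)

class Environment
  attr_reader :name, :machines

  def initialize(name)
    @name = name
    @machines = []
  end

  # DSL verb: declare `count` machines of a role, optionally depending on
  # every already-declared machine of another role.
  def service(role, count:, depends_on: nil)
    upstream = @machines.select { |m| m.role == depends_on }.map(&:hostname)
    count.times do |i|
      hostname = format('%s-%s-%03d', @name, role, i + 1)
      @machines << Machine.new(hostname, role, upstream)
    end
  end
end

# Entry point of the DSL: evaluate the block against a new Environment.
def environment(name, &block)
  env = Environment.new(name)
  env.instance_eval(&block)
  env
end

CI = environment('ci') do
  service :jenkinsapp, count: 2
  service :loadbalancer, count: 2, depends_on: :jenkinsapp
end

CI.machines.each { |m| puts "#{m.hostname} <- #{m.depends_on.inspect}" }
```

The point of the design is that the dependencies live in the model, so every machine's full configuration is known before anything is provisioned, instead of being discovered over several puppet runs.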
This is a simplified / minimal example Jenkins environment - just 4 machines (2 web apps, 2
load balancers).
ENC data!
Our external node classifier generates this for each of the 4 machines, which translates to
puppet code run on the server.
Note how every server gets all of its dependencies.
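For readers who have not written an ENC: puppet expects a YAML document per node with top-level keys such as classes and environment. The sketch below builds one from a tiny in-memory model. The model contents, the role:: class names and the upstreams parameter are assumptions made up for this example, not the data the real ENC emits.

```ruby
require 'yaml'

# Hypothetical model output: role and upstream hosts per machine.
MODEL = {
  'ci-lb-001'  => { role: 'loadbalancer', upstreams: %w[ci-app-001 ci-app-002] },
  'ci-app-001' => { role: 'jenkinsapp', upstreams: [] }
}.freeze

# Build the document an ENC would print for one node: `classes` maps a class
# name to its parameters, `environment` names the puppet environment.
def enc_for(hostname, model: MODEL, environment: 'ci')
  node = model.fetch(hostname)
  {
    'classes' => {
      "role::#{node[:role]}" => { 'upstreams' => node[:upstreams] }
    },
    'environment' => environment
  }
end

puts enc_for('ci-lb-001').to_yaml
```

Because the load balancer's upstream hosts are handed to it as class parameters, it does not need to wait for the app servers to export anything.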
There’s a companion data structure sent to the agent which actually provisions the virtual
Call tree looks something like this: Model all the nodes, allocate all their IPs. Make calls to
KVM servers to provision machines. VMs start, boot, run puppet, send cert to puppetmaster,
--waitforcert.
Central provisioning asks ‘do we have a cert’, waits - signs it. Looks up DNS and ENC to
Automate all the things
Suddenly, I have massive power.
I can write a small script to bring up a whole production-like environment, run tests against
it, tear it down. I can do this against the latest puppet changes, and only promote them to
run on production servers when the tests pass!
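The shape of that small script is roughly the following. This is a sketch only: the four steps are passed in as callables because the real ones would talk to the provisioning tools, and the function name, revision id and log format are invented for illustration.

```ruby
# Hypothetical pipeline: provision, test, always tear down, promote on success.
def verify_change(revision, provision:, run_tests:, teardown:, promote:)
  env = provision.call(revision)
  begin
    passed = run_tests.call(env)
  ensure
    teardown.call(env) # always tear down, even if the tests raise
  end
  promote.call(revision) if passed
  passed
end

# Stand-in steps that just record what happened.
LOG = []
RESULT = verify_change(
  'abc123',
  provision: ->(rev) { LOG << "provision #{rev}"; "test-env-#{rev}" },
  run_tests: ->(env) { LOG << "test #{env}"; true },
  teardown:  ->(env) { LOG << "teardown #{env}" },
  promote:   ->(rev) { LOG << "promote #{rev}" }
)
puts LOG
```

The ensure block matters: a test environment that is left running after a failed run costs capacity and makes the next run less repeatable.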
BDD infrastructure
Behavior driven development - given I have a high level model of the systems comprising an
infrastructure, I can then write equally high level tests to assert the behavior of that
infrastructure.
For example...
BDD infrastructure
• Given – the Service has finished being
provisioned
• And – all monitoring related to the service is
passing
• When – we destroy a single member of
the service
• Then – we expect all monitoring at the service
level to be passing
• And – we expect all monitoring at the single
machine level to be failing
Yes, I am suggesting regression testing your load balancer setup...
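The scenario above, written as code against an in-memory stand-in, might look like this. FakeService and its methods are hypothetical; a real test would provision machines and ask the monitoring system rather than flip booleans.

```ruby
# In-memory stand-in for a provisioned service. All names are hypothetical.
class FakeService
  def initialize(members)
    @up = members.to_h { |m| [m, true] }
  end

  def destroy(member)
    @up[member] = false
  end

  # Machine-level check: is this one member healthy?
  def machine_ok?(member)
    @up.fetch(member)
  end

  # True when every machine-level check is passing.
  def all_machines_ok?
    @up.values.all?
  end

  # Service-level check: the service survives while any member is healthy.
  def service_ok?
    @up.values.any?
  end
end

def survives_single_failure?(service, victim)
  # Given the service is provisioned and all monitoring is passing
  raise 'precondition failed' unless service.service_ok? && service.all_machines_ok?
  # When we destroy a single member of the service
  service.destroy(victim)
  # Then service-level monitoring passes and the machine-level check fails
  service.service_ok? && !service.machine_ok?(victim)
end

puts survives_single_failure?(FakeService.new(%w[lb-001 lb-002]), 'lb-001')
```

Run against a two-member service the scenario holds; run against a single-member service it does not, which is exactly the regression you want the test to catch.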
Is this for real?
•Yes!
• We actually built this, the core parts are on GitHub
• Deployed real applications to production at
TIM Group
•High availability!
•Automated testing of all
infrastructure changes
•Entirely repeatable application
environments
•High confidence in changes
•Continuous integration and
deployment for infrastructure
This is my promised land!
Questions?
• https://devblog.timgroup.com/2013/06/14/exported-resources-considered-harmful/
• https://devblog.timgroup.com/2013/06/26/scenario-testing-infrastructures/
• https://github.com/youdevise/provisioning-tools
• https://github.com/youdevise/stackbuilder

Test Driven Infrastructure Development - PuppetConf 2013

  • 7. •High availability! •Automated testing of all infrastructure changes •Entirely repeatable application environments •High confidence in changes •Continuous integration and deployment for infrastructure Today, I’m going to talk about the promised land! And by ‘repeatable’, I mean I need to be able to spin up an arbitrary set of servers for any environment I want, whenever I want - so _all_ the configuration of all the instances has to be dynamic!
  • 8. So who the hell am I?
  • 9. Dev Infrastructure automation nut! Ex-backend web developer, Ex-security, currently fixing puppet at Yelp!
  • 15. Dev / Ops •Developer viewpoint •Grass IS greener •Think of your infra as an agile software project... •What workflow do I want? The state of repeatability and testing in infrastructures is generally shocking. This leads to systems/operations teams being averse to change and conservative, which slows the business down! Why isn’t your infrastructure an agile software project?
  • 16. The state of the art Going to talk about how I think the generally accepted way of doing some things is fundamentally broken! But let’s start with a simple description of the issues I’m worrying about.
  • 17. CM = state machine Each change puppet makes (or attempts to make) is a state transition. Each circle represents the configuration state of the server on disc + services running etc.
  • 18. Non deterministic This is the key observation here - you don’t know which way puppet’s gonna jump :) In this case - it doesn’t matter, as the two operations are orthogonal.
  • 19. Convergent! Convergence is when each run of puppet takes you nearer to 0 changes, but the next run makes additional changes. The classic way to screw this up is to miss a dependency in your code.
  • 20. Convergent! Of course, this doesn’t happen - the first step goes BANG, then mysql gets installed, creates /etc/mysql. The second puppet run _then_ sets the config up.
  • 21. err: /Stage[main]//File[/etc/mysql/my.cnf]/ensure: change from absent to file failed: Could not set 'file on ensure: No such file or directory - /etc/mysql/my.cnf.puppettmp_3706 at /home/tdoran/test.pp:4 Aaand in your puppet logs, you get.
  • 22. Purple text of rage! err: /Stage[main]//File[/etc/mysql/my.cnf]/ensure: change from absent to file failed: Could not set 'file on ensure: No such file or directory - /etc/mysql/my.cnf.puppettmp_3706 at /home/tdoran/test.pp:4 THE PURPLE TEXT OF RAGE
  • 23. Convergent! (Shamelessly stolen from https://www.usenix.org/legacy/publications/library/proceedings/lisa02/tech/full_papers/traugott/traugott.pdf) Aaand your machine is convergent - i.e. it gets towards the desired state in a number of steps.
  • 25. Fixable! •before •require •subscribe •notify As I noted, this all happens because you missed a dependency. This is the easy case, where puppet can detect that and tell you! It’s also entirely possible for it to fail totally silently. It is, though, totally possible to write your puppet code well enough to need EXACTLY 1 puppet run to fully provision a server!
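The difference between "convergent in N runs" and "EXACTLY 1 run" can be shown with a toy simulation. This is a minimal sketch, not how puppet works internally: `Resource`, `run_once` and `runs_to_provision` are made-up names, the `mysql-server` package name is an assumption, and a declared dependency is modelled simply as applying the prerequisite first.

```ruby
require 'set'

# Toy model only: this is not how Puppet is implemented.
# A resource optionally needs one path to exist, and creates some paths.
Resource = Struct.new(:name, :needs, :creates)

# Apply resources in list order against the set of existing paths.
# Returns [changes, failures] for this one run.
def run_once(resources, state)
  changes = 0
  failures = 0
  resources.each do |res|
    next if res.creates.all? { |path| state.include?(path) } # already done
    if res.needs && !state.include?(res.needs)
      failures += 1 # the purple text case: prerequisite not there yet
      next
    end
    res.creates.each { |path| state << path }
    changes += 1
  end
  [changes, failures]
end

# Count runs until a run finishes with no failures.
def runs_to_provision(resources)
  state = Set.new
  run = 0
  loop do
    run += 1
    changes, failures = run_once(resources, state)
    return run if failures.zero?
    raise 'stuck: a prerequisite is never created' if changes.zero?
  end
end

# Hypothetical resources mirroring the my.cnf example.
package = Resource.new('Package[mysql-server]', nil, ['/etc/mysql'])
config  = Resource.new('File[/etc/mysql/my.cnf]', '/etc/mysql', ['/etc/mysql/my.cnf'])

puts runs_to_provision([config, package]) # config attempted before the package
puts runs_to_provision([package, config]) # order a declared dependency would force
```

With the config file attempted first, the toy needs two runs; with the package first, it needs one. In real puppet code you get the second ordering by declaring the relationship with one of the four metaparameters on the slide rather than by reordering the manifest.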
  • 27. A whole stack Let’s start simple, but semi realistic. Gonna ignore databases. Gonna ignore monitoring. Gonna ignore the n[eo]twork.
  • 32. Exported resources • Inter machine dependencies • Unidirectional! • Known graph - webs, proxies, lbs • Puppetroll (github.com/youdevise/puppetroll) Each layer of systems can publish data to the systems which depend on it (i.e. webs register, proxies find the webs + register themselves, lbs then find the proxies). Given you know the dependencies - you can get consistent runs by ordering them.
  • 33. Exported resources (Shameless ripoff of http://xkcd.com/1171/) Ordering dependent. Hard to test (in isolation). Slooow (have to run in order).
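Ordering runs from a known, unidirectional graph is just a topological sort. A minimal sketch using Ruby's standard `tsort` library follows; the `RoleGraph` class and the role names are illustrative assumptions, and this is not a description of how Puppetroll itself does it.

```ruby
require 'tsort'

# Hypothetical dependency graph: each role lists the roles whose
# exported data it consumes.
class RoleGraph
  include TSort

  def initialize(deps)
    @deps = deps
  end

  # TSort hook: yield every node in the graph.
  def tsort_each_node(&block)
    @deps.each_key(&block)
  end

  # TSort hook: yield the nodes that must be handled before `node`.
  def tsort_each_child(node, &block)
    @deps.fetch(node, []).each(&block)
  end
end

deps = {
  'lb'    => ['proxy'],
  'proxy' => ['web'],
  'web'   => []
}

# Dependencies come out first, so this is a safe order to run puppet in.
run_order = RoleGraph.new(deps).tsort
puts run_order.inspect
```

The webs come out first, then the proxies, then the lbs. If two roles depend on each other, `tsort` raises `TSort::Cyclic`, which is exactly the situation where a single ordered pass stops being enough.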
  • 34. Co-dependence And if we really are talking about entire infrastructures... Then maybe we need some of these.
  • 35. Co-dependence :( You _know_ that if everything is dynamically configured that you’re gonna have to do multiple puppet runs per server... Do we _really_ want to keep running puppet till it stops changing things?
  • 40. The solution - an external model • Represent system as a set of ruby classes • DSL for describing environments • Dependencies • Domain knowledge Use your software model to generate a set of machines for an environment, and generate config for puppet to apply to each system to configure it. Add super secret special sauce (lots and lots of mcollective!)
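To make "a set of ruby classes" plus "a DSL for describing environments" concrete, here is a minimal sketch. Every name in it (`Environment`, `Service`, `instances`, `depends_on`, the host naming scheme) is invented for illustration and is not the API of the code the talk refers to.

```ruby
# Hypothetical model classes for illustration only.
class Service
  attr_reader :name, :count, :dependencies

  def initialize(name, &block)
    @name = name
    @count = 1
    @dependencies = []
    instance_eval(&block) if block
  end

  # DSL verb: how many machines this service runs on.
  def instances(number)
    @count = number
  end

  # DSL verb: record a dependency on another service by name.
  def depends_on(service_name)
    @dependencies << service_name
  end
end

class Environment
  attr_reader :name, :services

  def initialize(name, &block)
    @name = name
    @services = {}
    instance_eval(&block) if block
  end

  # DSL verb: declare a service in this environment.
  def service(name, &block)
    @services[name] = Service.new(name, &block)
  end

  # Machine names for one service, e.g. "ci-web-001".
  def machines_for(service_name)
    svc = @services.fetch(service_name)
    (1..svc.count).map { |i| format('%s-%s-%03d', @name, service_name, i) }
  end

  # Every machine in the environment, in declaration order.
  def machines
    @services.keys.flat_map { |svc_name| machines_for(svc_name) }
  end

  # The machines a service needs to know about, resolved from the model.
  def upstream_machines(service_name)
    @services.fetch(service_name).dependencies.flat_map { |dep| machines_for(dep) }
  end
end

env = Environment.new('ci') do
  service('web') { instances 2 }
  service('lb') do
    instances 2
    depends_on 'web'
  end
end

puts env.machines.inspect
puts env.upstream_machines('lb').inspect
```

The point of the design is that dependencies are resolved in the model, before any machine exists, so nothing has to wait for another machine's puppet run to export data.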
  • 41. This is a simplified / minimal example jenkins environment - just 4 machines (2 web apps, 2 load balancers)
  • 42. ENC data! Our external node classifier generates this for each of the 4 machines, which translates to puppet code run on the server. Note how every server gets all of its dependencies. There’s a companion data structure sent to the agent which actually provisions the virtual
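An external node classifier hands puppet a YAML document per node, with `classes` and `parameters` as top-level keys. The sketch below emits that shape for a load balancer; the `role::lb` class name and the `upstream_hosts` and `role` parameters are assumptions for illustration, not the data shown on the slide.

```ruby
require 'yaml'

# Build ENC-style data for one node. Class and parameter names are
# hypothetical; only the top-level 'classes'/'parameters' keys are
# part of the ENC output format.
def enc_for(role, upstream_hosts)
  {
    'classes' => {
      "role::#{role}" => { 'upstream_hosts' => upstream_hosts }
    },
    'parameters' => { 'role' => role }
  }
end

webs = %w[ci-web-001 ci-web-002]
puts enc_for('lb', webs).to_yaml
```

Because the dependency list is passed in as class parameters, the node gets everything it needs in its first catalog.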
  • 43. Call tree looks something like this: Model all the nodes, allocate all their IPs. Make calls to KVM servers to provision machines. VMs start, boot, run puppet, send cert to puppetmaster, --waitforcert. Central provisioning asks ‘do we have a cert’, waits - signs it. Looks up DNS and ENC to
  • 44. Automate all the things Suddenly, I have massive power. I can write a small script to bring up a whole production like environment, run tests against it, tear it down. I can do this against the latest puppet changes, and only promote them to run on production servers when the tests pass!
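The "small script" boils down to: bring the environment up, test it, always tear it down, and only promote on green. A minimal sketch, assuming the four steps are supplied as callables; `gate_change` and its keyword names are hypothetical stand-ins for real provisioning tooling.

```ruby
# Hypothetical gate: all four callables stand in for real tooling.
def gate_change(provision:, run_tests:, teardown:, promote:)
  env = provision.call
  begin
    passed = run_tests.call(env)
    promote.call if passed # only promote when the tests pass
    passed
  ensure
    teardown.call(env) # tear down even if the tests blow up
  end
end

steps = []
result = gate_change(
  provision: -> { steps << :provision; 'ci-env' },
  run_tests: ->(_env) { steps << :test; true },
  teardown:  ->(_env) { steps << :teardown },
  promote:   -> { steps << :promote }
)
puts result
puts steps.inspect
```

Putting the teardown in `ensure` means a crashing test run does not leave a stray environment behind.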
  • 45. BDD infrastructure Behavior driven development - given I have a high level model of the systems comprising an infrastructure, I can then write equally high level tests to assert the behavior of that infrastructure
  • 55. BDD infrastructure • Given – the Service has finished being provisioned • And – all monitoring related to the service is passing • When – we destroy a single member of the service • Then – we expect all monitoring at the service level to be passing • And – we expect all monitoring at the single machine level to be failing Yes, I am suggesting regression testing your load balancer setup...
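The Given/When/Then scenario above can be sketched against an in-memory stand-in. `ToyService` is a made-up fake, not real machines or real monitoring, and the rule that service-level monitoring passes while at least one member is up is an assumption of this sketch.

```ruby
# Toy in-memory service: a stand-in for real machines and monitoring.
class ToyService
  attr_reader :members

  def initialize(members)
    @members = members
    @alive = members.each_with_object({}) { |member, up| up[member] = true }
  end

  def destroy(member)
    @alive[member] = false
  end

  # Machine-level check: is this one member up?
  def machine_check(member)
    @alive.fetch(member)
  end

  # Service-level check: assumed to pass while any member is up.
  def service_check
    @alive.values.any?
  end
end

def survives_single_member_loss?(service, victim)
  # Given the service has finished being provisioned
  # And all monitoring related to the service is passing
  unless service.service_check && service.members.all? { |m| service.machine_check(m) }
    raise 'precondition failed: monitoring is not green'
  end

  # When we destroy a single member of the service
  service.destroy(victim)

  # Then service-level monitoring is still passing
  # And machine-level monitoring for the destroyed member is failing
  service.service_check && !service.machine_check(victim)
end

web = ToyService.new(%w[web-001 web-002])
puts survives_single_member_loss?(web, 'web-001')
```

A service with only one member fails the same scenario, which is the regression this kind of test is meant to catch.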
  • 59. Is this for real? •Yes! • We actually built this, the core parts are on github • Deployed real applications to production at TIM Group
  • 60. •High availability! •Automated testing of all infrastructure changes •Entirely repeatable application environments •High confidence in changes •Continuous integration and deployment for infrastructure This is my promised land!