Puppet Camp Sydney Feb 2014 - A Build Engineering Team’s Journey of Infrastructure as Code

A Build Engineering Team’s Journey of Infrastructure as Code - the challenges that we’ve faced and the practices that we implemented as we went along our journey. 

  • Thanks for the comments Vincent & Christopher! You can find an updated version of the talk here: http://www.slideshare.net/PeterLeschev/puppet-camp-melbourne-nov-2014-a-build-engineering-teams-journey-of-infrastructure-as-code
  • Great presentation Peter.
  • Really enjoyed this presentation Peter. Thanks for sharing your experiences at Atlassian. I'll be taking this to the team I work in to show what others are doing with Puppet.

Transcript

  • 1. Monday, 10 February 14
  • 2. Peter Leschev (@peterleschev): Husband, Father of 3 & Atlassian Build Engineering Team Lead
  • 3. A Build Engineering Team’s Journey of Infrastructure as Code
  • 4. Build Engineering today @ Atlassian • Build platform & services used internally within the company • 60k builds per month • 35k automated tests for JIRA
  • 5. Build Engineering today @ Atlassian • 600 build agents (own hardware + EC2 instances) • Agents include SCM clients, JDKs, JVM build tools, databases, headless browser testing, Python builds, NodeJS, installers & more • Maintain 20 AMIs of various build configurations • 6 Bamboo Servers • maven.atlassian.com / 6 Nexus instances • Monitoring - opsview / graphite / statsd
  • 6. Infrastructure as Code = Puppet + SCM?
  • 7. 3 years ago... • Manually maintained snowflakes • Started using Puppet
  • 8. Production rollout (diagram: puppetmaster, build agents)
  • 9. Production rollout failure (diagram: puppetmaster, build agents)
  • 10. Confidence of Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change: Dev, Rollout, Soak in Prod)
  • 12. https://bitbucket.org/ http://atlassian.com/git
  • 13. Style in Pull Requests
  • 14. Puppet Lint (https://github.com/rodjek/puppet-lint, Tim Sharpe @rodjek) • Automated style checking • Set up an automated build that runs checks & posts results • Still need to implement a ratchet build
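The ratchet build mentioned on slide 14 is not described further in the talk. One common reading is a build that fails when the number of lint findings rises above a recorded baseline, and lowers the baseline whenever the count drops. The sketch below follows that reading; the function names are hypothetical, and it assumes puppet-lint's output has one finding per line.

```python
from typing import Tuple


def count_findings(lint_output: str) -> int:
    """Count findings, assuming one finding per non-empty output line."""
    return sum(1 for line in lint_output.splitlines() if line.strip())


def ratchet(baseline: int, current: int) -> Tuple[bool, int]:
    """Return (passed, new_baseline).

    The build fails if the current count exceeds the baseline.
    Otherwise it passes and the baseline tightens to the lower count,
    so the number of findings can only go down over time.
    """
    if current > baseline:
        return False, baseline
    return True, min(baseline, current)
```

A CI job would store the returned baseline between runs (for example in a file kept with the build), so that each improvement becomes the new limit.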
  • 15. Confidence of Change: initial + Code review (chart: NONE to HIGH over the lifecycle of an infra change: Dev, Code review, Rollout, Soak in Prod)
  • 16. Using Staging for Development • Coding on Puppet Master • Culture of manually modifying production - Configuration Drift • Impact on Builds (diagram: puppetmaster, staging puppet environment, build agents)
  • 17. Vagrant (http://www.vagrantup.com/, Mitchell Hashimoto @mitchellh) • Easily spin up infrastructure locally on your laptop • Disposable / reproducible environments • Machine provisioning via VirtualBox / VMware / AWS • Configuration applied via Shell Scripts / Puppet / Chef • Develop and test infrastructure changes locally
  • 18. Vagrant (http://www.vagrantup.com/, Mitchell Hashimoto @mitchellh) (diagram: Vagrantfile, vagrant basebox)
  • 19. Vagrant (http://www.vagrantup.com/, Mitchell Hashimoto @mitchellh) • Spins up a local VM to a known state • Make some Puppet changes and then run: [command not captured in transcript] to apply your changes • SSH into your VM using: [command not captured in transcript] to check your changes • Destroy the VM when done
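The commands shown on slide 19 did not survive in this transcript. As a sketch only, the cycle it describes maps naturally onto Vagrant's standard subcommands (`up`, `provision`, `ssh`, `destroy`); that mapping is an assumption, not a quote from the slide. The helper names below are hypothetical.

```python
import subprocess
from typing import Callable, List


def vagrant_cycle(check_command: str) -> List[List[str]]:
    """Build the command sequence for one local test cycle.

    Assumption: the steps on the slide correspond to these standard
    Vagrant subcommands.
    """
    return [
        ["vagrant", "up"],                        # spin up a VM in a known state
        ["vagrant", "provision"],                 # re-apply the Puppet changes
        ["vagrant", "ssh", "-c", check_command],  # check the result inside the VM
        ["vagrant", "destroy", "-f"],             # throw the VM away when done
    ]


def run(steps: List[List[str]],
        runner: Callable[[List[str]], object] = subprocess.check_call) -> None:
    """Run each step in order; the default runner stops at the first failure."""
    for argv in steps:
        runner(argv)
```

Passing a different `runner` (for example `print`) gives a dry run that shows the commands without starting a VM.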
  • 20. Confidence of Change: initial + Code review + Vagrant (chart: NONE to HIGH over the lifecycle of an infra change: Dev, Code review, Rollout, Soak in Prod)
  • 21. Vagrant != Production • Vagrant basebox differences with production machines • Originally using publicly available vagrant baseboxes • Installed packages were the biggest difference • Generating a basebox manually was a painful process
  • 22. Veewee: Automated Vagrant basebox generation (https://github.com/jedi4ever/veewee, Patrick Debois @patrickdebois) (diagram labels: Ubuntu installation ISO, Veewee, definitions.rb, preseed.cfg, postinstall.sh, vagrant basebox)
  • 23. Veewee: Automated Vagrant basebox generation (https://github.com/jedi4ever/veewee, Patrick Debois @patrickdebois)
  • 24. Basebox generation via CI • Latest basebox generated in CI & published to fileshare • No need to generate baseboxes locally
  • 25. There are still differences! • VirtualBox Guest additions • Reduced to a minimum
  • 26. Common Preseed / Postinstall (diagram labels: preseed.cfg, postinstall.sh, vagrant basebox, PXEBoot, custom ISOs)
  • 27. Packer (http://packer.io, Mitchell Hashimoto @mitchellh)
  • 28. Confidence in Change: initial + Code review + Vagrant + Veewee (chart: NONE to HIGH over the lifecycle of an infra change: Dev, Code review, Rollout, Soak in Prod)
  • 29. Developing locally • Rolling out to staging • Rolling out to production • Broken build agents!
  • 30. Cucumber • Behaviour Driven Development
  • 31. Cucumber & Vagrant (diagram labels: Vagrant, VirtualBox VM, custom provisioner via ssh, puppet apply, cucumber *.features)
  • 32. Disadvantages • Requires cucumber dependencies to be installed on the tested VM • Tests run within the VM, which makes testing firewall rules harder
  • 33. Confidence in Change: initial + Code review + Vagrant + Veewee + Cukes (chart: NONE to HIGH over the lifecycle of an infra change: Dev, Code review, Rollout, Soak in Prod)
  • 34. “But it works on my machine!” – Every Developer
  • 35. Continuous Integration • ‘From scratch’ provisioning • Confidence that you can rebuild in disaster
  • 36. “The Pets: you give nice names, you stroke them, and when they get ill, you nurse them back to health, taking a long time over it. The Cattle: you give them numbers. When they get ill, you shoot them.” – Tim Bell, CERN
  • 37. Confidence in Change: initial + Code review + Vagrant + Veewee + Cukes + CI (chart: NONE to HIGH over the lifecycle of an infra change: Dev, Code review, CI & Rollout, Soak in Prod)
  • 38. Provisioning from scratch is slow
  • 39. Spread out CI • Moved from sequential to parallel provisioning (diagram: provision VM1, VM2, VM3, VM4 one after another, then the same four provisions side by side)
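Slide 39 shows the move from provisioning VMs one after another to provisioning them side by side. The talk does this by spreading jobs across CI agents; the sketch below only illustrates the same idea inside one Python process with a thread pool. `provision` is a hypothetical callable that provisions one VM and returns whether it succeeded.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def provision_all(vms: Iterable[str],
                  provision: Callable[[str], bool],
                  max_workers: int = 4) -> Dict[str, bool]:
    """Provision every VM concurrently and report success per VM.

    With independent VMs, the wall-clock time approaches that of the
    slowest single provision instead of the sum of all of them.
    """
    names = list(vms)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(provision, names))  # results keep input order
    return dict(zip(names, results))
```

If `provision` raises, the exception surfaces from `pool.map`, so a broken VM still fails the whole run.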
  • 40. There are only so many MacPros you can steal
  • 41. The ones I have my eye on....
  • 42. Profiling Puppet Runs • Add “--evaltrace” to puppet apply • Collect and show the longest occurrences of: “Evaluated in ([\d.]+) seconds”
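The collection step on slide 42 can be sketched as a small log filter. This is a minimal illustration, not the team's actual script: it applies the slide's pattern to captured `puppet apply --evaltrace` output and keeps the slowest entries. The function name is hypothetical.

```python
import re
from typing import List, Tuple

# The slide's pattern, plus a group for the text before it so that each
# timing can be attributed to a resource.
EVAL_RE = re.compile(r"^(?P<what>.*): Evaluated in (?P<secs>[\d.]+) seconds")


def slowest(log: str, top: int = 10) -> List[Tuple[float, str]]:
    """Return the `top` slowest (seconds, resource) pairs, slowest first."""
    timings = []
    for line in log.splitlines():
        match = EVAL_RE.search(line)
        if match:
            timings.append((float(match.group("secs")), match.group("what")))
    return sorted(timings, reverse=True)[:top]
```

Feeding it the saved output of a provisioning run gives a ranked list of the resources worth optimising first.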
  • 43. Profiling Cucumber runs http://itshouldbeuseful.wordpress.com/2010/11/10/find-your-slowest-running-cucumber-features/
  • 44. Delta Provisioning • Provision locally & for CI • Faster & different class of problems found • Matches production state (diagram: the ‘from scratch’ provision provisions VM1 and on success exports VM1 to the fileshare; the delta provision imports the VM1 box and provisions VM1)
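The two flows in the slide 44 diagram can be sketched as follows. This is an interpretation of the diagram, not the team's implementation: the fileshare is modelled as a plain dictionary, and `provision` is a hypothetical callable that takes a VM name and a starting box (or `None` for a clean basebox) and returns the resulting box, or `None` on failure.

```python
from typing import Callable, Dict, Optional

Provision = Callable[[str, Optional[str]], Optional[str]]


def from_scratch(vm: str, provision: Provision, share: Dict[str, str]) -> bool:
    """Provision from a clean basebox and export the box only on success."""
    box = provision(vm, None)
    if box is None:
        return False
    share[vm] = box  # "on success export VM1" to the fileshare
    return True


def delta(vm: str, provision: Provision, share: Dict[str, str]) -> bool:
    """Import the last good box and apply only the new changes on top."""
    previous = share.get(vm)
    if previous is None:
        raise LookupError("no exported box for %s; run from_scratch first" % vm)
    return provision(vm, previous) is not None
```

Because the delta run starts from an already provisioned machine, it is faster and resembles how a change lands on existing production hosts, which is why it surfaces a different class of problems than a clean build.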
  • 45. Confidence in Change: initial + Code review + Vagrant + Veewee + Cukes + CI + Delta CI (chart: NONE to HIGH over the lifecycle of an infra change: Dev, Code review, CI & Rollout, Soak in Prod)
  • 46. Infrequent Releases
  • 47. Painful Puppet Rollouts • Puppet runs impacted running builds • Disabling all the build agents • Performing the rollout: git clone / librarian-puppet / symlink update on puppetmaster, then manually kicking off puppet on all the build agents • Enabling all the build agents • Set of Puppet environments for every Bamboo server
  • 48. Graceful Service restarts • Bamboo Agent JVM process watches for a touch file & shuts down when idle (written as a Bamboo Plugin)
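The mechanism on slide 48 was written as a Bamboo plugin running inside the agent JVM. The sketch below is not that plugin; it only illustrates the touch-file idea in Python. `is_idle` and `shut_down` are hypothetical callables standing in for the agent's own state and shutdown hook.

```python
import os
import time
from typing import Callable, Optional


def watch(touch_file: str,
          is_idle: Callable[[], bool],
          shut_down: Callable[[], None],
          poll_seconds: float = 5.0,
          max_polls: Optional[int] = None) -> bool:
    """Shut down once a restart has been requested and no build is running.

    A restart is requested by creating `touch_file`. A busy agent keeps
    working and is checked again on the next poll, so running builds are
    never interrupted. Returns True if shutdown was triggered.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if os.path.exists(touch_file) and is_idle():
            shut_down()
            return True
        polls += 1
        time.sleep(poll_seconds)
    return False
```

With this in place, a Puppet run only has to touch the file; each agent restarts itself at the next idle moment.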
  • 49. Puppet Environments • BEFORE - Multiple puppet envs for each Bamboo Server: jbac_staging, jbac_production, cbac_staging, cbac_production, etc. • AFTER - Changed to use ‘staging’ & ‘production’ only
  • 50. Updates on Puppetmaster • BEFORE: Manually on puppetmaster: git clone the puppet tree, run librarian-puppet to pull external modules, update staging / production symlink • AFTER: Bamboo build which performs the above steps automatically
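The three steps on slide 50 could be automated along these lines. This is a sketch under assumptions: the repository URL and directory layout are hypothetical parameters, and the talk does not say how the symlink was switched. The swap shown here is the usual POSIX approach of creating a new link and renaming it over the old one.

```python
import os
from typing import List


def checkout_steps(repo_url: str, release_dir: str) -> List[List[str]]:
    """Commands for the first two steps; the second runs inside release_dir."""
    return [
        ["git", "clone", repo_url, release_dir],
        ["librarian-puppet", "install"],  # pulls the external modules
    ]


def switch_symlink(link_path: str, target: str) -> None:
    """Point link_path (e.g. the 'staging' environment) at a new release.

    The new link is created under a temporary name and then renamed over
    the old one, so the environment never points at a missing directory.
    """
    tmp = link_path + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)
    os.replace(tmp, link_path)
```

Keeping each release in its own directory also makes a rollback a single call to `switch_symlink` with the previous release.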
  • 51. Less Human interaction + More automation = Higher Confidence
  • 52. Less Human Effort = Increased frequency of releases
  • 53. Confidence in Change: initial + Code review + Vagrant + Veewee + Cukes + CI + Delta CI + Frequent releases (chart: NONE to HIGH over the lifecycle of an infra change: Dev, Code review, CI & Rollout, Soak in Prod)
  • 54. “I’m scared!” – Peter Leschev, 3 years ago • “Should I be scared?” – Peter Leschev, 3 months ago
  • 55. HipChat integration
  • 56. Confidence in Change: initial + Code review + Vagrant + Veewee + Cukes + CI + Delta CI + Frequent releases + Notification (chart: NONE to HIGH over the lifecycle of an infra change: Dev, Code review, CI & Rollout, Soak in Prod)
  • 57. Confidence in Change: before vs. after (chart: NONE to HIGH over the lifecycle of an infra change: Dev, Code review, CI & Rollout, Soak in Prod)
  • 58. Confidence in Change, or: Finding & fixing problems sooner rather than later
  • 59. Commit Graph
  • 60. Snowflakes • Pets • Cattle • Stateless Machines
  • 61. We’re still on the Journey • Come join us! atlassian.com/jobs
  • 62. Questions?
  • 63. Thank you!