Puppet Camp Melbourne Nov 2014 - A Build Engineering Team’s Journey of Infrastructure as Code

A Build Engineering Team’s Journey of Infrastructure as Code - the challenges that we’ve faced and the practices that we implemented as we went along our journey. 

Published in: Technology

  1. 1. Peter Leschev @peterleschev Husband, Father of 3 & Atlassian Build Engineering
  2. 2. A Build Engineering Team’s Journey of Infrastructure as Code Nov-2014
  3. 3. Build Engineering today @ Atlassian • Build platform & services used internally within the company • 90k builds per month • 43k automated tests just for JIRA • Developers expect a reliable infrastructure & fast CI feedback
  4. 4. Build Engineering today @ Atlassian • 1000 build agents (own hardware + EC2 instances) • include SCM clients, JDKs, JVM build tools, databases, headless browser testing, python builds, NodeJS, installers & more • Maintain 20 AMIs of various build configurations • 8 Bamboo Servers • maven.atlassian.com / 6 Nexus instances • Monitoring - opsview / graphite / statsd
  5. 5. Build Engineering today @ Atlassian
  6. 6. Infrastructure as Code = Puppet + SCM ?
  7. 7. 4 years ago... • Manually maintained snowflakes • Started using puppet
  8. 8. Production rollout puppetmaster build agents
  9. 9. Production rollout failure puppetmaster build agents
  10. 10. Confidence of Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change; stages: Dev, Rollout, Soak in Prod)
  11. 11. atlassian.com/git
  12. 12. Style in Pull Requests
  13. 13. Puppet Lint https://github.com/rodjek/puppet-lint Tim Sharpe @rodjek • Automated style checking • Setup automated build that runs checks & posts results • Setup ratchet build to detect regressions
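
The "ratchet build" idea on the slide above can be sketched as follows. This is a minimal illustration, not the team's actual build: the function names are hypothetical, and it assumes puppet-lint's default output where each problem line starts with `WARNING:` or `ERROR:`.

```python
from typing import Tuple


def count_problems(lint_output: str) -> int:
    """Count problem lines in captured puppet-lint output.

    Assumption: problems are reported one per line, prefixed with
    "WARNING:" or "ERROR:" (puppet-lint's default log format).
    """
    return sum(
        1
        for line in lint_output.splitlines()
        if line.startswith(("WARNING:", "ERROR:"))
    )


def ratchet(baseline: int, current: int) -> Tuple[bool, int]:
    """Return (passed, new_baseline).

    The build fails when the problem count grows past the baseline.
    When the count shrinks, the baseline is tightened so the
    improvement cannot regress later.
    """
    if current > baseline:
        return False, baseline
    return True, current
```

A CI job would store the returned baseline between runs (for example in a file committed to the repository) and fail the build whenever `passed` is false.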
  14. 14. Confidence of Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change; stages: Dev, Code review, Rollout, Soak in Prod; series: initial, + Code review)
  15. 15. Using Staging for Development • Coding on Puppet Master • Culture of manually modifying production - Configuration Drift • Impact on Builds (diagram: puppetmaster, build agents, staging puppet environment)
  16. 16. Vagrant http://www.vagrantup.com/ Mitchell Hashimoto @mitchellh • Easily spin up infrastructure locally on your laptop • Reproducible / disposable environments • Machine provisioning via VirtualBox / VMware / AWS • Configuration applied via Shell Scripts / Puppet / Chef • Develop and test infrastructure changes locally
  17. 17. Vagrant Vagrantfile vagrant basebox http://www.vagrantup.com/ Mitchell Hashimoto @mitchellh
  18. 18. Vagrant • Spins up a local VM to a known state • Make some puppet changes and then re-provision to apply your changes • SSH into your VM to check your changes • Destroy the VM when done http://www.vagrantup.com/ Mitchell Hashimoto @mitchellh
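
The commands shown on this slide did not survive in the transcript. As a sketch only, assuming the standard Vagrant CLI subcommands (`vagrant up`, `vagrant provision`, `vagrant ssh`, `vagrant destroy`) are the ones meant, the loop could be scripted like this. The function names are hypothetical.

```python
import subprocess
from typing import List


def vagrant_workflow() -> List[List[str]]:
    """Commands for one local develop-and-test cycle (assumed, see lead-in)."""
    return [
        ["vagrant", "up"],             # spin up a local VM to a known state
        ["vagrant", "provision"],      # re-apply Puppet after editing manifests
        ["vagrant", "ssh"],            # log in and check the changes
        ["vagrant", "destroy", "-f"],  # throw the VM away when done
    ]


def run_workflow(steps: List[List[str]], dry_run: bool = True) -> List[str]:
    """Run each step in order; with dry_run=True only report what would run."""
    executed = []
    for argv in steps:
        executed.append(" ".join(argv))
        if not dry_run:
            # check=True stops the cycle at the first failing command.
            subprocess.run(argv, check=True)
    return executed
```

Calling `run_workflow(vagrant_workflow())` prints nothing and touches nothing; pass `dry_run=False` from a directory containing a Vagrantfile to actually run the cycle.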
  19. 19. Confidence of Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change; stages: Dev, Code review, Rollout, Soak in Prod; series: initial, + Code review, + Vagrant)
  20. 20. Vagrant != Production • Vagrant basebox differences with production machines • Originally using publicly available vagrant baseboxes • Installed packages biggest differences • Generating a basebox manually was a painful process
  21. 21. Packer http://packer.io Mitchell Hashimoto @mitchellh (diagram: packer template (JSON) → Vagrant box for VirtualBox, Vagrant box for AWS)
  22. 22. Basebox generation via CI • Latest basebox generated in CI & published to fileshare • No need to generate baseboxes locally
  23. 23. Confidence in Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change; stages: Dev, Code review, Rollout, Soak in Prod; series: initial, + Code review, + Vagrant, + Packer)
  24. 24. Developing locally → Rolling out to staging → Rolling out to production → Broken build agents!
  25. 25. Cucumber https://github.com/cucumber/aruba • Behaviour Driven Development
  26. 26. Cucumber & Vagrant Vagrant Custom Provisioner Virtual Box VM puppet apply cucumber *.features via ssh
  27. 27. Disadvantages • Requires cucumber dependencies to be installed on tested VM • Tests run within the VM making testing firewall rules harder
  28. 28. Confidence in Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change; stages: Dev, Code review, Rollout, Soak in Prod; series: initial, + Code review, + Vagrant, + Packer, + Cukes)
  29. 29. “But it works on my machine!” – Every Developer
  30. 30. Continuous Integration • ‘From scratch’ provisioning • Confidence that you can rebuild in disaster
  31. 31. “The Pets: you give nice names, you stroke them, and when they get ill, you nurse them back to health, taking a long time over it. The Cattle: you give them numbers. When they get ill, you shoot them.” – Tim Bell, CERN
  32. 32. Confidence in Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change; stages: Dev, Code review, CI & Rollout, Soak in Prod; series: initial, + Code review, + Vagrant, + Packer, + Cukes, + CI)
  33. 33. Provisioning from scratch is slow
  34. 34. Spread out CI • Moved from sequential to parallel provisioning (diagram: provision VM #1, VM #2, VM #3, VM #4 one after another, versus all four at the same time)
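
The move from sequential to parallel provisioning can be illustrated with a small sketch. It assumes each VM can be provisioned independently and that a caller-supplied `provision` function (hypothetical here) returns True on success; the talk does not say how the CI jobs were actually split.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def provision_all(
    vm_names: List[str],
    provision: Callable[[str], bool],
    max_workers: int = 4,
) -> Dict[str, bool]:
    """Provision every VM concurrently and report success per VM.

    Threads are sufficient because the real work happens in external
    processes (hypervisor, puppet), not in Python itself.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with vm_names.
        results = list(pool.map(provision, vm_names))
    return dict(zip(vm_names, results))
```

With four workers, total wall-clock time approaches that of the slowest single VM rather than the sum of all four.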
  35. 35. There are only so many MacPros you can steal
  36. 36. The ones I had my eye on....
  37. 37. Profiling Puppet Runs • Add “--evaltrace” to puppet apply • Collect and show the longest occurrences of: “Evaluated in ([\d.]+) seconds”
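
A minimal sketch of the "collect and show the longest occurrences" step, assuming the output of `puppet apply --evaltrace` has been captured to a string and that each timed resource appears on a line of the form `<resource>: Evaluated in <n> seconds`. The function name and the sample resource names in any usage are hypothetical.

```python
import re
from typing import List, Tuple

# Same numeric pattern as on the slide; the optional "Info: " prefix is an
# assumption about how the log level is printed.
EVAL_RE = re.compile(
    r"^(?:Info: )?(?P<resource>.+): Evaluated in (?P<secs>[\d.]+) seconds"
)


def slowest_resources(log_text: str, top: int = 10) -> List[Tuple[float, str]]:
    """Return the `top` slowest resources as (seconds, resource) pairs."""
    timings = []
    for line in log_text.splitlines():
        match = EVAL_RE.search(line)
        if match:
            timings.append((float(match.group("secs")), match.group("resource")))
    # Sort by duration, longest first.
    timings.sort(reverse=True)
    return timings[:top]
```

Feeding the result into a graph or a build report makes it easy to spot the handful of resources that dominate a from-scratch provision.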
  38. 38. Profiling Cucumber runs http://itshouldbeuseful.wordpress.com/2010/11/10/find-your-slowest-running-cucumber-features/
  39. 39. Delta Provisioning • Provision locally & for CI • Faster & different class of problems found • Matches production state (diagram: ‘from scratch’ provision: provision VM → on success, export VM → fileshare; delta provision: import VM box → provision VM)
  40. 40. Confidence in Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change; stages: Dev, Code review, CI & Rollout, Soak in Prod; series: initial, + Code review, + Vagrant, + Packer, + Cukes, + CI, + Delta CI)
  41. 41. Broken builds master
  42. 42. Branch builds BUILDENG-5669 master BUILDENG-5670
  43. 43. Confidence in Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change; stages: Dev, Branch CI, Code review, CI & Rollout, Soak in Prod; series: initial, + Code review, + Vagrant, + Packer, + Cukes, + CI, + Delta CI, + Branch CI)
  44. 44. Slow builds
  45. 45. Vagrant-AWS https://github.com/mitchellh/vagrant-aws
  46. 46. Vagrant-AWS https://github.com/mitchellh/vagrant-aws • MacPros no longer required • They were limited in supply & old • 2x speed improvement • Only limited by our credit card limit
  47. 47. Catalog Diff Step 1: Generate a hash of a node’s catalog • puppet master --logdest console --compile HOSTNAME > HOSTNAME.json • Sort elements • Remove timestamps • Generate shasum, e.g. f50db91e6461f5bdcb56769a8f77da1fac26943d
  48. 48. Catalog Diff Step 2: Compare the hash of master versus your branch to avoid unnecessary provisioning • Example 1: master f50db91e6461f5bdcb56769a8f77da1fac26943d = branch f50db91e6461f5bdcb56769a8f77da1fac26943d → hash is the same, no build required • Example 2: master f50db91e6461f5bdcb56769a8f77da1fac26943d != branch 18033e4d21b78bab6deb3ae1ff3c147ade5a37ca → hash is different, build required
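
Steps 1 and 2 can be sketched as below. This is an illustration under stated assumptions, not the team's script: it assumes the compiled catalog is available as pure JSON text, that timestamps live under keys named `timestamp` or `expiration` (hypothetical key names), and it sorts every list, which is a simplification that would also hide a change in the order of an array-valued parameter.

```python
import hashlib
import json

# Assumed names of volatile keys; adjust to whatever the real catalog contains.
TIMESTAMP_KEYS = {"timestamp", "expiration"}


def normalize(node):
    """Recursively drop timestamp keys and sort elements for a stable form."""
    if isinstance(node, dict):
        return {
            key: normalize(value)
            for key, value in sorted(node.items())
            if key not in TIMESTAMP_KEYS
        }
    if isinstance(node, list):
        # Sort by canonical JSON text so mixed element types stay comparable.
        return sorted(
            (normalize(item) for item in node),
            key=lambda item: json.dumps(item, sort_keys=True),
        )
    return node


def catalog_hash(catalog_json: str) -> str:
    """SHA-1 of the normalized catalog (40 hex characters, as on the slides)."""
    canonical = json.dumps(normalize(json.loads(catalog_json)), sort_keys=True)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def build_required(master_catalog: str, branch_catalog: str) -> bool:
    """True when the branch compiles to a different catalog than master."""
    return catalog_hash(master_catalog) != catalog_hash(branch_catalog)
```

A branch build would compile the catalog for each node on both master and the branch, and skip provisioning for every node where `build_required` returns False.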
  49. 49. Catalog Diff Step 3: Profit! Reduction in feedback time + $$$ saved Images: http://pixabay.com/p-30984/ https://www.flickr.com/photos/williamnyk/3598113750/
  50. 50. Infrequent Releases
  51. 51. Painful Puppet Rollouts • Puppet runs impacted running builds • Disabling all the build agents • Performing the roll out • git clone / librarian-puppet / symlink update on puppetmaster • Manually kick off puppet on all the build agents • Enabling all the build agents • Set of Puppet environments for every bamboo server
  52. 52. Graceful Service restarts • Bamboo Agent JVM process watches for a touch file & shuts down when idle (written as a Bamboo Plugin)
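
The touch-file mechanism is described as a Bamboo plugin, so the real implementation runs inside the agent JVM. The sketch below only illustrates the control flow in Python: the function name, the polling interval and the `is_idle` callback are all hypothetical.

```python
import os
import time
from typing import Callable


def wait_until_safe_to_stop(
    touch_file: str,
    is_idle: Callable[[], bool],
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until a shutdown was requested AND no build is running.

    A shutdown is requested by creating `touch_file`; the agent keeps
    working until `is_idle()` reports that the current build has finished.
    """
    while not (os.path.exists(touch_file) and is_idle()):
        sleep(poll_seconds)
```

The caller would stop the agent process once this function returns, which is what lets a Puppet run restart the service without killing an in-flight build.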
  53. 53. Puppet Environments • BEFORE - Multiple puppet envs for each Bamboo Server • jbac_staging • jbac_production • cbac_staging • cbac_production • etc • AFTER - Changed to use ‘staging’ & ‘production’ only
  54. 54. Updates on Puppetmaster • BEFORE: Manually on puppetmaster • git clone the puppet tree • run librarian-puppet to pull external modules • Update staging / production symlink • AFTER: Bamboo build which performs the above steps automatically
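
The three manual steps listed above map naturally onto a scripted build. The following is a sketch under assumptions: the function names, repository URL and directory layout are hypothetical, `librarian-puppet install` is assumed to be the command used to pull external modules, and the symlink swap relies on POSIX rename semantics.

```python
import os
from typing import List


def update_steps(repo_url: str, checkout_dir: str) -> List[List[str]]:
    """Commands for the first two steps; run the second inside checkout_dir."""
    return [
        ["git", "clone", repo_url, checkout_dir],  # git clone the puppet tree
        ["librarian-puppet", "install"],           # pull external modules
    ]


def switch_environment(link_path: str, new_target: str) -> None:
    """Repoint the staging/production symlink at a fresh checkout.

    A temporary link is created and then renamed over the old one, so
    agents never observe a missing environment directory.
    """
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_target, tmp_link)
    os.replace(tmp_link, link_path)  # rename replaces the old symlink itself
```

Keeping each release in its own directory and only moving the symlink also makes rollback a single call to `switch_environment` with the previous target.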
  55. 55. Bot automation - ‘open prs’
  56. 56. Less Human interaction + More automation = Higher Confidence
  57. 57. Less Human Effort = Increased frequency of releases
  58. 58. Confidence in Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change; stages: Dev, Branch CI, Code review, CI & Rollout, Soak in Prod; series: initial, + Code review, + Vagrant, + Packer, + Cukes, + CI, + Delta CI, + Branch CI, + Frequent Releases)
  59. 59. “I’m scared!” – Peter Leschev, 3.5 years ago • “Should I be scared?” – Peter Leschev, 3 months ago
  60. 60. Hipchat integration
  61. 61. Confidence in Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change; stages: Dev, Branch CI, Code review, CI & Rollout, Soak in Prod; series: initial, + Code review, + Vagrant, + Packer, + Cukes, + CI, + Delta CI, + Branch CI, + Frequent Releases, + Notification)
  62. 62. Confidence in Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change; stages: Dev, Branch CI, Code review, CI & Rollout, Soak in Prod; series: before, after)
  63. 63. Confidence in Change or Finding & fixing problems sooner rather than later
  64. 64. Snowflakes Pets Cattle Stateless Machines
  65. 65. We’re still on the Journey Come join us! atlassian.com/jobs
  66. 66. one more thing…
  67. 67. Puppet Module for Sonatype Nexus • https://forge.puppetlabs.com/atlassian/nexus_rest • Configure Nexus using Custom Puppet Provider Types rather than XML files
  68. 68. Thank you!
  69. 69. Questions?
