Puppet Camp Sydney Feb 2014 - A Build Engineering Team’s Journey of Infrastructure as Code

A Build Engineering Team’s Journey of Infrastructure as Code: the challenges we faced and the practices we implemented along the way.



  1. 1. Monday, 10 February 14
  2. 2. Peter Leschev (@peterleschev): Husband, Father of 3 & Atlassian Build Engineering Team Lead
  3. 3. A Build Engineering Team’s Journey of Infrastructure as Code
  4. 4. Build Engineering today @ Atlassian • Build platform & services used internally within the company • 60k builds per month • 35k automated tests for JIRA
  5. 5. Build Engineering today @ Atlassian • 600 build agents (own hardware + EC2 instances) • Agents include SCM clients, JDKs, JVM build tools, databases, headless browser testing, Python builds, NodeJS, installers & more • Maintain 20 AMIs of various build configurations • 6 Bamboo servers • maven.atlassian.com / 6 Nexus instances • Monitoring: opsview / graphite / statsd
  6. 6. Infrastructure as Code = Puppet + SCM?
  7. 7. 3 years ago... • Manually maintained snowflakes • Started using Puppet
  8. 8. Production rollout (diagram: puppetmaster, build agents)
  9. 9. Production rollout failure (diagram: puppetmaster, build agents)
  10. 10. Confidence of Change (chart: confidence from NONE to HIGH over the lifecycle of an infra change: Dev, Rollout, Soak in Prod)
  12. 12. https://bitbucket.org/ http://atlassian.com/git
  13. 13. Style in Pull Requests
  14. 14. Puppet Lint https://github.com/rodjek/puppet-lint Tim Sharpe @rodjek • Automated style checking • Set up an automated build that runs the checks & posts results • Still need to implement a ratchet build
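
The slide lists the ratchet build as not yet implemented, so the following is only a sketch of what one could look like. It assumes the lint build can report a total warning count: the build fails when the count rises above a stored baseline, and the baseline is lowered whenever the count drops. The function name and its parameters are hypothetical and are not part of puppet-lint.

```python
from typing import Tuple


def ratchet(current_warnings: int, baseline: int) -> Tuple[bool, int]:
    """Return (build_passes, new_baseline) for a lint ratchet.

    Hypothetical helper: the build fails if warnings increased, and the
    baseline only ever moves down, so style debt cannot grow back.
    """
    if current_warnings > baseline:
        # More warnings than allowed: fail and keep the old baseline.
        return False, baseline
    # Same or fewer warnings: pass and tighten the baseline.
    return True, current_warnings
```

For example, with a baseline of 10, a run with 12 warnings fails and leaves the baseline at 10, while a run with 8 warnings passes and lowers the baseline to 8.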
  15. 15. Confidence of Change: initial + Code review (chart: confidence from NONE to HIGH over the lifecycle of an infra change: Dev, Code review, Rollout, Soak in Prod)
  16. 16. Using Staging for Development • Coding on the puppetmaster • Culture of manually modifying production: configuration drift • Impact on builds (diagram: puppetmaster, staging puppet environment, build agents)
  17. 17. Vagrant http://www.vagrantup.com/ Mitchell Hashimoto @mitchellh • Easily spin up infrastructure locally on your laptop • Disposable / reproducible environments • Machine provisioning via VirtualBox / VMware / AWS • Configuration applied via shell scripts / Puppet / Chef • Develop and test infrastructure changes locally
  18. 18. Vagrant http://www.vagrantup.com/ Mitchell Hashimoto @mitchellh (diagram: Vagrantfile, vagrant basebox)
  19. 19. Vagrant http://www.vagrantup.com/ Mitchell Hashimoto @mitchellh • Spins up a local VM to a known state • Make some Puppet changes, then apply them to the VM • SSH into the VM to check your changes • Destroy the VM when done
  20. 20. Confidence of Change: initial + Code review + Vagrant (chart: confidence from NONE to HIGH over the lifecycle of an infra change: Dev, Code review, Rollout, Soak in Prod)
  21. 21. Vagrant != Production • Vagrant baseboxes differed from production machines • Originally used publicly available Vagrant baseboxes • Installed packages were the biggest difference • Generating a basebox manually was a painful process
  22. 22. Veewee • Automated Vagrant basebox generation https://github.com/jedi4ever/veewee Patrick Debois @patrickdebois (diagram: Ubuntu installation ISO + Veewee with definitions.rb, preseed.cfg and postinstall.sh, producing a vagrant basebox)
  23. 23. Veewee • Automated Vagrant basebox generation https://github.com/jedi4ever/veewee Patrick Debois @patrickdebois
  24. 24. Basebox generation via CI • Latest basebox generated in CI & published to fileshare • No need to generate baseboxes locally
  25. 25. There are still differences! • VirtualBox Guest Additions • Reduced to a minimum
  26. 26. Common Preseed / Postinstall (diagram: preseed.cfg + postinstall.sh; vagrant basebox, PXEBoot, custom ISOs)
  27. 27. Packer http://packer.io Mitchell Hashimoto @mitchellh
  28. 28. Confidence in Change: initial + Code review + Vagrant + Veewee (chart: confidence from NONE to HIGH over the lifecycle of an infra change: Dev, Code review, Rollout, Soak in Prod)
  29. 29. Developing locally • Rolling out to staging • Rolling out to production • Broken build agents!
  30. 30. Cucumber • Behaviour Driven Development
  31. 31. Cucumber & Vagrant (diagram: Vagrant, VirtualBox VM, custom provisioner via ssh, puppet apply, cucumber *.features)
  32. 32. Disadvantages • Requires Cucumber dependencies to be installed on the tested VM • Tests run within the VM, making firewall rules harder to test
  33. 33. Confidence in Change: initial + Code review + Vagrant + Veewee + Cukes (chart: confidence from NONE to HIGH over the lifecycle of an infra change: Dev, Code review, Rollout, Soak in Prod)
  34. 34. “But it works on my machine!” – Every Developer
  35. 35. Continuous Integration • ‘From scratch’ provisioning • Confidence that you can rebuild after a disaster
  36. 36. “The Pets: you give nice names, you stroke them, and when they get ill, you nurse them back to health, taking a long time over it. The Cattle: you give them numbers. When they get ill, you shoot them.” – Tim Bell, CERN
  37. 37. Confidence in Change: initial + Code review + Vagrant + Veewee + Cukes + CI (chart: confidence from NONE to HIGH over the lifecycle of an infra change: Dev, Code review, CI & Rollout, Soak in Prod)
  38. 38. Provisioning from scratch is slow
  39. 39. Spread out CI • Moved from sequential to parallel provisioning (diagram: provision VM1, VM2, VM3 and VM4 one after another, then the same four provisions running side by side)
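
The move from sequential to parallel provisioning on slide 39 can be illustrated with a short Python sketch. It is not the team's actual CI setup (the deck spreads the work across CI rather than showing code); `provision_all` and the `provision` callable are hypothetical names, and a real `provision` would presumably shell out to whatever brings up and configures one VM.

```python
from concurrent.futures import ThreadPoolExecutor


def provision_all(vm_names, provision, max_workers=4):
    """Run provision(name) for every VM concurrently.

    Returns a dict mapping each VM name to the result of its provision
    call. Hypothetical helper: `provision` is supplied by the caller.
    """
    names = list(vm_names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(provision, names))
    return dict(zip(names, results))
```

With four VMs and four workers, the total wall-clock time is roughly that of the slowest single provision rather than the sum of all four, at the cost of needing enough hardware to run them at once.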
  40. 40. There are only so many Mac Pros you can steal
  41. 41. The ones I have my eye on...
  42. 42. Profiling Puppet Runs • Add “--evaltrace” to puppet apply • Collect and show the longest occurrences of: “Evaluated in ([\d.]+) seconds”
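
The profiling step on slide 42 could be scripted roughly as follows. This is a minimal Python sketch, not the tool the team used: it applies the slide's pattern to captured `puppet apply --evaltrace` output and returns the slowest entries. The helper name `slowest_resources` is hypothetical, and treating everything before ": Evaluated in" as the resource label is an assumption about the log line layout.

```python
import re

# Pattern from the slide, with a capture for the text preceding it.
EVAL_RE = re.compile(r"^(?P<resource>.*): Evaluated in (?P<seconds>[\d.]+) seconds")


def slowest_resources(log_lines, top=10):
    """Return up to `top` (seconds, resource) pairs, slowest first."""
    timings = []
    for line in log_lines:
        match = EVAL_RE.search(line)
        if match:
            timings.append((float(match.group("seconds")), match.group("resource")))
    # Sort by evaluation time, longest first.
    timings.sort(reverse=True)
    return timings[:top]
```

Feeding it the saved output of a Puppet run, for example `slowest_resources(open("puppet.log"))` with a hypothetical log file name, gives a quick list of the resources worth optimising first.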
  43. 43. Profiling Cucumber runs http://itshouldbeuseful.wordpress.com/2010/11/10/find-your-slowest-running-cucumber-features/
  44. 44. Delta Provisioning • Provision locally & for CI • Faster & different class of problems found • Matches production state (diagram: ‘from scratch’ provision: provision VM1, on success export VM1 box to fileshare; delta provision: import VM1 box from fileshare, provision VM1)
  45. 45. Confidence in Change: initial + Code review + Vagrant + Veewee + Cukes + CI + Delta CI (chart: confidence from NONE to HIGH over the lifecycle of an infra change: Dev, Code review, CI & Rollout, Soak in Prod)
  46. 46. Infrequent Releases
  47. 47. Painful Puppet Rollouts • Puppet runs impacted running builds • Disabling all the build agents • Performing the rollout: git clone / librarian-puppet / symlink update on puppetmaster, then manually kicking off Puppet on all the build agents • Enabling all the build agents • Set of Puppet environments for every Bamboo server
  48. 48. Graceful Service restarts • Bamboo Agent JVM process watches for a touch file & shuts down when idle (written as a Bamboo plugin)
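
The touch-file mechanism on slide 48 was written as a Bamboo plugin, which the deck does not show. The Python sketch below only illustrates the decision logic under that description: shut down when a restart has been requested via the touch file and the agent is not running a build. All names, the polling approach and the interval are assumptions made for illustration.

```python
import os
import time


def should_shut_down(touch_file, agent_is_idle):
    """True only when a restart was requested AND no build is running."""
    return agent_is_idle and os.path.exists(touch_file)


def wait_for_graceful_shutdown(touch_file, is_idle, poll_seconds=5.0, sleep=time.sleep):
    """Block until it is safe to stop the agent.

    `is_idle` is a caller-supplied function reporting whether the agent
    is currently free; `sleep` is injectable so the loop can be tested.
    """
    while not should_shut_down(touch_file, is_idle()):
        sleep(poll_seconds)
```

The point of the design is that a Puppet run only has to touch a file; the agent picks its own moment to stop, so running builds are never interrupted.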
  49. 49. Puppet Environments • BEFORE: multiple Puppet environments for each Bamboo server (jbac_staging, jbac_production, cbac_staging, cbac_production, etc.) • AFTER: changed to use ‘staging’ & ‘production’ only
  50. 50. Updates on Puppetmaster • BEFORE: done manually on the puppetmaster: git clone the puppet tree, run librarian-puppet to pull external modules, update the staging / production symlink • AFTER: a Bamboo build performs the above steps automatically
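
Of the three steps on slide 50, the symlink update is the one that decides when agents start seeing the new tree. The deck does not say how the Bamboo build performs it; the Python sketch below shows one common approach, assuming a POSIX filesystem: create the new link under a temporary name and rename it over the old one, so the environment path never points at a half-updated tree. The function name and paths are hypothetical.

```python
import os


def switch_symlink(link_path, target_dir):
    """Point link_path at target_dir by renaming a fresh symlink into place."""
    tmp_link = link_path + ".tmp"
    # Clear any leftover temporary link from an interrupted earlier run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target_dir, tmp_link)
    # rename() replaces the old link itself; it does not follow it.
    os.replace(tmp_link, link_path)
```

A release job might call this after the clone and the librarian-puppet run have both succeeded, for example `switch_symlink("/path/to/environments/staging", new_release_dir)` with paths of your own choosing.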
  51. 51. Less Human interaction + More automation = Higher Confidence
  52. 52. Less Human Effort = Increased frequency of releases
  53. 53. Confidence in Change: initial + Code review + Vagrant + Veewee + Cukes + CI + Delta CI + Frequent releases (chart: confidence from NONE to HIGH over the lifecycle of an infra change: Dev, Code review, CI & Rollout, Soak in Prod)
  54. 54. “I’m scared!” – Peter Leschev, 3 years ago • “Should I be scared?” – Peter Leschev, 3 months ago
  55. 55. HipChat integration
  56. 56. Confidence in Change: initial + Code review + Vagrant + Veewee + Cukes + CI + Delta CI + Frequent releases + Notification (chart: confidence from NONE to HIGH over the lifecycle of an infra change: Dev, Code review, CI & Rollout, Soak in Prod)
  57. 57. Confidence in Change: before vs after (chart: confidence from NONE to HIGH over the lifecycle of an infra change: Dev, Code review, CI & Rollout, Soak in Prod)
  58. 58. Confidence in Change, or: finding & fixing problems sooner rather than later
  59. 59. Commit Graph
  60. 60. Snowflakes • Pets • Cattle • Stateless Machines
  61. 61. We’re still on the Journey • Come join us! atlassian.com/jobs
  62. 62. Questions?
  63. 63. Thank you!