Puppet Camp Boston 2014: Orchestrating Infrastructure Change Using Puppet Rake, mcollective, LM and Jenkins (Intermediate)

Orchestrating Infrastructure Change Using Puppet Rake, mcollective, LM and Jenkins presented by Anton Gurov and Chaminda Delpagodage, Paydiant at Puppet Camp Boston 2014

Transcript

  • 1. Application Deployment Orchestration with Puppet and Jenkins Anton Gurov, Chaminda Delpagodage August 20, 2014
  • 2. About Us
    Chaminda Delpagodage: Paydiant Technical Operations Team (Release Engineering, Systems Administration, Automation), linkedin.com/in/chamindad
    Anton Gurov: Paydiant Technical Operations Team (Infrastructure, Systems Administration, Security), linkedin.com/in/antongurov
  • 3. Cloud-based mobile wallet solution
    Open ecosystem for mobile payments, offers and loyalty
    Completely white-label
    “Bank grade” platform of shared services
      ↘ SaaS
      ↘ Secure SDKs for iPhone and Android
    Top tier investors and well capitalized
  • 4. Paydiant Puppet Use
    Puppet Enterprise (PE) users since day one
    100% PE coverage of Paydiant platform
      ↘ PE handles everything after instance bootstrap
    Multiple environments actively managed by PE
      ↘ 4 Puppet Masters in multiple datacenters and security zones
      ↘ 8 Environments
    Licensed node count doubling every year
    [Chart: nodes under management (hosts, axis 0 to 900) for 2011, 2012, 2013 and 2014 (estimated by year-end)]
  • 5. Paydiant Puppet Use
    '11-12: Bi-annual production platform releases
      ↘ Waterfall: major platform change
      ↘ Big outage: 1-2 days on the weekend
    '13-14: Transition to daily/weekly non-production and monthly production releases
      ↘ Agile: smaller platform changes
      ↘ Zero-downtime deployment
      ↘ 100% production release success rate since inception
    Heavy usage of Puppet Dashboard, Puppet APIs and Jenkins
  • 6. Puppet Dashboard as data repository
    Why Dashboard?
      ↘ Visual, flexible, powerful (if used right)
      ↘ Allows for business data edits by teams unfamiliar with Puppet
      ↘ Hiera not available at the time
    Decided early on to keep Puppet code and data separate
    Came up with our own Dashboard pattern: “Classes, Parameters and Supergroups”
    [Diagram labels: Puppet Module Code; Puppet Dashboard Business Data; Puppet Module Parameters]
  • 7. Puppet Dashboard as data repository
    Classes, Parameters and Supergroups pattern overview
    [Diagram. Groups: supergroup_type_A, class_A, class_B, class_C, parameters_X, parameters_Y, …, parameters_Z. Nodes: node 1, node 2, node 3, node 4, …, node X]
  • 8. Puppet Dashboard as data repository
    Classes, Parameters and Supergroups pattern overview
    [Diagram. Groups: supergroup_type_B, class_A, class_B, class_C, parameters_X, parameters_Y, …, parameters_Z. Nodes: node 1, node 2, node 3, node 4, …, node X]
  • 9. Puppet Dashboard as data repository
    Class building block
    Group name prefixed with class_
    Contains Puppet class and some default variables/parameters for the class
    [Diagram. Classes: class A, class B, class C, … Groups: class_A (def: default params; incl: class A), class_B (def: default params; incl: class B), class_C (def: default params; incl: class C)]
  • 10. Puppet Dashboard as data repository
    Class building block: example
  • 11. Puppet Dashboard as data repository
    Parameters building block
    Group name prefixed with parameters_
    Only contains data and data overrides
    Arbitrary hierarchy levels
    Allows for inheritance and reuse
    [Diagram. parameters_X (def: default params); parameters_X_1 and parameters_X_2 (each incl: parameters_X; def: params overrides; def: additional params); supergroup_A, supergroup_B, supergroup_C]
  • 12. Puppet Dashboard as data repository
    Parameters building block: inheritance example
  • 13. Puppet Dashboard as data repository
    Supergroup building block == server “role”
    Group name prefixed with supergroup_
    Contains all the “ingredients” for the node to configure and define itself
    Node can belong to only one supergroup (many-to-one)
    [Diagram. Groups: class_A, class_B, parameters_X, parameters_Z; supergroup_type_A (incl: class_A, class_B, parameters_X, parameters_Z; def: params overrides (if any); def: additional params (if any)). Nodes: node 1, node 2]
  • 14. Puppet Dashboard as data repository
    Supergroup building block: example (2-3 pages condensed)
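The inheritance idea behind the parameters_ and supergroup_ groups can be illustrated with a small shell sketch. It is not how Puppet Dashboard is implemented: it assumes each group includes at most one other group, and the group contents, the values and the function names (`group_params`, `group_parent`, `resolve_param`) are all hypothetical. The lookup walks from the supergroup towards its ancestors and returns the nearest definition, which is how a child group's override would win over a parent's default.

```bash
# Sketch only: hypothetical groups and values, single-parent chain.

# Prints the "key=value" words defined directly on a group.
group_params() {
  case "$1" in
    parameters_X)      echo "paydiant_app_version=SET_ME paydiant_app_operation_mode=LIVE" ;;
    parameters_X_1)    echo "paydiant_app_version=2" ;;
    supergroup_type_A) echo "" ;;
  esac
}

# Prints the one group that a group includes (a simplification).
group_parent() {
  case "$1" in
    supergroup_type_A) echo "parameters_X_1" ;;
    parameters_X_1)    echo "parameters_X" ;;
  esac
}

# Walks up the chain and reports the nearest definition of a key.
resolve_param() {
  local group=$1 key=$2 pair
  while [ -n "$group" ]; do
    for pair in $(group_params "$group"); do
      if [ "${pair%%=*}" = "$key" ]; then
        echo "${pair#*=} (defined in $group)"
        return 0
      fi
    done
    group=$(group_parent "$group")
  done
  echo "$key is not defined" >&2
  return 1
}
```

Calling `resolve_param supergroup_type_A paydiant_app_version` reports the override from parameters_X_1, while `paydiant_app_operation_mode` falls through to the default in parameters_X.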
  • 15. Classes, Parameters and Supergroups pattern: Pros
    All parameters and classes are visible on the Supergroup page
      ↘ See missing parameters (for example, a “SET ME!” placeholder inherited from a parent)
      ↘ See parameter clashes (Dashboard will warn if a parameter is defined in 2 places)
      ↘ See exactly where a parameter is defined
    Allows teams unfamiliar with Puppet to make changes via Dashboard
    Arbitrary data hierarchy/inheritance
    Data reuse
  • 16. Classes, Parameters and Supergroups pattern: Cons
    Version control is difficult
      ↘ Have to resort to group cloning/export/import (custom Rake copy/clone command from Puppet support)
      ↘ A fix is on the Puppet roadmap
    Dashboard UI could use some help
      ↘ Too much data on the screen sometimes
      ↘ Lack of sorting/grouping
    Can’t store complex multi-line variables like text blobs
  • 17. Zero-Downtime Deployment architecture …
  • 18. High-level platform representation
    [Diagram. Frontend Load Balancer: FE-A v.1, FE-B v.1. Backend Load Balancer: BE-A v.1, BE-B v.1. Database: v.1]
    parameters_deployment-staging-FE-BankA: paydiant_deployment_bank=STAGING-FRONTEND-A, paydiant_app_operation_mode=LIVE, paydiant_app_version=1
    parameters_deployment-staging-BE-BankA: paydiant_deployment_bank=STAGING-BACKEND-A, paydiant_app_operation_mode=LIVE, paydiant_app_version=1
    parameters_deployment-staging-FE-BankB: paydiant_deployment_bank=STAGING-FRONTEND-B, paydiant_app_operation_mode=LIVE, paydiant_app_version=1
    parameters_deployment-staging-BE-BankB: paydiant_deployment_bank=STAGING-BACKEND-B, paydiant_app_operation_mode=LIVE, paydiant_app_version=1
  • 19. Disable B (FE+BE)
    [Diagram. Frontend Load Balancer: FE-A v.1, FE-B v.1. Backend Load Balancer: BE-A v.1, BE-B v.1. Database: v.1]
    parameters_deployment-staging-FE-BankB: paydiant_deployment_bank=STAGING-FRONTEND-B, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=1
    parameters_deployment-staging-BE-BankB: paydiant_deployment_bank=STAGING-BACKEND-B, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=1
  • 20. Run first phase of database changes (i.e. add new structures and migrate data)
    [Diagram. Frontend Load Balancer: FE-A v.1, FE-B v.1. Backend Load Balancer: BE-A v.1, BE-B v.1. Database: v.2a (DB changes Phase 1)]
  • 21. Upgrade B (FE+BE)
    [Diagram. Frontend Load Balancer: FE-A v.1, FE-B v.2. Backend Load Balancer: BE-A v.1, BE-B v.2. Database: v.2a]
    parameters_deployment-staging-FE-BankB: paydiant_deployment_bank=STAGING-FRONTEND-B, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=2
    parameters_deployment-staging-BE-BankB: paydiant_deployment_bank=STAGING-BACKEND-B, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=2
  • 22. Re-enable B (FE+BE)
    [Diagram. Frontend Load Balancer: FE-A v.1, FE-B v.2. Backend Load Balancer: BE-A v.1, BE-B v.2. Database: v.2a]
    parameters_deployment-staging-FE-BankB: paydiant_deployment_bank=STAGING-FRONTEND-B, paydiant_app_operation_mode=LIVE, paydiant_app_version=2
    parameters_deployment-staging-BE-BankB: paydiant_deployment_bank=STAGING-BACKEND-B, paydiant_app_operation_mode=LIVE, paydiant_app_version=2
  • 23. Disable A (FE+BE)
    [Diagram. Frontend Load Balancer: FE-A v.1, FE-B v.2. Backend Load Balancer: BE-A v.1, BE-B v.2. Database: v.2a]
    parameters_deployment-staging-FE-BankA: paydiant_deployment_bank=STAGING-FRONTEND-A, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=1
    parameters_deployment-staging-BE-BankA: paydiant_deployment_bank=STAGING-BACKEND-A, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=1
  • 24. Upgrade A (FE+BE)
    [Diagram. Frontend Load Balancer: FE-A v.2, FE-B v.2. Backend Load Balancer: BE-A v.2, BE-B v.2. Database: v.2a]
    parameters_deployment-staging-FE-BankA: paydiant_deployment_bank=STAGING-FRONTEND-A, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=2
    parameters_deployment-staging-BE-BankA: paydiant_deployment_bank=STAGING-BACKEND-A, paydiant_app_operation_mode=MAINTENANCE, paydiant_app_version=2
  • 25. Re-enable A (FE+BE)
    [Diagram. Frontend Load Balancer: FE-A v.2, FE-B v.2. Backend Load Balancer: BE-A v.2, BE-B v.2. Database: v.2a]
    parameters_deployment-staging-FE-BankA: paydiant_deployment_bank=STAGING-FRONTEND-A, paydiant_app_operation_mode=LIVE, paydiant_app_version=2
    parameters_deployment-staging-BE-BankA: paydiant_deployment_bank=STAGING-BACKEND-A, paydiant_app_operation_mode=LIVE, paydiant_app_version=2
  • 26. Run second phase of database changes (clean up old v.1 data)
    [Diagram. Frontend Load Balancer: FE-A v.2, FE-B v.2. Backend Load Balancer: BE-A v.2, BE-B v.2. Database: v.2 (DB changes Phase 2)]
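The order of operations across slides 18 to 26 can be summarized as a short shell sketch. It only prints each step in sequence and performs no real action; `step` and `rollout` are hypothetical names used for illustration, not something shown in the talk.

```bash
# Sketch only: prints the rollout order, does not touch any system.
step() {
  printf '%s\n' "$*"
}

rollout() {
  local version=$1
  step "bank B (FE+BE): set operation mode MAINTENANCE"
  step "database: phase 1 (add new structures, migrate data)"
  step "bank B (FE+BE): upgrade to v.$version"
  step "bank B (FE+BE): set operation mode LIVE"
  step "bank A (FE+BE): set operation mode MAINTENANCE"
  step "bank A (FE+BE): upgrade to v.$version"
  step "bank A (FE+BE): set operation mode LIVE"
  step "database: phase 2 (clean up old data)"
}

rollout 2
```

Splitting the database work into an additive first phase and a cleanup second phase is what lets v.1 and v.2 nodes run against the same database while one bank at a time is out of rotation.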
  • 27. Details of the upgrade sequence …
  • 28. Putting a set of nodes into maintenance mode
    [Diagram. Frontend Load Balancer: FE-A v.1, FE-B v.1. Backend Load Balancer: BE-A v.1, BE-B v.1. Database: v.1]
  • 29. Putting nodes into maintenance mode
    Using LB node health check: http://nodeX:8080/healthcheck.jsp
    Puppet ERB template for healthcheck.jsp content
    …
    Pseudo code:
      if “maintenance mode”: throw exception
      else:
        if “module A” present: check if module A is up
        if “module B” present: check if module B is up
        …
      return 503 if any exception was caught
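The pseudo code on slide 29 maps to a small decision function. The real check is a JSP rendered from a Puppet ERB template, which the transcript does not include, so the shell sketch below mirrors only the decision logic. `healthcheck_status`, `module_up` and `DOWN_MODULES` are hypothetical names; the mode values LIVE and MAINTENANCE are the ones used for paydiant_app_operation_mode in the deck.

```bash
# Sketch only: mirrors the decision logic, not the actual JSP.

# Assumption: a space-separated list of modules that are currently down.
DOWN_MODULES=${DOWN_MODULES:-}

# Hypothetical probe: succeeds when the named module is up.
module_up() {
  case " $DOWN_MODULES " in
    *" $1 "*) return 1 ;;
    *)        return 0 ;;
  esac
}

# Usage: healthcheck_status MODE MODULE...
# Prints the HTTP status the load balancer would see.
healthcheck_status() {
  local mode=$1 module
  shift
  if [ "$mode" = "MAINTENANCE" ]; then
    echo 503
    return 0
  fi
  for module in "$@"; do
    if ! module_up "$module"; then
      echo 503
      return 0
    fi
  done
  echo 200
}
```

Because the load balancer already polls this page, flipping one Puppet-managed parameter is enough to drain a node, with no load balancer reconfiguration.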
  • 30. Putting nodes into maintenance mode cont.
    A parameter group controls the maintenance mode
    E.g. parameter group “parameters_deployment-staging-BankB” controls “paydiant_app_operation_mode” for the nodes in set FE-B of the Staging environment
  • 31. Putting nodes into maintenance mode cont.
    Update group parameter using Rake API (as ‘puppet-dashboard’ user)
      RACK_ENV=production /opt/puppet/bin/rake -s -X -f /opt/puppet/share/puppet-dashboard/Rakefile nodegroup:variables[parameters_deployment-staging-BankB,'paydiant_app_operation_mode=MAINTENANCE']
    Puppet run-once using MCO (as ‘peadmin’ user)
      mco puppet runonce --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
    While loop: check the health check page until all nodes return 503 (i.e. in maintenance) status
      mco shellcmd --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B --cmd='curl --silent http://localhost:8080/healthcheck/healthcheck.jsp'
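The "while loop" on slide 31 is not shown in the deck; a minimal sketch of such a polling loop follows. It assumes a function `fetch_statuses` that prints one HTTP status code per node. How to obtain those codes from the `mco shellcmd` output depends on that custom plugin's output format, which the slides do not show, so it is stubbed here. `wait_for_status`, its retry count and its delay are likewise hypothetical.

```bash
# Sketch only: poll until every node reports the wanted HTTP status.

# Assumption: prints one status code per node, separated by whitespace.
# Replace this stub with a site-specific wrapper around 'mco shellcmd'.
fetch_statuses() {
  printf '%s\n' 503 503
}

# Usage: wait_for_status WANTED [MAX_TRIES] [DELAY_SECONDS]
wait_for_status() {
  local want=$1 max_tries=${2:-30} delay=${3:-10}
  local try=1 seen pending code
  while [ "$try" -le "$max_tries" ]; do
    seen=0
    pending=0
    for code in $(fetch_statuses); do
      seen=$((seen + 1))
      [ "$code" = "$want" ] || pending=$((pending + 1))
    done
    # Require at least one answer so an empty reply is not taken as success.
    if [ "$seen" -gt 0 ] && [ "$pending" -eq 0 ]; then
      echo "all $seen node(s) report $want after $try check(s)"
      return 0
    fi
    sleep "$delay"
    try=$((try + 1))
  done
  echo "timed out waiting for $want" >&2
  return 1
}
```

Note that `curl --silent URL` prints the response body rather than the status; one way to get only the code on each node is `curl --silent --output /dev/null --write-out '%{http_code}' URL`. The same loop with `wait_for_status 200` covers the return to live mode.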
  • 32. Upgrading applications on a set of nodes
    [Diagram. Frontend Load Balancer: FE-A v.1, FE-B v.2. Backend Load Balancer: BE-A v.1, BE-B v.2. Database: v.2a]
  • 33. Upgrading Application Version
    Disable Puppet agent
      mco puppet disable --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
    Stop Tomcat service
      mco service tomcat stop --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
    Clean up exploded Tomcat webapps directory (for sanity)
      mco shellcmd --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B --cmd='find $TOMCAT_HOME/webapps/ -maxdepth 1 -mindepth 1 -type d -exec rm -rf {} \;'
  • 34. Upgrading Application Version cont.
    Upgrade the application version
      RACK_ENV=production /opt/puppet/bin/rake -s -X -f /opt/puppet/share/puppet-dashboard/Rakefile nodegroup:variables[parameters_deployment-staging-BankB,'paydiant_app_version=2']
    Re-enable Puppet
      mco puppet enable --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
    Puppet run-once
      mco puppet runonce --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
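The six commands on slides 33 and 34 can be chained into one function. The sketch below is an assumption about how that might look, not the presenters' script: `run`, `upgrade_bank` and the `DRY_RUN` switch are hypothetical, and the switching between the ‘peadmin’ user (for mco) and the ‘puppet-dashboard’ user (for rake) is left out. With `DRY_RUN=1`, the default here, it only prints the commands.

```bash
# Sketch only: DRY_RUN=1 prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

# Usage: upgrade_bank BANK_FACT_VALUE PARAMETER_GROUP VERSION
upgrade_bank() {
  local fact="fact_paydiant_deployment_bank=$1" group=$2 version=$3
  # Keep Puppet from restarting or redeploying while the node is stopped.
  run mco puppet disable --with-fact "$fact"
  run mco service tomcat stop --with-fact "$fact"
  # shellcmd is the custom MCO plugin mentioned in the talk.
  run mco shellcmd --with-fact "$fact" \
    --cmd='find $TOMCAT_HOME/webapps/ -maxdepth 1 -mindepth 1 -type d -exec rm -rf {} \;'
  # Bump the version in the Dashboard parameter group.
  run env RACK_ENV=production /opt/puppet/bin/rake -s -X \
    -f /opt/puppet/share/puppet-dashboard/Rakefile \
    "nodegroup:variables[$group,paydiant_app_version=$version]"
  # Let Puppet apply the new version.
  run mco puppet enable --with-fact "$fact"
  run mco puppet runonce --with-fact "$fact"
}

upgrade_bank STAGING-FRONTEND-B parameters_deployment-staging-BankB 2
```

Quoting the whole `nodegroup:variables[...]` argument keeps the shell from treating the square brackets as a glob pattern.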
  • 35. Taking a set of nodes out of maintenance mode
    [Diagram. Frontend Load Balancer: FE-A v.1, FE-B v.2. Backend Load Balancer: BE-A v.1, BE-B v.2. Database: v.2a]
  • 36. Taking nodes out of maintenance mode
    Update parameter using Rake API (as ‘puppet-dashboard’ user)
      RACK_ENV=production /opt/puppet/bin/rake -s -X -f /opt/puppet/share/puppet-dashboard/Rakefile nodegroup:variables[parameters_deployment-staging-BankB,'paydiant_app_operation_mode=LIVE']
    Puppet run-once using MCO (as ‘peadmin’ user)
      mco puppet runonce --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
    While loop: check the health check page until all nodes return 200 (i.e. live) status
      mco shellcmd --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B --cmd='curl --silent http://localhost:8080/healthcheck/healthcheck.jsp'
  • 37. Switching traffic to upgraded stack
    [Diagram. Frontend Load Balancer: FE-A v.1, FE-B v.2. Backend Load Balancer: BE-A v.1, BE-B v.2. Database: v.2a]
  • 38. Viewing transition in Splunk across multiple datacenters
  • 39. Jenkins …
  • 40. What is Jenkins
    Tool to schedule and monitor the execution of repeated jobs
  • 41. Why Jenkins?
    Configurability
      ↘ Different types of input parameters
      ↘ Invoke shell scripts
      ↘ Post-build actions (automatic/manual)
  • 42. Why Jenkins? cont.
    Plugin support
      ↘ More than 600 plugins (https://wiki.jenkins-ci.org/display/JENKINS/Plugins)
      ↘ E.g. vSphere plugin (stop/start, snapshots, rollbacks…)
      ↘ Build pipeline plugin
      ↘ Parameterized remote trigger plugin
  • 43. Why Jenkins? cont.
    Keeps all your console logs in a single place
      ↘ No need to hunt for 10 log files on 5 different machines
      ↘ Visual representation of passed/failed/in-progress status, based on downstream shell scripts or other jobs
  • 44. Why Jenkins? cont.
    And it’s…
  • 45. [Diagram labels: MCO; Rake API; source code, Liquibase change sets; DB; FE-*; BE-*]
  • 46. Jenkins – Puppet Integration
  • 47. Jenkins – Puppet Integration cont.
  • 48. Jenkins – Puppet Integration cont.
  • 49. Jenkins – Puppet Integration cont.
  • 50. Jenkins – Puppet Integration cont.
    Jenkins invokes local bash scripts, which in turn use SSH to call:
      ↘ MCO (as ‘peadmin’ user on Puppet Master)
      ↘ Rake API (as ‘puppet-dashboard’ user on Puppet Master)
    SSH login as ‘peadmin’ and ‘puppet-dashboard’ is password-less, using PKI
      ↘ Generate an RSA keypair for the local Jenkins user with the ssh-keygen command
      ↘ Append the public key to the ~/.ssh/authorized_keys file of the ‘peadmin’ and ‘puppet-dashboard’ users on the Puppet Master
    Special-purpose MCO subcommands we use:
      ↘ puppet
      ↘ service
      ↘ shellcmd* (ask your Puppet Enterprise Support for this custom MCO plugin)
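A Jenkins job script following slide 50 might wrap the SSH calls as below. This is a sketch under assumptions: the host name `puppetmaster.example.com`, the `DRY_RUN` switch and the function names `remote`, `as_peadmin` and `as_dashboard` are hypothetical. Only the two user names and the ssh-keygen/authorized_keys steps come from the talk. `BatchMode=yes` is a standard OpenSSH option that makes ssh fail instead of prompting for a password, which suits unattended jobs.

```bash
# One-time setup on the Jenkins host, run as the Jenkins user:
#   ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa
# Then append ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys of the
# 'peadmin' and 'puppet-dashboard' users on the Puppet Master.

# Assumptions: placeholder host name; DRY_RUN=1 only prints the command.
PUPPET_MASTER=${PUPPET_MASTER:-puppetmaster.example.com}
DRY_RUN=${DRY_RUN:-1}

# Usage: remote USER COMMAND...
remote() {
  local user=$1
  shift
  if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "ssh -o BatchMode=yes $user@$PUPPET_MASTER $*"
  else
    # The remote shell re-parses the command line, so arguments that
    # contain spaces or quotes need an extra level of quoting.
    ssh -o BatchMode=yes "$user@$PUPPET_MASTER" "$*"
  fi
}

as_peadmin()   { remote peadmin "$@"; }
as_dashboard() { remote puppet-dashboard "$@"; }

as_peadmin mco puppet runonce --with-fact fact_paydiant_deployment_bank=STAGING-FRONTEND-B
```

Keeping the two accounts behind separate helpers makes it explicit in the job log which user each MCO or Rake call runs as.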
  • 51. Links
    Rake API: https://docs.puppetlabs.com/pe/latest/console_rake_api.html
    MCO: https://docs.puppetlabs.com/mcollective/reference/basic/basic_cli_usage.html
    Jenkins: http://jenkins-ci.org/
    Liquibase: http://www.liquibase.org/documentation/index.html
  • 52. Recap/Takeaways…
    Use Puppet Enterprise
      ↘ Support is awesome (Celia Cottle, Jay Wallace, Ken Johnson, Zachary Stern: you guys rock!)
      ↘ Got help and features from James Turnbull and Nigel Kersten with some early versions of PE
      ↘ Live management and MCollective are essential for any self-respecting enterprise
    Zero-downtime upgrades
      ↘ To Dashboard or not to Dashboard?
      ↘ Database update phases
      ↘ Managing LB health check monitors dynamically using Puppet
    Automation baby steps: don’t boil the ocean
      ↘ Understand what you are doing before automating it; develop runbooks
      ↘ Identify manual steps and script some of them
      ↘ Add scripts to an orchestration tool (Jenkins, ServiceNow, whatever else you use in-house)
  • 53. Thank you.