Belvedere

  1. 1. Belvedere: Environment consistency from Dev to Prod. Infracoders, April 2013
  2. 2. Colin Panisset (@nonspecialist), Tech Lead – Infrastructure, REA Group
  3. 3. belvedere, noun: a building, or architectural feature of a building, designed and situated to look out upon a pleasing scene. Latin bellus (fine) + vidēre (to see)
  4. 4. In other words: a platform that lets you see nice things
  5. 5. In the beginning …
  6. 6. Devs wrote code, Ops deployed it
  7. 7. Release cycles were long
  8. 8. Code moved through environments like ...
  9. 9. When code got to staging:
  10. 10. #!/bin/sdlc
          while !staging.ok:
              Fix problems
              Redeploy
  11. 11. #!/bin/sdlc
          while !staging.ok:
              devs.fix_problems
              ops.deploy
  12. 12. Then, when the code got to prod:
  13. 13. This was not considered ideal
  14. 14. But the code hadn't changed ...
  15. 15. Environments had
  16. 16. The more we looked at it, the more problems we found
  22. 22. OS version differences, deployment methods, package versions, app configs, authentication, hardcoded IPs
  23. 23. What to do?
  24. 24. Designing Belvedere
  28. 28. 3-pronged approach:
          1. Move environment-specific config into the environment
          2. Convention over configuration
          3. Use the same OS image everywhere (including dev)
  29. 29. Find problems as early in the development pipeline as possible
  30. 30. Give devs familiarity with a production-like system to aid transition to a more open access model
  31. 31. 1. Move environment-specific config into the environment
  32. 32. Credentials: New Relic keys, database passwords. Configuration: app-emitted endpoint URLs
  33. 33. Move environment-specific configuration into … environment variables
  34. 34. Populate environment variables before app starts
  35. 35. Update source files for environment variables each time the app restarts
  36. 36. Provide a “config service” as the source of truth for values in the environment
  37. 37. [Diagram. Labels: application, init script, update script, credentials (sourced), config service. Interpreter labels: #!/bin/sh, #!/bin/ruby, #!/bin/sh]
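
A minimal sketch of "populate environment variables before app starts", assuming the config service hands back plain KEY=VALUE lines. The slides do not describe the service's protocol, so `fetch_config` is a hypothetical stand-in that prints sample values, and `load_env` is an illustrative helper name.

```sh
#!/bin/sh
# Hypothetical stand-in for the config service: the real service and its
# protocol are not described in the slides. Here it just prints KEY=VALUE lines.
fetch_config() {
    cat <<'EOF'
NEW_RELIC_KEY=example-key
DB_PASSWORD=example-password
EOF
}

# Export every well-formed KEY=VALUE line read from stdin into the environment.
# Lines whose key is not a valid shell identifier are skipped.
load_env() {
    while IFS='=' read -r key value; do
        case "$key" in
            ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*) continue ;;
        esac
        export "$key=$value"
    done
}

# load_env must run in the current shell (not a pipeline subshell),
# so feed it from a temporary file rather than a pipe.
tmp=$(mktemp)
fetch_config > "$tmp"
load_env < "$tmp"
rm -f "$tmp"

# exec /path/to/app   # the app would start here and inherit the variables
```

Because the values are exported by the wrapper, the application itself only ever reads its environment and carries no environment-specific files.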
  38. 38. Config service is hierarchical based on client hostname
  39. 39. Common values sit higher up the tree
  41. 41. global
            zone1.foo.com (New Relic key)
              hostA.zone1.foo.com
              hostB.zone1.foo.com
            zone2.foo.com (New Relic key)
              hostA.zone2.foo.com
  42. 42. Override values for host or domain-specific cases
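
The hierarchy can be sketched as a most-specific-first lookup: host, then zone, then global. This is an illustration only. The one-file-per-value layout under `$CONFIG_ROOT`, the function names and the `log_level` key are all assumptions, not the actual config service.

```sh
#!/bin/sh
# Print the lookup order for a host, most specific first:
# the host itself, its zone (hostname minus the first label), then "global".
lookup_chain() {
    host=$1
    printf '%s\n' "$host"
    case "$host" in
        *.*) printf '%s\n' "${host#*.}" ;;
    esac
    printf '%s\n' global
}

# Print the most specific value for a key. Assumed (hypothetical) layout:
# one file per value at $CONFIG_ROOT/<level>/<key>.
config_get() {
    key=$1
    for level in $(lookup_chain "$2"); do
        if [ -f "$CONFIG_ROOT/$level/$key" ]; then
            cat "$CONFIG_ROOT/$level/$key"
            return 0
        fi
    done
    return 1
}

# Demo data mirroring the slides: a New Relic key per zone, a value defined
# only at the global level, and a host-specific override of that value.
CONFIG_ROOT=$(mktemp -d)
mkdir -p "$CONFIG_ROOT/global" \
         "$CONFIG_ROOT/zone1.foo.com" \
         "$CONFIG_ROOT/zone2.foo.com" \
         "$CONFIG_ROOT/hostB.zone1.foo.com"
echo "shared-default" > "$CONFIG_ROOT/global/log_level"   # hypothetical key
echo "key-for-zone1" > "$CONFIG_ROOT/zone1.foo.com/new_relic_key"
echo "key-for-zone2" > "$CONFIG_ROOT/zone2.foo.com/new_relic_key"
echo "debug" > "$CONFIG_ROOT/hostB.zone1.foo.com/log_level"

config_get new_relic_key hostA.zone1.foo.com
config_get new_relic_key hostA.zone2.foo.com
config_get log_level hostA.zone1.foo.com
config_get log_level hostB.zone1.foo.com
```

Common values are stored once near the root, and a host or zone only needs an entry where it differs.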
  43. 43. 2. Convention over configuration
  44. 44. Use short DNS CNAMEs which resolve differently in different environments
  47. 47. Examples
          (short name)   prod.foo.com         dev.foo.com
          smtp           mailsvr.foo.com      null.dev.foo.com
          db-rw          master-db.foo.com    dev-db.foo.com
          auth.near      ldap.prod.foo.com    ldap.dev.foo.com
          auth.far       ldap.dr.foo.com      ldap.test.foo.com
  48. 48. You need to have DNS working
  49. 49. You need to have environments with different DNS domains or subdomains
  50. 50. Environment-independent config example: ntp.conf
  51. 51. /etc/ntp.conf:
          ...
          server ntp1.near iburst
          server ntp2.near iburst
          server ntp1.far iburst
          restrict nagios.near mask 255.255.255.255 nomodify notrap
          restrict nagios.far mask 255.255.255.255 nomodify notrap
          ...
  52. 52. Note use of .near and .far to indicate relative preference; you can't weight A records
  53. 53. DNS resolvers support search paths; use that feature!
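
A small sketch of why the search path makes short names environment-independent: the same name expands to a different fully qualified name depending on the search list configured on the host. `search_candidates` is an illustrative helper, not a real resolver; real resolvers also apply rules such as `ndots` and may try the name as given.

```sh
#!/bin/sh
# List the fully qualified names produced by appending each domain of a
# space-separated search list (as on the "search" line of resolv.conf).
search_candidates() {
    name=$1
    for domain in $2; do
        printf '%s.%s\n' "$name" "$domain"
    done
}

# Same short names, different environments:
search_candidates smtp "prod.foo.com"
search_candidates smtp "dev.foo.com"
search_candidates auth.near "dev.foo.com"
```

Config files such as ntp.conf can then be identical everywhere, with only the host's search domain differing.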
  54. 54. 3. Use the same OS image everywhere (including dev)
  57. 57. Basic principles:
          a. Build the “platform image” once
          b. Transform the built image format, not the content
          c. Provide a minimal-function image, let application RPM dependencies fill in the rest
  58. 58. Creating the platform image
  59. 59. CentOS 6 x86_64
  60. 60. [Diagram: Koji → Raw disk image]
  61. 61. In words: commits to the belvedere repo in GitHub trigger a build in Jenkins, which uses koji spin-appliance to create a disk image defined by a kickstart file, which calls puppet apply to impose configuration and which results in a raw, bootable disk image
  62. 62. [Diagram: Koji → Raw disk image]
  63. 63. Images have the commit SHAburned into the filesystem for identification
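
One way the SHA stamp could work, as a sketch. The file path `/etc/platform-release` and both function names are hypothetical; the slides do not say where or how the SHA is stored. In a Jenkins job the value could come from `git rev-parse HEAD`.

```sh
#!/bin/sh
# Write the commit SHA of the build into a file inside the image root.
# /etc/platform-release is a hypothetical path; the slides do not name one.
stamp_image() {
    img_root=$1
    sha=$2
    mkdir -p "$img_root/etc"
    printf '%s\n' "$sha" > "$img_root/etc/platform-release"
}

# Read the SHA back; on a running instance the root argument would be empty.
image_sha() {
    cat "$1/etc/platform-release"
}

# Demo against a scratch directory standing in for the image filesystem.
root=$(mktemp -d)
stamp_image "$root" "0123456789abcdef0123456789abcdef01234567"
image_sha "$root"
```

With the stamp in place, any running instance can be traced back to the exact commit that produced its image.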
  64. 64. Build time: 26 minutes. Promotion via tags: 1 minute
  65. 65. Testing the built image
  66. 66. [Diagram: Raw disk image → ovftool → knife vsphere → VM]
  67. 67. puppet/modules/postfix/manifests/config.pp:
          class postfix::config {
              include nrpe
              # Install the NRPE check definition alongside the service it tests
              file { '/etc/nagios/nrpe.d/postfix.cfg':
                  ensure  => present,
                  owner   => root,
                  group   => root,
                  mode    => '0644',
                  source  => 'puppet:///modules/postfix/nrpe.cfg',
                  require => Class['nrpe'],
              }
          }
  68. 68. On the platform image: /etc/nagios/nrpe.d/postfix.cfg
          command[check_postfix]=/usr/lib64/nagios/plugins/check_smtp -4 -H localhost
  69. 69. The test script in the build pipeline:
          echo -n "Checking Postfix (SMTP): "
          RES=$( check_nrpe -H "$TARGET" -c check_postfix 2>&1 )
          if [ $? -ne 0 ]; then
              failure; echo
              echo "$RES"
              fail=true
          else
              success; echo
              echo "$RES"
          fi
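
The per-service test above could be generalised into one helper that runs several NRPE commands and counts failures. This is a sketch, not the pipeline's actual script. `check_nrpe` is stubbed here so the example runs without Nagios installed, and `check_ntp` is a hypothetical command name.

```sh
#!/bin/sh
# Stub standing in for the real check_nrpe plugin so the sketch runs anywhere.
# It "passes" check_postfix and fails every other command.
check_nrpe() {
    # Arguments arrive as: -H <target> -c <command>
    if [ "$4" = "check_postfix" ]; then
        echo "SMTP OK"
    else
        echo "NRPE: Command '$4' not defined"
        return 2
    fi
}

failures=0

# Run one NRPE command against $TARGET, print its output, and count failures.
run_check() {
    label=$1
    cmd=$2
    printf 'Checking %s: ' "$label"
    if res=$(check_nrpe -H "$TARGET" -c "$cmd" 2>&1); then
        echo "ok"
    else
        echo "FAILED"
        failures=$((failures + 1))
    fi
    echo "$res"
}

TARGET=localhost
run_check "Postfix (SMTP)" check_postfix
run_check "NTP" check_ntp          # hypothetical command name

echo "failed checks: $failures"
# A pipeline step would finish with: [ "$failures" -eq 0 ] || exit 1
```

Counting failures instead of stopping at the first one lets a single build report every broken check on the image.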
  70. 70. Use the same Nagios probes in prod to test running instances
  71. 71. Distributing the platform image
  72. 72. [Diagram: Raw disk image → ovftool → VM template; Raw disk image → aws-cli → AMI]
  73. 73. Multiple VMware environments in different datacentres
  74. 74. Raw disk image → qemu-img convert → ovftool → knife vsphere
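
The first step of the chain above, sketched as a dry-run wrapper. The file names are hypothetical and the slides do not give the exact options used. Only the `qemu-img convert` step is shown, because the arguments for the ovftool and knife vsphere steps are not given in the source.

```sh
#!/bin/sh
# Print the command instead of running it unless DRY_RUN=0, so the sketch
# can be read and tried without qemu-img installed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Convert a raw disk image to VMDK: the format changes, the content does not.
raw_to_vmdk() {
    run qemu-img convert -f raw -O vmdk "$1" "$2"
}

raw_to_vmdk platform.raw platform.vmdk   # file names are hypothetical
```

Setting `DRY_RUN=0` runs the conversion for real on a host that has qemu-img installed.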
  75. 75. Multiple AWS Regions and accounts (dev/staging/prod)
  76. 76. Raw disk image → aws cloudformation create-stack → rsync → dd → aws ec2 create-snapshot → aws ec2 register-image (awscli command line)
  77. 77. For distant regions, maintain a persistent EC2 instance and use rsync's delta to minimise transmitted bytes
  78. 78. Raw disk image: 1GB (uncompressed)
  79. 79. Built-in auto-resizing on reboot when the underlying device grows
  80. 80. Works for physical nodes too: same Puppet manifest, different kickstart file
  81. 81. We can bring persistent (physical) boxes up to date simply: box# puppet apply ...
  82. 82. Raw disk image can be used as a local VM, too (KVM, Xen, Vagrant, VMware Fusion, Parallels, etc.)
  83. 83. Image promotion is the only manual step
  84. 84. Features!
  90. 90. AWS CloudFormation cfn-init
          VMware Tools
          Environment-independent config
          Standardised base packages
          Platform-level Nagios checks
          OS-level tuning (e.g. IO scheduler)
  91. 91. Example installed base: kernel auditing, Splunk forwarder, LDAP client, NFS client with automounter (homedirs), postfix, sudo configuration (with LDAP support), NTP, munin/graphite, iSCSI support
  92. 92. Network config via DHCP: standardised, ubiquitous, environment-relevant
  93. 93. In prod environments: 60% of all VMs are platform. In dev AWS environment: 37% of instances are platform
  94. 94. In the end …
  95. 95. Things to Improve
  96. 96. Support for multiple partitions; Better testing on-image IDS (ossec, snort); Shareable code (?)
  97. 97. Questions?
  98. 98. Colin Panisset, @nonspecialist
  99. 99. Photo credits
          Mr. Belvedere: http://en.wikipedia.org/wiki/File:Mr_Belvedere.jpg
          Iowa Landscape: http://www.flickr.com/photos/yoorock/7842391144/ (by-nc-nd)
          Neonate queen snake choking on a crayfish: http://www.flickr.com/photos/peteandnoewoods/4367200217/ (by-sa)
          Ferry toll sign at Cowes: http://www.flickr.com/photos/auntiep/5281268994/
          Keep Left: http://www.flickr.com/photos/mrlederhosen/4283136097/
          Brunel mixed gauge track: http://www.flickr.com/photos/nox_noctis_silentium/7929226488/
          Train wreck at Montparnasse 1895: http://commons.wikimedia.org/wiki/File:Train_wreck_at_Montparnasse_1895.jpg (pd)
