
Deploying Software in an Autoscaled AWS Environment


Presented at Philly DevOps on January 20, 2015


  1. Deploying Software in an Autoscaled AWS Environment
     Jeff Horwitz, Director of Cloud Engineering, ShopRunner
     jhorwitz@shoprunner.com
     Presented at Philly DevOps, January 20, 2015
  2. • Members get 2-day free shipping and returns
     • Benefits apply across a variety of retailers
     • Extending our reach into China w/ Alipay
  3. Applications
     • Many different applications (10+)
     • Each with its own repository and set of servers
     • Multiple deployments per day
  4. AWS @ ShopRunner
     • Infrastructure is 100% in the cloud
     • AWS + other services
     • Heavy use of VPC, AutoScaling, CloudFormation
  5. How We Launch
     • Everything launched via a CloudFormation Stack
     • Use nested stacks to stay DRY
       • single_instance.json
       • autoscaling_group.json
     • CloudInit bootstraps each instance
     • Puppet applies role-specific configurations
  6. ASG in CloudFormation
     ...
     "ContentGroup": {
       "Type": "AWS::CloudFormation::Stack",
       "Properties": {
         "Parameters": {
           "ServerEnvironment": "prd",
           "ServerRole": "content",
           "InstanceType": "m3.medium",
           "LoadBalancerNames": { "Ref": "ContentELB" },
           "AvailabilityZones": { "Fn::Join": [ ",", { "Ref": "AvailabilityZones" } ] },
           "VPCZoneIdentifier": { "Fn::Join": [ ",", { "Ref": "ASGroupSubnets" } ] },
           "SecurityGroupIds": { "Fn::Join": [ ",", { "Ref": "ContentSecurityGroup" } ] },
           "DesiredASCapacity": 3,
           "MinASSize": 3,
           "MaxASSize": 6
         },
         "TemplateURL": "https://s3.amazonaws.com/BUCKET/cloudformation/autoscaling_group.json",
         "TimeoutInMinutes": 30
       }
     }
     ...
  7. Waiting for Puppet
     • Puppet can take some time to run
     • Group shouldn't go live until puppet is finished
     • Use CloudFormation Wait Conditions
     • Wait for stack status CREATE_COMPLETE
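The last bullet amounts to polling the stack until it reaches a terminal status. A minimal Python sketch, assuming a client object that behaves like boto3's CloudFormation client is passed in; the function name `wait_for_stack`, the polling interval, and the timeout are illustrative and not from the talk:

```python
import time


def wait_for_stack(cfn, stack_name, timeout=1800, interval=15, sleep=time.sleep):
    """Poll until the stack is CREATE_COMPLETE (True) or has failed/timed out (False).

    `cfn` is assumed to behave like boto3.client("cloudformation");
    only describe_stacks(StackName=...) is used.
    """
    waited = 0
    while True:
        stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
        status = stack["StackStatus"]
        if status == "CREATE_COMPLETE":
            return True
        if status != "CREATE_IN_PROGRESS":
            # Any other status (CREATE_FAILED, ROLLBACK_IN_PROGRESS, ...)
            # means the launch did not succeed.
            return False
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

With real boto3 the same effect is available through the built-in `stack_create_complete` waiter; the explicit loop is shown only to make the logic visible.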
  8. Puppet Wait Conditions
     "PuppetWaitHandle" : {
       "Type" : "AWS::CloudFormation::WaitConditionHandle",
       "Properties" : {}
     },
     "PuppetWaitCondition": {
       "Type" : "AWS::CloudFormation::WaitCondition",
       "DependsOn" : "AutoScalingGroup",
       "Properties" : {
         "Handle" : { "Ref" : "PuppetWaitHandle" },
         "Timeout" : "1800",
         "Count" : { "Ref": "DesiredASCapacity" }
       }
     },
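The wait condition above completes once `Count` success signals arrive on the handle, one per instance, or fails at the timeout. How each instance decides between success and failure is not shown in the slides. The Python sketch below assembles a `cfn-signal` command using the same `-s`, `-r`, and `-i` flags that appear on the next slide. It assumes Puppet was run with `--detailed-exitcodes` (where 0 and 2 indicate a clean run), which is an assumption and not something the talk states; the function names are hypothetical.

```python
import shlex


def puppet_succeeded(rc):
    # Assumption: `puppet agent --detailed-exitcodes` was used, where
    # 0 = no changes and 2 = changes applied; anything else is a failure.
    return rc in (0, 2)


def cfn_signal_command(rc, instance_id, handle_url):
    """Build the shell command an instance would run to signal the handle."""
    success = "true" if puppet_succeeded(rc) else "false"
    args = [
        "/opt/aws/bin/cfn-signal",
        "-s", success,
        "-r", f"puppet agent exited with code {rc}",
        "-i", f"puppet-signal-{instance_id}",
        handle_url,
    ]
    # Quote each argument so the reason string survives the shell intact.
    return " ".join(shlex.quote(a) for a in args)
```

Using a unique `-i` value per instance matters because CloudFormation counts distinct signal IDs toward `Count`.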
  9. Signal Wait Handler
     ...
     "command": { "Fn::Join": [ "", [
       "/opt/aws/bin/cfn-signal -s $success ",
       "-r \"puppet agent exited with code $rc\" ",
       "-i \"puppet-signal-$EC2_INSTANCE_ID\" '",
       { "Ref": "PuppetWaitHandle" },
       "'"
     ...
  10. Legacy Deployments
      • one long-lived AS group per application
      • per-application scripts launch AS groups
      • scripts pull code from git into an EBS volume
      • create snapshot and upload ID to S3
      • rsync volume to servers in existing AS group
      • restart services as necessary
  11. Problems
      • scripts w/o CloudFormation diverge quickly
      • can't easily launch multiple versions
      • no association with a tag/branch/commit
      • rsync changes code on running servers
      • can't easily stage new code before deploying
      • can't easily warm servers before deploying
      • no clean or consistent rollback procedure
  12. Solutions
      • stop treating our infrastructure like it's static
      • create new stacks for each deployment
      • store state in etcd
      • stop deploying code changes
      • start deploying stacks
  13. Tenets of SR Deployments
      • Unit of deployment is the stack
      • Deployed servers are immutable
      • Deployments are reproducible
      • Fail back to old stacks, fail forward to new stacks
      • DB migrations should be backwards compatible
      • Test on the same configuration as production
  14. ELB Catch-22
      • new instances added to ELB once running
      • autoscaling needs services to start automatically
      • what if we're not ready?
      • what if the service is actually broken?
      • wait to associate ELB w/ ASG? can't do that!
  15. Delay Service Start?
      • configure instances not to start services on launch and only start services when ready to deploy
      • manage with manual steps or custom code
      • initial launch versus scale-out event
      • feature flags (etcd, other orchestration)
  16. Lifecycle Hooks FTW
      • Register hooks for ASG lifecycle events
      • Lifecycle halts until told to proceed
      • Can launch our group but tell it not to go live
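As a rough illustration of the three bullets above, the Python sketch below registers a launch hook and later completes it. It assumes an object shaped like a current boto3 Auto Scaling client is passed in; the helper names, the hook name `hold-until-release`, and the one-hour heartbeat are hypothetical and not taken from the talk.

```python
def hold_new_instances(autoscaling, group_name,
                       hook_name="hold-until-release", heartbeat_timeout=3600):
    """Register a launch hook so new instances pause before going InService."""
    autoscaling.put_lifecycle_hook(
        LifecycleHookName=hook_name,
        AutoScalingGroupName=group_name,
        LifecycleTransition="autoscaling:EC2_INSTANCE_LAUNCHING",
        HeartbeatTimeout=heartbeat_timeout,
        # If nobody completes the action in time, do not go live.
        DefaultResult="ABANDON",
    )


def release_instance(autoscaling, group_name, instance_id, go=True,
                     hook_name="hold-until-release"):
    """Tell the halted lifecycle to proceed (go) or to give up (no-go)."""
    autoscaling.complete_lifecycle_action(
        LifecycleHookName=hook_name,
        AutoScalingGroupName=group_name,
        InstanceId=instance_id,
        LifecycleActionResult="CONTINUE" if go else "ABANDON",
    )
```

Choosing `ABANDON` as the default result is a design assumption here: a forgotten deployment then terminates its instances instead of silently going live.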
  17-19. Autoscaling Lifecycle Hooks
         Copied from AWS documentation at
         http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/AutoScalingGroupLifecycle.html
  20. Pre-deployment
      Lonely ELB
  21. Deployment #1: Launch Autoscaling Group
      ASG v1 (PENDING)
      Cache warming
      Launch status check
      "go/no-go"
      OOPS I launched the wrong thing -- run away!
  22. Deployment #1: Deploy Autoscaling Group
      ASG v1 (GO LIVE)
  23. Deployment #1: Deploy Autoscaling Group
      ASG v1
  24. Deployment #2: Launch Autoscaling Group
      ASG v1, ASG v2 (PENDING)
  25. Deployment #2: Deploy Autoscaling Group
      ASG v1, ASG v2 (GO LIVE)
  26. Deployment #2: Multiple ASG Backends
      ASG v1, ASG v2
  27. Deployment #2: REVERT!
      ASG v1, ASG v2
  28. Deployment #2: Multiple ASG Backends
      ASG v1, ASG v2
  29. Deployment #2: Remove ASG v1
      ASG v1, ASG v2
  30. Deployment #2: Suspend Processes on ASG v1
      ASG v1 (no scaling, no ELB), ASG v2
  31. Deployment #2: Delete ASG v1
      ASG v2
  32. Deployment #2: ASG v2 Deployed
      ASG v2
  33. Suspend/Resume
      • Launch
      • Terminate
      • AddToLoadBalancer
      • AlarmNotification
      • AZRebalance
      • HealthCheck
      • ReplaceUnhealthy
      • ScheduledActions
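The processes listed above can be suspended and resumed per group. A small Python sketch, assuming a boto3-style Auto Scaling client; the wrapper names and the up-front validation are illustrative additions, and the set of names is limited to those on the slide:

```python
# Process names exactly as listed on the slide.
KNOWN_PROCESSES = {
    "Launch", "Terminate", "AddToLoadBalancer", "AlarmNotification",
    "AZRebalance", "HealthCheck", "ReplaceUnhealthy", "ScheduledActions",
}


def _checked(processes):
    unknown = set(processes) - KNOWN_PROCESSES
    if unknown:
        raise ValueError(f"unknown scaling processes: {sorted(unknown)}")
    return sorted(processes)


def suspend(autoscaling, group_name, processes):
    """Suspend the named scaling processes on one group."""
    autoscaling.suspend_processes(
        AutoScalingGroupName=group_name,
        ScalingProcesses=_checked(processes),
    )


def resume(autoscaling, group_name, processes):
    """Resume previously suspended processes on one group."""
    autoscaling.resume_processes(
        AutoScalingGroupName=group_name,
        ScalingProcesses=_checked(processes),
    )
```

For example, `suspend(client, "content-v1", ["AlarmNotification", "AddToLoadBalancer"])` would stop alarm-driven scaling and new load balancer registrations on a hypothetical old group; which processes the speaker actually suspended is not stated.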
  34. Standby State
      • Removes instances from autoscaling group
      • Resources are still managed by the group
      • Option to maintain capacity while in standby
      • Once ready, return the instance to service
      • Great for debugging w/o affecting capacity
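One way to picture the standby workflow above is as a context manager that puts an instance into standby, lets you work on it, and always returns it to service. This is a hedged Python sketch against a boto3-style Auto Scaling client; the `standby` helper and its `keep_capacity` flag are hypothetical names.

```python
from contextlib import contextmanager


@contextmanager
def standby(autoscaling, group_name, instance_id, keep_capacity=True):
    """Hold one instance in Standby for the duration of the with-block."""
    autoscaling.enter_standby(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        # False keeps the desired capacity unchanged, so the group
        # launches a replacement while this instance is in standby.
        ShouldDecrementDesiredCapacity=not keep_capacity,
    )
    try:
        yield instance_id
    finally:
        # Return the instance to service even if debugging raised.
        autoscaling.exit_standby(
            InstanceIds=[instance_id],
            AutoScalingGroupName=group_name,
        )
```

Usage would look like `with standby(client, "content-v2", "i-0abc"): ...`, with the debugging work done inside the block.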
  35. Attach/Detach Instances
      • Relatively new feature
      • Use to attach to a pre-launch testing ASG/ELB
      • Move instances to production ASG when ready
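The move from a testing group to production described above could be sketched as a detach followed by an attach. This Python example assumes a boto3-style Auto Scaling client; the `promote` helper and the group names in the usage note are hypothetical, and the talk lists this workflow as an idea, not as something already implemented.

```python
def promote(autoscaling, instance_ids, staging_group, production_group):
    """Move tested instances from a staging ASG into the production ASG."""
    ids = list(instance_ids)
    autoscaling.detach_instances(
        InstanceIds=ids,
        AutoScalingGroupName=staging_group,
        # Shrink the staging group so it does not launch replacements.
        ShouldDecrementDesiredCapacity=True,
    )
    # Caveat: detaching is asynchronous. Real code would likely need to
    # wait until the instances have left the staging group before attaching.
    autoscaling.attach_instances(
        InstanceIds=ids,
        AutoScalingGroupName=production_group,
    )
    return ids
```

A call such as `promote(client, ["i-0abc"], "content-test", "content-prd")` would hand one instance over to the production group.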
  36. Deployment Procedure
      1. Build the app.
      2. Create snapshot and register it in etcd.
      3. Launch a deployment with the build snapshot.
      4. Perform pre-launch tasks (warming, etc.).
      5. Release deployment (completes lifecycle).
      6. Revert to or remove old deployment.
      7. Delete old deployment.
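The numbered procedure above can be read as a simple driver loop. The Python sketch below walks the steps in order; the `ops` object and every method on it (`build`, `snapshot`, `launch`, `prelaunch`, `release`, `remove`, `delete`) are hypothetical placeholders for the real tooling, which the slides do not show. The revert branch of step 6 is left out for brevity.

```python
def run_deployment(ops, app, previous=None):
    """Run one deployment; return the deployment that ends up live."""
    build = ops.build(app)                # 1. build the app
    snapshot = ops.snapshot(build)        # 2. snapshot, registered in etcd
    new = ops.launch(app, snapshot)       # 3. launch stack, held by the hook
    if not ops.prelaunch(new):            # 4. warming and go/no-go check
        # No-go: throw the new stack away, the old one was never touched.
        ops.delete(new)
        return previous
    ops.release(new)                      # 5. complete the lifecycle action
    if previous is not None:
        ops.remove(previous)              # 6. take the old stack out of service
        ops.delete(previous)              # 7. delete the old stack
    return new
```

Because the old deployment is only removed after the new one has been released, a failed pre-launch check leaves production exactly as it was.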
  37. Future Work
      • Test instances with a pre-launch testing ELB
      • Register Jenkins builds for deployment
      • Support multiple environments
      • UI/Dashboard
