Zero to Prod in Crazy Time
John Martinez | Adobe Cloud Services
Zero to Production in Crazy Time: Adobe’s Transformation

Adobe has quickly scaled from nothing to a huge presence in the AWS cloud.

This is the story from the trenches: how we screwed up, learned and evolved our use of Chef to help get us to today. Taming Chef to work in the AWS cloud while trying to build a platform at a large scale was not as easy as we originally planned, and we’re consistently trying to make it better. We’ll share some tips and tricks from our experience.

Published in: Technology, Self Improvement
Zero to Production in Crazy Time: Adobe’s Transformation

  1. Zero to Prod in Crazy Time
     John Martinez | Adobe Cloud Services
  2. About Me
     • Currently working as a Cloud Operations Engineer at Adobe
     • I get to figure out new stuff, and make really old stuff work in AWS
     • 20+ years doing UNIX/Linux work
     • Learned about cloud computing at Netflix
     • Working at Adobe feeds my habit - photography
  3. About Ops People
     Some people see us as Ninjas, I really see us as Storm Troopers
  4. Cloud Platforms @ Adobe
     • Creative Cloud
     • Marketing Cloud
     • Digital Publishing Suite
     • Phonegap
     • Typekit
     • Acrobat.com
     • Echosign
     • Revel
     • ...and growing...
  5. How We Got Started
     • Creative Cloud went live in late April 2012
     • AWS from the start
     • We needed to do SOMETHING
     • Yes, it was really that scientific of a decision
     • Chef vs. Puppet
     • That learning curve
  6. #EPICFAIL #1
     • Not socializing the need for Chef to the dev team
     • Once sold, keep momentum going
     • The “let’s make this more complicated than it needs to be” syndrome
     • Start with easy stuff first, then graduate
     • Ops guy admits: the dev people know how to use software engineering methods for creating and maintaining infrastructure code: USE IT
  7. Tweaking Knobs
     • EC2 AMIs: bake or configure?
       • Baking positive: fast boot times
       • Baking negative: too static
       • Configure positive: very dynamic
       • Configure negative: can take forever to boot
     • We settled on a mostly dynamic configuration, with some static baking
     • knife-ec2 is great, but what about autoscale?
     • The CloudFormation connection
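To make the "CloudFormation connection" concrete, here is a minimal sketch of how an Auto Scaling group might be described so that each new instance runs Chef at boot. It is not the template Adobe used. It assumes an AMI with chef-client and its configuration already baked in (the "some static baking" part), and the function name, resource names and file path are hypothetical. Launch configurations are the older Auto Scaling mechanism and are used here only because they fit the period of the talk.

```python
import json


def autoscale_template(ami_id, instance_type, chef_environment, zones,
                       min_size=1, max_size=2):
    """Build a CloudFormation template (as a dict) for a Chef-configured group.

    Assumption: the AMI already contains chef-client, /etc/chef/client.rb and
    /etc/chef/first-boot.json, so boot only has to run the client.
    """
    # The dynamic part: every instance converges itself on first boot.
    user_data = (
        "#!/bin/bash\n"
        f"chef-client -j /etc/chef/first-boot.json -E {chef_environment}\n"
    )
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "LaunchConfig": {  # hypothetical logical name
                "Type": "AWS::AutoScaling::LaunchConfiguration",
                "Properties": {
                    "ImageId": ami_id,
                    "InstanceType": instance_type,
                    "UserData": {"Fn::Base64": user_data},
                },
            },
            "Group": {  # hypothetical logical name
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "AvailabilityZones": list(zones),
                    "LaunchConfigurationName": {"Ref": "LaunchConfig"},
                    "MinSize": str(min_size),
                    "MaxSize": str(max_size),
                },
            },
        },
    }


if __name__ == "__main__":
    template = autoscale_template("ami-00000000", "m1.small", "prod",
                                  ["us-east-1a", "us-east-1b"])
    print(json.dumps(template, indent=2))
```

Because the group, not knife-ec2, launches the instances, the boot script is the only place left to start Chef, which is why the bake-versus-configure trade-off matters so much for boot time.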
  8. #EPICFAIL #2
     • Get Chef, don’t actually use it
     • Back to that learning curve (Hint: Training)
     • Issue with compressed timelines and small staff
     • In the heat of deploying prod, doing stupid things
     • Losing track of what got deployed where
     • Who’s doing what?
     • Not sleeping sucks
  9. Out of the Rubble
     • Now that we’re live: refactor time (a.k.a. fix all the broken stuff)
     • Chef development for reals
     • OMG: WINDOWS?!?!
     • Not a lot of expertise in-house or outside
     • Ops guy admits: learned to love dev tools like Jenkins and Git
  10. It’s Alive!
      • Did it gradually over time
      • Started with simple recipes, graduated to more complicated ones
      • Using Environments to deploy the right thing in the right place
      • It’s AWS, stupid: you SHOULD kill your instances
      • CloudFormation to AutoScale to Chef Client
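  11. It’s Alive (v1) [architecture diagram; text recovered from the image]
      • Components: EC2 Instances, S3 Bucket (validator key), CloudFormation, AutoScale Group, Hosted Chef
      • Step 0, Manual: Editor (vi), Perforce, cfn-create-stack
      • Step 1, knife upload: Cookbooks, Environment, Roles, Data bags
      • Steps 2 and 3: labels not recoverable from the extracted text
      • Step 4, Chef Client: Bootstrap, Data Bag Key, Recipes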
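  12. More Automation (v2) [architecture diagram; text recovered from the image]
      • Components: EC2 Instances, S3 Bucket (validator key), CloudFormation, AutoScale Group, Hosted Chef
      • Step 0, Automated: Git, Jenkins, Jenkins CFN
      • Step 1, knife upload: Cookbooks, Environment, Roles, Data bags
      • Steps 2 and 3: labels not recoverable from the extracted text
      • Step 4, Chef Client: Bootstrap, Data Bag Key, Recipes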
  13. On Bootstrapping EC2 Instances
      • Biggest issue with Chef in AWS: straying from knife-ec2
      • Read the bootstrap document and reverse engineer it
        • http://wiki.opscode.com/display/chef/Client+Bootstrap+Fast+Start+Guide
        • http://wiki.opscode.com/display/chef/EC2+Bootstrap+Fast+Start+Guide
      • user-data is your friend
        • Use it for node identity
        • Resist the devil: don’t send any API keys or passwords or embarrassing things via user-data!!!
      • Windows works this way, too, but learn PowerShell
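The user-data advice can be sketched as a small helper that carries only node identity and refuses anything that looks like a credential. This is an illustration under stated assumptions, not the deck's actual bootstrap code: the field names, the function name and the list of forbidden key fragments are all hypothetical.

```python
import json

# Hypothetical fragments that mark a key as secret-like.
FORBIDDEN_FRAGMENTS = ("password", "secret", "token", "api_key", "access_key")


def build_user_data(environment, role, run_list, **extra):
    """Return a JSON user-data string holding node identity only.

    Raises ValueError if any key name looks like it carries a credential,
    since user-data is readable by anything running on the instance.
    """
    payload = {
        "chef_environment": environment,
        "role": role,
        "run_list": list(run_list),
    }
    payload.update(extra)
    for key in payload:
        lowered = key.lower()
        if any(fragment in lowered for fragment in FORBIDDEN_FRAGMENTS):
            raise ValueError(
                f"refusing to put secret-like key {key!r} in user-data"
            )
    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    print(build_user_data("prod", "web", ["role[web]"], service="example"))
```

A name-based check like this only catches obvious mistakes. Keeping secrets out of user-data still depends on fetching them at boot through a controlled path, such as the S3 bucket holding the validator key in the diagrams.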
  14. #EPICFAIL #3
      Oh crap, Opscode is DOWN!!!
  15. #EPICFAIL #3
      • Failing to architect for failure (double BAM)
      • Even though we built a hot AWS architecture, we still got bit
      • What does it mean when Hosted Chef is down for us?
      • Talk to Opscode...really, talk to them, they want to help
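One way to think about "what does it mean when Hosted Chef is down" is to let the bootstrap choose among more than one Chef server. The sketch below is an assumption about how that could look, not something the deck describes: the function name, the endpoint URLs and the injected health probe are hypothetical, and a real probe would need its own timeout and authentication handling.

```python
from typing import Callable, Iterable, Optional


def pick_chef_server(endpoints: Iterable[str],
                     is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first endpoint the probe reports healthy, else None.

    endpoints: candidate server URLs in order of preference.
    is_healthy: caller-supplied probe; network errors count as unhealthy.
    """
    for url in endpoints:
        try:
            if is_healthy(url):
                return url
        except OSError:
            # A connection failure just means "try the next candidate".
            continue
    return None


if __name__ == "__main__":
    # Hypothetical endpoints; the probe is stubbed so the example runs offline.
    candidates = ["https://hosted.example.com", "https://private.example.com"]
    print(pick_chef_server(candidates, lambda url: "private" in url))
```

Returning None rather than raising leaves the caller to decide whether an instance should boot from its baked state or fail fast when no server answers.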
  16. How We’re Trying to Improve
      • Mostly around availability
        • Augment Hosted Chef with Private Chef
      • Mostly around security
        • Use the tools at your disposal
        • IAM policies for EC2 roles and S3 bucket security
      • Mostly around performance
        • Refactoring AWS-related code to use AWS SDK for Ruby
        • AMI factory from base Amazon Linux or Ubuntu AMIs (bonus points for Windows)
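The "IAM policies for EC2 roles and S3 bucket security" item can be illustrated with a policy document that lets an instance role read exactly one object, such as the validator key, and nothing else. This is a sketch under assumptions: the bucket name, key name and function name are hypothetical, and the deck does not show the policies Adobe actually used.

```python
import json


def validator_key_read_policy(bucket, key):
    """Build an IAM policy document granting read access to a single S3 object."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                # Scope to one object, not the whole bucket.
                "Resource": f"arn:aws:s3:::{bucket}/{key}",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and key names.
    policy = validator_key_read_policy("example-chef-bootstrap", "validation.pem")
    print(json.dumps(policy, indent=2))
```

Attaching a policy like this to the instance's role means the boot script can fetch the key with temporary role credentials, so no long-lived API keys need to travel in user-data.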
  17. The End
      • Operational scripts, template examples and other bits
        • https://github.com/Adobe-CloudOps
      • Contact me:
        • @johnmartinez
        • martinez@adobe.com
      • Questions? Suggestions? Come talk to me after!