Practical Cloud & Workflow Orchestration

A presentation given at the 2011 Amazon AWS Genomics meeting held in Seattle, WA.

This is a 30-minute talk I gave focusing mainly on practical tools, tips and methods for bootstrapping and orchestration on the cloud.

Covers examples of:

Ubuntu Cloud Init
AWS Cloud Formation
Opscode Chef
MIT StarCluster

Practical Cloud & Workflow Orchestration

  1. Practical Cloud & Workflow Orchestration 2011 Amazon Genomics Event Chris Dagdigian chris@bioteam.net  
  2. Twitter: @chris_dag  I’m Chris. I’m an infrastructure geek. I work for the BioTeam.
  3. Disclaimer.
  4. I’m not an Amazon shill.
  5. Really.
  6. The IaaS competition just can’t compete.
  7. AWS lets me build useful stuff.
  8. When stuff gets built, I get paid.
  9. Installing VMware & excreting a press release does not turn a company into a cloud provider.
  10. I need more than just virtual compute and block storage. AWS has tons of glue and many useful IaaS building blocks.
  11. IaaS competitors lag far behind in features and service offerings.
  12. Speaking of pretenders…
  13. No APIs? Not a cloud.
  14. No self-service? Not a cloud.
  15. I have to email a human? Not a cloud.
  16. 50% failure rate on server launch? Lame cloud.
  17. Virtual servers & block storage only? Barely a cloud.
  18. I’m getting insufferable, huh? Moving on …
  19. Three Topics Today.
  20. Time, Laziness & Beauty.
  21. Tick … Tick Tick… image: shanelin via flickr
  22. User expectations are changing. image: shanelin via flickr
  23. Automated provisioning can shrink the time between “I want to do some science” and “I’m ready to do some science”. image: shanelin via flickr
  24. However… image: shanelin via flickr
  25. If servers, storage and systems can be deployed in minutes … image: shanelin via flickr
  26. … why does it still take days, several helpdesk tickets and a team of humans to load software and configure my systems to actually do science? image: shanelin via flickr
  27. It shouldn’t. image: shanelin via flickr
  28. If provisioning gets faster, configuration management also needs to keep pace. image: shanelin via flickr
  29. Laziness.
  30. Larry Wall’s 1st Great Virtue
  31. “… the quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful, and document what you wrote so you don’t have to answer so many questions about it.”
  32. It’s all scriptable.
  33. •  Servers •  Storage •  Network •  Bootstrapping •  Provisioning •  Configuration •  Management •  Monitoring •  Scaling •  Accounting & audit trails
  34. Not hype. Real.
  35. I can do it from my iPad.
  36. No cubicle required.
  37. Our research IT infrastructures can now be 100% virtual and 100% scriptable
  38. And it’s pretty easy to understand.
  39. Anyone can drive this stuff.
  40. Especially motivated researchers.
  41. Stuff like this is a big deal.
  42. 5GB managed MySQL in the cloud. $.011 / hour
  43. Database Administrator not required.
  44. Automatic patching, backups & clustering
  45. Anyone with a web browser can launch one.
  46. Beauty.
  47. Scriptable infrastructure is just the beginning.
  48. The really cool stuff is what we build on top.
  49. With good tools …
  50. We can orchestrate complex systems, pipelines and workflows.
  51. Orchestrated systems working in concert are a beautiful thing.
  52. Let me show you a few of the tools we like.
  53. Cloud Init
  54. Cloud Init •  https://help.ubuntu.com/community/UEC •  Developed by Ubuntu •  Baked into all Ubuntu UEC releases •  Also baked into Amazon Linux AMIs •  Works on Eucalyptus clouds as well
  55. Cloud Init gives you a hook into freshly booted systems.
  56. It’s a great and easy-to-comprehend way to bootstrap or customize generic server images.
  57. When you launch a server, you can inject a YAML-formatted file into the environment.
  58. Cloud init files are parsed and executed right after the node boots for the first time.
  59. You can run scripts, install software, load SSH keys, etc. to ‘bootstrap’ a generic node.
  60. #cloud-config
      packages:
       - httpd
      runcmd:
       - /etc/init.d/httpd start
       - echo "<h1>Hello Amazon Genomics Event!</h1>" > /var/www/html/index.html
  61. Previous real-world example does this: 1.  Download/install Apache web server 2.  Turn on the web server 3.  Create a cheesy index.html
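A minimal sketch of how the cloud-config from slide 60 could be written to disk and sanity-checked before launch. The file name matches the one passed to the launch script on slide 63; the HTML tags and the `>` redirect are reconstructed from the flattened slide text, so treat them as an assumption. The only rule the check relies on is that cloud-init recognizes cloud-config user data by its first line.

```shell
#!/bin/sh
# Sketch: write the cloud-config user data, then sanity-check it before launch.
# USER_DATA defaults to the file name used by the launch script on slide 63.
USER_DATA="${USER_DATA:-./cloudInit-config.txt}"

# The quoted 'EOF' delimiter stops the local shell from expanding anything
# inside the heredoc, so the file is written exactly as shown.
# (Tags and redirect below are reconstructed from the slide: an assumption.)
cat > "$USER_DATA" <<'EOF'
#cloud-config
packages:
 - httpd

runcmd:
 - /etc/init.d/httpd start
 - echo "<h1>Hello Amazon Genomics Event!</h1>" > /var/www/html/index.html
EOF

# cloud-init treats user data as cloud-config only when the first line is
# exactly "#cloud-config", so check that before paying for an instance.
if [ "$(head -n 1 "$USER_DATA")" = "#cloud-config" ]; then
  echo "user data OK: $USER_DATA"
else
  echo "missing #cloud-config header in $USER_DATA" >&2
  exit 1
fi
```

Catching a missing header locally is cheaper than discovering it after the instance has booted with an untouched image.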
  62. This is the script I ran moments before this talk …
  63. #!/bin/sh
      ec2-run-instances ami-8c1fece5 \
        -n 1 \
        -t m1.small \
        -g dagdemo-SG \
        -k dagdemo-sshkeypair \
        --user-data-file ./cloudInit-config.txt
  64. Important to understand: •  ami-8c1fece5 is an Amazon Linux public AMI •  No web server pre-installed •  Never before been ‘touched’ by me •  Cloud Init does it all via the script I injected at instance launch time
  65. Let’s see if it worked
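One way to "see if it worked" without refreshing a browser is to poll the new instance until the web server answers. The helper below is a sketch, not something shown in the talk: `wait_for_http` is a hypothetical name, it assumes `curl` is installed, and the retry count and delay defaults are arbitrary.

```shell
#!/bin/sh
# wait_for_http URL [TRIES] [DELAY_SECONDS]   (hypothetical helper)
# Poll URL until it answers with a successful HTTP status, or give up.
wait_for_http() {
  url=$1
  tries=${2:-30}
  delay=${3:-10}
  attempt=1
  while [ "$attempt" -le "$tries" ]; do
    # -f: treat HTTP error statuses as failure, -s: no progress output,
    # --max-time: cap each probe so a hung connection cannot stall the loop
    if curl -fs --max-time 5 "$url" >/dev/null 2>&1; then
      echo "up after $attempt attempt(s): $url"
      return 0
    fi
    # Only sleep if another attempt is coming.
    if [ "$attempt" -lt "$tries" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  echo "gave up after $tries attempt(s): $url" >&2
  return 1
}
```

Usage would look like `wait_for_http "http://$PUBLIC_DNS/"`, where `PUBLIC_DNS` holds the public hostname reported for the instance after launch. A freshly booted node needs time to install packages, so a few failed probes at the start are expected.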
  66. Amazon CloudFormation
  67. Amazon CloudFormation •  http://aws.amazon.com/cloudformation/ •  AWS specific •  Sweet way to turn on|off entire stacks of related and dependent AWS services
  68. Treat complex infrastructure as a single resource •  Cliché example - In a single “stack” you can define and then start/stop: •  Elastic database cluster + •  Elastic webserver cluster + •  Monitoring & auto-scaling triggers •  Event & error notification •  Elastic load balancer
  69. My live demo of CloudFormation •  Using the example WordPress Blog template •  It does a ton of cool stuff: •  RDS backend for MySQL database, elastic webserver cluster with auto-scaling, security group setup, automatic scaling, automatic alarm notices •  It all sits behind an elastic load balancer
  70. My CloudFormation blog demo: •  Actual stack file at http://biote.am/6d •  Check it out … •  JSON-formatted but still quite readable •  It lets me define and then control a ton of different related AWS services all at once.
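To give a feel for what a JSON-formatted stack file looks like, here is a deliberately tiny sketch that declares a single resource. It is not the WordPress template from the demo: the file name `cf-sketch.json` and the resource name `DemoWebSecurityGroup` are made up for illustration, and a real stack would declare many more resources.

```shell
#!/bin/sh
# Sketch: write a one-resource CloudFormation template and list what it declares.
# TEMPLATE and DemoWebSecurityGroup are hypothetical names, not from the talk.
TEMPLATE="${TEMPLATE:-./cf-sketch.json}"

cat > "$TEMPLATE" <<'EOF'
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Sketch: one security group that allows inbound HTTP",
  "Resources": {
    "DemoWebSecurityGroup": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "Allow HTTP from anywhere",
        "SecurityGroupIngress": [
          { "IpProtocol": "tcp", "FromPort": "80", "ToPort": "80", "CidrIp": "0.0.0.0/0" }
        ]
      }
    }
  }
}
EOF

# Show the resource types the template declares.
grep '"Type"' "$TEMPLATE"
```

Every entry under `Resources` follows the same shape (a logical name, a `Type`, and `Properties`), which is why even a large stack file stays readable.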
  71. #!/bin/sh
      # Launch Stack
      cfn-create-stack AWSGenomics-demoStack \
        --template-file cf-wordpress.json.txt
  72. #!/bin/sh
      # Check state & status
      cfn-describe-stacks AWSGenomics-demoStack
      echo
      cfn-describe-stack-events \
        AWSGenomics-demoStack --headers
  73. 10 AWS Services/Resources orchestrated as one.
  74. Cloudwatch.
  75. Auto-scaling triggers.
  76. SNS Endpoints for Alarms.
  77. Alarm triggers.
  78. RDS Database Security Group.
  79. Elastic Load Balancer.
  80. EC2 Security Group.
  81. Cool, huh?
  82. { in case the demo fails! }
  83. Opscode Chef
  84. Chef enables Infrastructure as Code
  85. It’s freaking awesome.
  86. Chef lets you: Manage configuration as idempotent Resources. Group resources as idempotent Recipes. Group recipes into Roles. Track it all like Source Code. Search your infrastructure like a ninja. Ohai! Configure your systems, software & pipelines
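"Idempotent" is the key word on this slide: applying the same resource twice leaves the system in the same state as applying it once. Chef resources are written in Chef's own Ruby DSL, so the block below is not Chef code. It is a plain-shell sketch of the idea, with a hypothetical `ensure_line` helper and a made-up `/opt/maq/bin` path.

```shell
#!/bin/sh
# ensure_line LINE FILE   (hypothetical helper, not part of Chef)
# Make sure FILE contains LINE exactly once; do nothing if it is already there.
ensure_line() {
  line=$1
  file=$2
  # -x: match the whole line, -F: fixed string (no regex), -q: quiet
  if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
    echo "unchanged: $file"
    return 0
  fi
  printf '%s\n' "$line" >> "$file"
  echo "updated: $file"
}

# Demo: the first call changes the file, the second call is a no-op.
demo_file=$(mktemp)
ensure_line "export PATH=/opt/maq/bin:\$PATH" "$demo_file"
ensure_line "export PATH=/opt/maq/bin:\$PATH" "$demo_file"
rm -f "$demo_file"
```

Because the second call changes nothing, the same script can be re-run on every node at any time, which is the property that makes repeated configuration runs safe.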
  87. http://www.opscode.com/chef/ •  Several flavors •  Open source •  Commercial / Managed •  Commercial / ‘Behind your Firewall’ •  No time today for even a short description of how it works. You should check it out.
  88. Chef demo via ‘knife’ command line …
  89. knife ec2 server create \
        -N aws-genomicsDemo \
        -I ami-63be790a \
        -f t1.micro \
        -G default \
        -S bioteam-IAM-admins-v1 \
        -r 'recipe[getting-started]' \
        -i ./bioteam-IAM-admins-v1.pem \
        -x ubuntu
  90. Fully automatic remote bootstrapping …
  91. Done!
  92. Search-driven, parallel remote SSH execution
  93. knife ssh name:aws-genomicsDemo \
        -a cloud.public_hostname \
        -x ubuntu \
        -i bioteam-IAM-admins-v1.pem \
        "sudo chef-client; cat /tmp/chef-getting-started.txt"
  94. Let’s install some genomics tools •  Our Maq short read assembler cookbook: •  Installs all dependencies (compilers, etc.) •  Puts application source on node •  Builds maq from source •  Installs it
  95. $ knife node run_list add aws-genomicsDemo 'recipe[maq]'
  96. It really is that easy.
  97. MIT StarCluster
  98. MIT StarCluster •  http://web.mit.edu/stardev/cluster •  Ready to use Linux compute farm on AWS •  Grid Engine, MPI, NFS filesystems •  Libraries, tools, applications •  Easy to use, easy to extend •  Integrates well with Chef
  99. If you have not built Linux clusters from scratch before …
  100. It’s hard to really appreciate everything that StarCluster does behind the scenes.
  101. MIT StarCluster – More Info •  Live demo (time permitting) •  StarCluster Spot Instances Screencast •  http://biote.am/6c •  http://aws.amazon.com/ec2/spot-and-science/
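The slides do not show any StarCluster commands, so the following is only a sketch of what a basic cluster lifecycle might look like, assuming StarCluster is installed and its config file already defines a cluster template and AWS credentials. The cluster tag `genomics-demo` and the `run` wrapper are hypothetical; `start`, `sshmaster` and `terminate` are StarCluster subcommands as I understand its documentation. The wrapper defaults to dry-run mode so nothing is launched by accident.

```shell
#!/bin/sh
# run: print the command in dry-run mode (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

CLUSTER="${CLUSTER:-genomics-demo}"   # hypothetical cluster tag

run starcluster start "$CLUSTER"       # boot the nodes described in the config
run starcluster sshmaster "$CLUSTER"   # log in to the master node
run starcluster terminate "$CLUSTER"   # shut the cluster down when finished
```

Setting `DRY_RUN=0` would execute the commands for real, which starts billable instances.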
  102. Phew. That’s a lot of slides.
  103. Time to explore the demos?
  104. Questions?
  105. Thanks! Related talk slides: http://biote.am/6a “Mapping Informatics to the Cloud”
