Practical Cloud & Workflow Orchestration


A presentation given at the 2011 Amazon AWS Genomics meeting held in Seattle, WA.

This is a 30 minute talk I gave focusing mainly on practical tools, tips and methods for bootstrapping and orchestration on the cloud.

Covers examples of:

Ubuntu Cloud Init
AWS Cloud Formation
Opscode Chef
MIT StarCluster




Usage Rights

© All Rights Reserved

    Practical Cloud & Workflow Orchestration: Presentation Transcript

    • Practical Cloud & Workflow Orchestration 2011 Amazon Genomics Event Chris Dagdigian chris@bioteam.net  
    • Twitter: @chris_dag. I’m Chris. I’m an infrastructure geek. I work for the BioTeam.
    • Disclaimer.
    • I’m not an Amazon shill.
    • Really.
    • The IaaS competition just can’t compete.
    • AWS lets me build useful stuff.
    • When stuff gets built, I get paid.
    • Installing VMware & excreting a press release does not turn a company into a cloud provider.
    • I need more than just virtual compute and block storage. AWS has tons of glue and many useful IaaS building blocks.
    • IaaS competitors lag far behind in features and service offerings.
    • Speaking of pretenders…
    • No APIs? Not a cloud.
    • No self-service? Not a cloud.
    • I have to email a human? Not a cloud.
    • 50% failure rate on server launch? Lame cloud.
    • Virtual servers & block storage only? Barely a cloud.
    • I’m getting insufferable, huh?  Moving on …
    • Three Topics Today.
    • Time, Laziness & Beauty.
    • Tick … Tick Tick… image: shanelin via flickr
    • User expectations are changing.
    • Automated provisioning can shrink the time between “I want to do some science” & “I’m ready to do some science”.
    • However…
    • If servers, storage and systems can be deployed in minutes …
    • … why does it still take days, several helpdesk tickets & a team of humans to load software and configure my systems to actually do science?
    • It shouldn’t.
    • If provisioning gets faster, configuration management also needs to keep pace.
    • Laziness.
    • Larry Wall’s 1st Great Virtue
    • “… the quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful, and document what you wrote so you don’t have to answer so many questions about it.”
    • It’s all scriptable.
    • Servers
    • Storage
    • Network
    • Bootstrapping
    • Provisioning
    • Configuration
    • Management
    • Monitoring
    • Scaling
    • Accounting & audit trails
    • Not hype. Real.
    • I can do it from my iPad.
    • No cubicle required.
    • Our research IT infrastructures can now be 100% virtual and 100% scriptable
    • And it’s pretty easy to understand.
    • Anyone can drive this stuff.
    • Especially motivated researchers.
    • Stuff like this is a big deal.
    • 5GB managed MySQL in the cloud. $.011 / hour
    • Database Administrator not required.
    • Automatic patching, backups & clustering
    • Anyone with a web browser can launch one.
    • Beauty.
    • Scriptable infrastructure is just the beginning.
    • The really cool stuff is what we build on top.
    • With good tools …
    • We can orchestrate complex systems, pipelines and workflows.
    • Orchestrated systems working in concert are a beautiful thing.
    • Let me show you a few of the tools we like.
    • Cloud Init
    • Cloud Init
      • https://help.ubuntu.com/community/UEC
      • Developed by Ubuntu
      • Baked into all Ubuntu UEC releases
      • Also baked into Amazon Linux AMIs
      • Works on Eucalyptus clouds as well
    • Cloud Init gives you a hook into freshly booted systems.
    • It’s a great and easy-to-comprehend way to bootstrap or customize generic server images.
    • When you launch a server, you can inject a YAML-formatted file into the environment.
    • Cloud init files are parsed and executed right after the node boots for the first time.
    • You can run scripts, install software, load SSH keys, etc. to ‘bootstrap’ a generic node.
    • #cloud-config
      packages:
       - httpd

      runcmd:
       - /etc/init.d/httpd start
       - echo "<h1>Hello Amazon Genomics Event!</h1>" > /var/www/html/index.html
    • Previous real-world example does this:
      1. Download/install Apache web server
      2. Turn on the web server
      3. Create a cheesy index.html
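The talk mentions that Cloud Init can also load SSH keys, which the example above does not show. Here is a minimal sketch of that, assuming the standard `ssh_authorized_keys` cloud-config key; the file name `cloudInit-keys.txt` and the placeholder public key are hypothetical and not part of the original demo.

```shell
#!/bin/sh
# Sketch only: write a cloud-config file that installs Apache AND adds an SSH key.
# ASSUMPTIONS: the output file name and the placeholder key are made up for
# illustration; replace the key line with a real public key before use.
cat > cloudInit-keys.txt <<'EOF'
#cloud-config
ssh_authorized_keys:
 - ssh-rsa REPLACE_WITH_YOUR_PUBLIC_KEY you@example.org

packages:
 - httpd

runcmd:
 - /etc/init.d/httpd start
EOF

echo "wrote cloudInit-keys.txt"
```

The resulting file would be passed at launch time with the same `--user-data-file` option used in the talk's launch script.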
    • This is the script I ran moments before this talk …
    • #!/bin/sh

      ec2-run-instances ami-8c1fece5 \
        -n 1 \
        -t m1.small \
        -g dagdemo-SG \
        -k dagdemo-sshkeypair \
        --user-data-file ./cloudInit-config.txt
    • Important to understand:
      • ami-8c1fece5 is the Amazon Linux public AMI
      • No web server pre-installed
      • Never before been ‘touched’ by me
      • Cloud Init does it all via the script I injected at instance launch time
    • Let’s see if it worked …
    • Amazon CloudFormation
    • Amazon CloudFormation
      • http://aws.amazon.com/cloudformation/
      • AWS specific
      • Sweet way to turn on|off entire stacks of related and dependent AWS services
    • Treat complex infrastructure as a single resource
      • Cliché example - In a single “stack” you can define and then start/stop:
        • Elastic database cluster +
        • Elastic webserver cluster +
        • Monitoring & auto-scaling triggers
        • Event & error notification
        • Elastic load balancer
    • My live demo of CloudFormation
      • Using the example WordPress Blog template
      • It does a ton of cool stuff:
        • RDS backend for the MySQL database, elastic webserver cluster with auto-scaling, security group setup, automatic alarm notices
        • It all sits behind an elastic load balancer
    • My CloudFormation blog demo:
      • Actual stack file at http://biote.am/6d
      • Check it out …
        • JSON formatted but still quite readable
      • It lets me define and then control a ton of different related AWS services all at once.
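To give a feel for what a JSON stack file looks like without reproducing the full WordPress template, here is a deliberately tiny sketch. It is not the demo template from the talk: the file name `cf-minimal.json.txt`, the stack name in the comment, and the resource name `DemoWebSG` are all hypothetical, and the stack defines just one EC2 security group.

```shell
#!/bin/sh
# Sketch only: write a minimal CloudFormation template containing one resource.
# ASSUMPTIONS: file name and the logical resource name "DemoWebSG" are
# illustrative; the real demo template defines many more resources.
cat > cf-minimal.json.txt <<'EOF'
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Minimal illustrative stack: one security group allowing HTTP",
  "Resources": {
    "DemoWebSG": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "Allow inbound HTTP",
        "SecurityGroupIngress": [
          { "IpProtocol": "tcp", "FromPort": "80", "ToPort": "80", "CidrIp": "0.0.0.0/0" }
        ]
      }
    }
  }
}
EOF

# Launching it would follow the same pattern as the talk's demo (not run here):
#   cfn-create-stack AWSGenomics-minimalStack --template-file cf-minimal.json.txt
echo "wrote cf-minimal.json.txt"
```

Every resource in a larger template follows the same shape: a logical name, a `Type`, and a `Properties` block.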
    • #!/bin/sh
      # Launch Stack

      cfn-create-stack AWSGenomics-demoStack \
        --template-file cf-wordpress.json.txt
    • #!/bin/sh
      # Check state & status

      cfn-describe-stacks AWSGenomics-demoStack
      echo ""
      cfn-describe-stack-events \
        AWSGenomics-demoStack --headers
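Checking status by hand gets old quickly, so a script would typically poll until the stack settles. The following is a rough sketch of such a loop, not something shown in the talk. It assumes the describe command prints the stack status string (`CREATE_COMPLETE`, `CREATE_FAILED`, or a `ROLLBACK_*` state) somewhere in its output; the function name `wait_for_stack` and the `DESCRIBE_CMD` override are hypothetical.

```shell
#!/bin/sh
# Sketch only: poll a stack until creation succeeds, fails, or times out.
# ASSUMPTION: the describe command's output contains the CloudFormation
# status string. DESCRIBE_CMD is a hypothetical override so a stub can be
# swapped in for testing; it defaults to cfn-describe-stacks.
# Usage: wait_for_stack STACK_NAME [TRIES] [DELAY_SECONDS]
# Returns 0 on success, 1 on failure/rollback, 2 on timeout.
wait_for_stack() {
    stack="$1"
    tries="${2:-30}"
    delay="${3:-10}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        out=$("${DESCRIBE_CMD:-cfn-describe-stacks}" "$stack" 2>&1)
        case "$out" in
            *CREATE_COMPLETE*)
                echo "$stack: ready"
                return 0 ;;
            *CREATE_FAILED*|*ROLLBACK*)
                echo "$stack: failed" >&2
                return 1 ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$stack: timed out" >&2
    return 2
}
```

With the demo stack this might be called as `wait_for_stack AWSGenomics-demoStack 60 15`, after which a follow-up step such as a smoke test could run only when the return code is 0.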
    • 10 AWS Services/Resources orchestrated as one.
    • Cloudwatch.
    • Auto-scaling triggers.
    • SNS Endpoints for Alarms.
    • Alarm triggers.
    • RDS Database & Security Group.
    • Elastic Load Balancer.
    • EC2 Security Group.
    • Cool, huh?
    • { in case the demo fails! }
    • Opscode Chef
    • Chef enables Infrastructure as Code
    • It’s freaking awesome.
    • Chef lets you:
      • Manage configuration as idempotent Resources.
      • Group resources as idempotent Recipes.
      • Group recipes into Roles.
      • Track it all like Source Code.
      • Search your infrastructure like a ninja. Ohai!
      • Configure your systems, software & pipelines
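"Idempotent" is the key word in that list: applying the same configuration twice should leave the system in the same state as applying it once. The sketch below illustrates the idea in plain shell. It is not how Chef implements resources, and the helper name `ensure_line` is hypothetical.

```shell
#!/bin/sh
# Sketch only: an idempotent "resource" in plain shell (NOT Chef's implementation).
# ensure_line FILE LINE appends LINE to FILE only if that exact line is not
# already present, so running it repeatedly converges on the same result.
ensure_line() {
    file="$1"
    line="$2"
    touch "$file"
    if grep -qxF -- "$line" "$file"; then
        echo "unchanged: $file"
    else
        printf '%s\n' "$line" >> "$file"
        echo "updated: $file"
    fi
}
```

Calling `ensure_line` twice with the same arguments reports `updated` the first time and `unchanged` the second, which is the property that makes it safe to re-run a configuration step on every node, every time.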
    • http://www.opscode.com/chef/
      • Several flavors
        • Open source
        • Commercial / Managed
        • Commercial / ‘Behind your Firewall’
      • No time today for even a short description of how it works. You should check it out.
    • Chef demo via ‘knife’ command line …
    • knife ec2 server create \
        -N aws-genomicsDemo \
        -I ami-63be790a \
        -f t1.micro \
        -G default \
        -S bioteam-IAM-admins-v1 \
        -r 'recipe[getting-started]' \
        -i ./bioteam-IAM-admins-v1.pem \
        -x ubuntu
    • Fully automatic remote bootstrapping …
    • Done!
    • Search-driven, parallel remote SSH execution
    • knife ssh name:aws-genomicsDemo \
        -a cloud.public_hostname \
        -x ubuntu \
        -i bioteam-IAM-admins-v1.pem \
        "sudo chef-client; cat /tmp/chef-getting-started.txt"
    • Let’s install some genomics tools
      • Our Maq short read assembler cookbook:
        • Installs all dependencies (compilers, etc.)
        • Puts application source on node
        • Builds maq from source
        • Installs it
    • $ knife node run_list add aws-genomicsDemo 'recipe[maq]'
    • It really is that easy.
    • MIT StarCluster
    • MIT StarCluster
      • http://web.mit.edu/stardev/cluster
      • Ready to use Linux compute farm on AWS
        • Grid Engine, MPI, NFS filesystems
        • Libraries, tools, applications
      • Easy to use, easy to extend
      • Integrates well with Chef
    • If you have not built Linux clusters from scratch before …
    • It’s hard to really appreciate everything that StarCluster does behind the scenes.
    • MIT StarCluster – More Info
      • Live demo (time permitting)
      • StarCluster & Spot Instances Screencast
        • http://biote.am/6c
        • http://aws.amazon.com/ec2/spot-and-science/
    • Phew. That’s a lot of slides.
    • Time to explore the demos?
    • Questions?
    • Thanks! Related talk slides: “Mapping Informatics to the Cloud”, http://biote.am/6a