Practical Cloud & Workflow Orchestration





A presentation given at the 2011 Amazon AWS Genomics meeting held in Seattle, WA.

This is a 30-minute talk I gave, focusing mainly on practical tools, tips and methods for bootstrapping and orchestration on the cloud.

Covers examples of:

Ubuntu Cloud Init
AWS CloudFormation
Opscode Chef
MIT StarCluster







Usage Rights

© All Rights Reserved


Practical Cloud & Workflow Orchestration: Presentation Transcript

  • Practical Cloud & Workflow Orchestration 2011 Amazon Genomics Event Chris Dagdigian  
  • Twitter: @chris_dag  I’m Chris. I’m an infrastructure geek. I work for the BioTeam.
  • Disclaimer.
  • I’m not an Amazon shill.
  • Really.
  • The IaaS competition just can’t compete.
  • AWS lets me build useful stuff.
  • When stuff gets built, I get paid.
  • Installing VMware & excreting a press release does not turn a company into a cloud provider.
  • I need more than just virtual compute and block storage. AWS has tons of glue and many useful IaaS building blocks.
  • IaaS competitors lag far behind in features and service offerings.
  • Speaking of pretenders…
  • No APIs? Not a cloud.
  • No self-service? Not a cloud.
  • I have to email a human? Not a cloud.
  • 50% failure rate on server launch? Lame cloud.
  • Virtual servers & block storage only? Barely a cloud.
  • I’m getting insufferable, huh?  Moving on …
  • Three Topics Today.
  • Time, Laziness & Beauty.
  • Tick … Tick Tick… image: shanelin via flickr
  • User expectations are changing.
  • Automated provisioning can shrink the time between “I want to do some science” & “I’m ready to do some science”.
  • However…
  • If servers, storage and systems can be deployed in minutes …
  • … why does it still take days, several helpdesk tickets & a team of humans to load software and configure my systems to actually do science?
  • It shouldn’t.
  • If provisioning gets faster, configuration management also needs to keep pace.
  • Laziness.
  • Larry Wall’s 1st Great Virtue
  • “… the quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful, and document what you wrote so you don’t have to answer so many questions about it.”
  • It’s all scriptable.
  • Servers, storage, network, bootstrapping, provisioning, configuration, management, monitoring, scaling, accounting & audit trails
  • Not hype. Real.
  • I can do it from my iPad.
  • No cubicle required.
  • Our research IT infrastructures can now be 100% virtual and 100% scriptable
  • And it’s pretty easy to understand.
  • Anyone can drive this stuff.
  • Especially motivated researchers.
  • Stuff like this is a big deal.
  • 5GB managed MySQL in the cloud. $.011 / hour
  • Database Administrator not required.
  • Automatic patching, backups & clustering
  • Anyone with a web browser can launch one.
  • Beauty.
  • Scriptable infrastructure is just the beginning.
  • The really cool stuff is what we build on top.
  • With good tools …
  • We can orchestrate complex systems, pipelines and workflows.
  • Orchestrated systems working in concert are a beautiful thing.
  • Let me show you a few of the tools we like.
  • Cloud Init
  • Cloud Init
    • Developed by Ubuntu
    • Baked into all Ubuntu UEC releases
    • Also baked into Amazon Linux AMIs
    • Works on Eucalyptus clouds as well
  • Cloud Init gives you a hook into freshly booted systems.
  • It’s a great and easy-to-comprehend way to bootstrap or customize generic server images.
  • When you launch a server, you can inject a YAML-formatted file into the environment.
  • Cloud Init files are parsed and executed right after the node boots for the first time.
  • You can run scripts, install software, load SSH keys, etc. to ‘bootstrap’ a generic node.
  • #cloud-config
    packages:
     - httpd

    runcmd:
     - /etc/init.d/httpd start
     - echo "<h1>Hello Amazon Genomics Event!</h1>" > /var/www/html/index.html
  • Previous real-world example does this:
    1. Download/install Apache web server
    2. Turn on the web server
    3. Create a cheezy index.html
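The example above covers installing packages and running commands. The “load SSH keys” case mentioned earlier is not shown in the talk, so here is a minimal sketch of how it might look. The output path and the public key are placeholders I made up for illustration; `ssh_authorized_keys` is a standard cloud-config key, but check the Cloud Init documentation for the release you run.

```shell
#!/bin/sh
# Sketch: generate a cloud-config file that installs a package, adds an SSH
# public key for the default user, and runs a command on first boot.
# The output path and the key below are placeholders, not from the talk.
set -eu

CONFIG_FILE=/tmp/cloud-config-sketch.txt
PUBLIC_KEY="ssh-rsa AAAA...placeholder user@example"

# The first line must be "#cloud-config" so Cloud Init parses the file as YAML.
cat > "$CONFIG_FILE" <<EOF
#cloud-config
packages:
 - httpd

ssh_authorized_keys:
 - $PUBLIC_KEY

runcmd:
 - /etc/init.d/httpd start
EOF

echo "Wrote $CONFIG_FILE"
```

The resulting file could then be passed at launch time the same way the talk does it, via `--user-data-file`.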
  • This is the script I ran moments before this talk …
  • #!/bin/sh
    # Launch one m1.small instance and inject the Cloud Init file as user data

    ec2-run-instances ami-8c1fece5 \
      -n 1 \
      -t m1.small \
      -g dagdemo-SG \
      -k dagdemo-sshkeypair \
      --user-data-file ./cloudInit-config.txt
  • Important to understand:
    • ami-8c1fece5 is Amazon Linux public AMI
    • No web server pre-installed
    • Never before been ‘touched’ by me
    • Cloud Init does it all via the script I injected at instance launch time
  • Let’s see if it worked …
  • Amazon CloudFormation
  • Amazon CloudFormation
    • AWS specific
    • Sweet way to turn on|off entire stacks of related and dependent AWS services
  • Treat complex infrastructure as a single resource
    • Cliché example - In a single “stack” you can define and then start/stop:
      • Elastic database cluster +
      • Elastic webserver cluster +
      • Monitoring & auto-scaling triggers
      • Event & error notification
      • Elastic load balancer
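To make the “single resource” idea concrete, here is a minimal sketch of a template with just two related resources: a security group and an instance that refers to it. It is far smaller than the WordPress template used in the demo and is not taken from the talk. The stack name, file path and resource names are hypothetical, and the AMI ID is simply reused from the Cloud Init slide as a placeholder. The script only prints the launch command rather than running it.

```shell
#!/bin/sh
# Sketch: write a tiny CloudFormation template with two related resources,
# then print (not run) the command that would launch them as one stack.
# Stack name, file path and resource names are placeholders, not from the talk.
set -eu

TEMPLATE_FILE=/tmp/cf-minimal-sketch.json
STACK_NAME=demo-minimal-stack

# Quoted 'EOF' keeps the shell from expanding anything inside the JSON.
cat > "$TEMPLATE_FILE" <<'EOF'
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Minimal sketch: one instance behind one security group",
  "Resources": {
    "DemoSecurityGroup": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "Allow inbound HTTP",
        "SecurityGroupIngress": [
          { "IpProtocol": "tcp", "FromPort": "80", "ToPort": "80", "CidrIp": "0.0.0.0/0" }
        ]
      }
    },
    "DemoInstance": {
      "Type": "AWS::EC2::Instance",
      "Properties": {
        "ImageId": "ami-8c1fece5",
        "InstanceType": "m1.small",
        "SecurityGroups": [ { "Ref": "DemoSecurityGroup" } ]
      }
    }
  }
}
EOF

# Dry run only: show the launch command instead of executing it.
echo "cfn-create-stack $STACK_NAME --template-file $TEMPLATE_FILE"
```

The `Ref` from the instance to the security group is what makes the two resources dependent, so the stack creates and deletes them together.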
  • My live demo of CloudFormation
    • Using the example WordPress Blog template
    • It does a ton of cool stuff:
      • RDS backend for MySQL database, elastic webserver cluster with auto-scaling, security group setup, automatic scaling, automatic alarm notices
      • It all sits behind an elastic load balancer
  • My CloudFormation blog demo:
    • Actual stack file at
    • Check it out …
      • JSON formatted but still quite readable
    • It lets me define and then control a ton of different related AWS services all at once.
  • #!/bin/sh
    # Launch Stack

    cfn-create-stack AWSGenomics-demoStack \
      --template-file cf-wordpress.json.txt
  • #!/bin/sh
    # Check state & status

    cfn-describe-stacks AWSGenomics-demoStack
    echo ""
    cfn-describe-stack-events \
      AWSGenomics-demoStack --headers
  • 10 AWS Services/Resources orchestrated as one.
  • CloudWatch.
  • Auto-scaling triggers.
  • SNS Endpoints for Alarms.
  • Alarm triggers.
  • RDS Database & Security Group.
  • Elastic Load Balancer.
  • EC2 Security Group.
  • Cool, huh?
  • { in case the demo fails! }
  • Opscode Chef
  • Chef enables Infrastructure as Code
  • It’s freaking awesome.
  • Chef lets you:
    • Manage configuration as idempotent Resources.
    • Group resources as idempotent Recipes.
    • Group recipes into Roles.
    • Track it all like Source Code.
    • Search your infrastructure like a ninja. Ohai!
    • Configure your systems, software & pipelines
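“Idempotent” is the key word in that list: applying a resource once or ten times should leave the system in the same state. The sketch below illustrates the idea in plain shell. It is not Chef code (Chef recipes are written in Ruby), and `ensure_line` and the demo file path are hypothetical names made up for this illustration.

```shell
#!/bin/sh
# Sketch of an idempotent "resource": make sure a file contains a given line.
# Running it once or many times yields the same end state.
# ensure_line is a hypothetical helper, not part of Chef.
set -eu

ensure_line() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if grep -qxF -- "$line" "$file" 2>/dev/null; then
        echo "unchanged: $file"
    else
        printf '%s\n' "$line" >> "$file"
        echo "updated: $file"
    fi
}

DEMO_FILE=/tmp/idempotent-demo.conf
rm -f "$DEMO_FILE"
ensure_line "max_jobs=4" "$DEMO_FILE"   # first run appends the line
ensure_line "max_jobs=4" "$DEMO_FILE"   # second run changes nothing
```

A naive `echo "max_jobs=4" >> file` would add a duplicate on every run. Checking the current state before acting is what makes repeated runs safe.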
  • Several flavors:
    • Open source
    • Commercial / Managed
    • Commercial / ‘Behind your Firewall’
    No time today for even a short description of how it works. You should check it out.
  • Chef demo via ‘knife’ command line …
  • # Launch an EC2 node and bootstrap it with Chef in one step
    knife ec2 server create \
      -N aws-genomicsDemo \
      -I ami-63be790a \
      -f t1.micro \
      -G default \
      -S bioteam-IAM-admins-v1 \
      -r 'recipe[getting-started]' \
      -i ./bioteam-IAM-admins-v1.pem \
      -x ubuntu
  • Fully automatic remote bootstrapping …
  • Done!
  • Search-driven, parallel remote SSH execution
  • # Quote the remote command so the local shell does not split it at ';'
    knife ssh name:aws-genomicsDemo \
      -a cloud.public_hostname \
      -x ubuntu \
      -i bioteam-IAM-admins-v1.pem \
      "sudo chef-client; cat /tmp/chef-getting-started.txt"
  • Let’s install some genomics tools
    • Our Maq short read assembler cookbook:
      • Installs all dependencies (compilers, etc.)
      • Puts application source on node
      • Builds maq from source
      • Installs it
  • $ knife node run_list add aws-genomicsDemo 'recipe[maq]'
  • It really is that easy.
  • MIT StarCluster
  • MIT StarCluster
    • Ready to use Linux compute farm on AWS
    • Grid Engine, MPI, NFS filesystems
    • Libraries, tools, applications
    • Easy to use, easy to extend
    • Integrates well with Chef
  • If you have not built Linux clusters from scratch before …
  • It’s hard to really appreciate everything that StarCluster does behind the scenes.
  • MIT StarCluster – More Info
    • Live demo (time permitting)
    • StarCluster & Spot Instances Screencast
      • science/
  • Phew. That’s a lot of slides.
  • Time to explore the demos?
  • Questions?
  • Thanks! Related talk slides: “Mapping Informatics to the Cloud”