Gov 2.0: Scaling, Automation, & Management in the Cloud
 

Usage Rights: CC Attribution-ShareAlike License

    Presentation Transcript

    • Scaling in the Cloud. Speaker: Jesse Robbins, CEO ‣ jesse@opscode.com ‣ @jesserobbins ‣ www.opscode.com. Copyright © 2010 Opscode, Inc - All Rights Reserved
    • Opscode makes a new kind of Infrastructure Automation, offered as a hosted Service.
    • Developers? • Systems Administrators? • Executives/Leaders? http://www.flickr.com/photos/timyates/2854357446/sizes/l/
    • For Developers... • Do it yourself. • The infrastructure is the application (and vice versa). • You are not a Systems Administrator. • You need tools.
    • Sysadmins... • Say “Yes”. • You never liked rack and stack that much anyway. • You have never been more critical. • Lean into it. http://covers.oreilly.com/images/9780596007836/lrg.jpg “Lean into it” appears courtesy of Cliff Moon, of Dynomite fame: http://twitter.com/moonpolysoft
    • Executives... • Not a magic unicorn • Benefits come from efficiency, not raw Capex • Has real cultural implications at every level • You are the biggest asset to success
    • [Two charts by week. Panels: “Traditional” Operations; Operations - The “Secret Sauce”. Upper axis: # of Hours spent on Hardware, OS Install, Config, Upkeep. Lower axis: Servers, New and Existing.] This is the secret of Cloud Computing. Every other virtue stems from here. (http://radar.oreilly.com/archives/2007/10/operations-advantage.html)
    • You are 10% Unique. And it's probably the things you did wrong.
    • Infrastructure is Hard
      • 1999: Inventory, packaged file transfers and desktops
      • 2005: Unattended bare metal servers “very very” hard. 7k nodes took 5 days w/ 90% success
      • 2007: Unattended bare metal in under 10 minutes. Fully configured in under 3 mins
      • 2008: Unattended server in 2 minutes. 5000 servers in a week
      • 2010: 10k nodes in under 5 minutes
    • Infrastructure is changing ‣ Easier to get (good!) ...but harder to manage (bad!) ‣ Demand is dynamic ‣ Developers are crucial to Operations ‣ Web / Cloud services are proliferating ...and Enterprise is following along. ‣ Manual configuration no longer a crutch ‣ Few tools to solve a ubiquitous problem
    • Managing Infrastructure Is (Has Always Been) Hard. Proprietary Solutions timeline: 1980, 1989, 1999, 2001. Previous Attempts Typically... • Solve very little of the problem... • Reach just a handful of large, enterprise customers • Require custom implementations with large professional services bills • Deployed exclusively on-premise • Acquired by companies with large consulting organizations (IBM, HP, CA)
    • Google, Amazon, Microsoft built their own tools
    • but it’s “secret sauce”
    • Everyone else is here ... inexperienced & poorly equipped for the world they must now operate in.
    • “Cloud”
    • Alistair’s mom’s definition Cloud = Web = Internet = Useless
    • Cloud stack, Private vs. Public (top layer first):
      • SaaS
      • PaaS (Private) / PaaS (Public)
      • IaaS (Private) / IaaS (Public)
      • Virtualization (Private) / Managed hosting (Public)
      If you want to talk clouds, pick one first.
      Slide courtesy Alistair Croll - alistair@rednod.com
    • Infrastructure as a Service (IaaS): Amazon EC2, Rackspace Cloud, Terremark, GoGrid, Joyent (and nearly every private cloud built on XenServer or VMware). Slide courtesy Alistair Croll - alistair@rednod.com
    • Dedicated hardware • On-premise private clouds • Virtual private clouds • Third-party public clouds. Slide courtesy Alistair Croll - alistair@rednod.com
    • Always on premise: Private • Compliance-enforced • Need to track and audit • Legislative • Data near local computation
      Can be done anywhere: Testing • Training • Prototyping • Batch processing • Seasonal load
      Always in cloud: Partner access • Proximity to cloud services (storage, CDN, etc.) • Massively grid/parallel (genomic, modelling)
      Diagram labels: Load/pricing engine • Policy engine • Virtual machine (infrastructure cloud) • Compute task (service cloud)
      Slide courtesy Alistair Croll - alistair@rednod.com
    • Automation
    • Bootstrapping
    • Bootstrapping Approaches (Good / Bad / Time)
      • Red Tape Corp. Good: Known Costs, No Variation. Anything you want, as long as IT pre-approved it. Bad: High Waste (Hoarding). Approvals. Expensive ($/Time). Long lead time. Time: 6-8 weeks.
      • Agile Corp. Good: Known Costs. Total Hardware Control. Trivial Approvals. Shorter lead time. Bad: Lower Waste. Less Red Tape. Still slow. Approvals. Expensive ($/Time). Time: 2-4 weeks.
      • Cloud. Good: Variable Costs. Highly Adaptable. Minimal lead time. Trivial approvals. No humans needed. Bad: Variable Costs. No control over hardware. Must re-train. Time: 5-10 minutes.
    • Configuration curl -O http://brainspl.at/velocity.sh && sh velocity.sh
    • Configuration Approaches (Good / Bad)
      • Manual. Good: You can do anything. Results in an intimate knowledge of the details. Bad: Slow. Error Prone (Bus Error!). Non-repeatable. Difficult knowledge transfer.
      • Ad-Hoc. Good: More repeatable. Built your way, with your model. Bad: Rarely idempotent. Hard to collaborate. Knowledge is dispersed. Brittle. No API.
      • Infrastructure as Code. Good: Repeatable. Idempotent. Agile. Sharable. Self documenting. Bad: Have to learn how to use it. Hard things remain hard. Not magic. (Yet!)
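The "Idempotent" entry under Infrastructure as Code is the property that running the same configuration step twice leaves the system in the same state as running it once. A minimal Python sketch of that idea follows. It is an illustration only, not how Chef or any other tool implements it; the function name `ensure_line` and the example setting are hypothetical.

```python
import os
import tempfile


def ensure_line(path: str, line: str) -> bool:
    """Make sure `line` is present in the file at `path`.

    Returns True if the file was changed, False if it was already correct.
    """
    content = ""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            content = handle.read()
    if line in content.splitlines():
        return False  # desired state already holds, so do nothing
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if content == "" or content.endswith("\n") else "\n"
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(prefix + line + "\n")
    return True


if __name__ == "__main__":
    # Hypothetical setting, written to a scratch file rather than a real config.
    target = os.path.join(tempfile.mkdtemp(), "app.conf")
    print(ensure_line(target, "max_connections = 100"))  # first run changes the file
    print(ensure_line(target, "max_connections = 100"))  # second run is a no-op
```

The step describes a desired state ("this line is present") rather than an action ("append this line"), which is what makes repeated runs safe.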
    • Command and Control
    • Command and Control (Good / Bad)
      • Meatcloud*. Good: Super flexible. Can do almost anything. Always easy to find someone to blame. Free will. Bad: Error Prone. Slow. Expensive to Scale. Not repeatable. Free will.
      • Ad-Hoc. Good: More repeatable. Easier to scale. Less error prone (hopefully!) Bad: One-off by necessity. Tooling sprawl. Hard to share solutions. Much higher learning curve.
      • Framework. Good: One system to learn. Scales well. Paint by numbers. Repeatable. Two-Way. Bad: Not everything maps cleanly. Trades depth of knowledge for ease of use.
      *Meatcloud appears in this presentation courtesy of Andrew Shafer - http://is.gd/Ega
    • Lightning Strikes! DOOM: servers fail (marked X) among the Webservers and Database Servers.
      Diagram components: Monitoring System, Command & Control, Bootstrapping, Configuration, Webservers, Database Servers. Arrow labels: Signals Moar!, Updates, Provisions.
      • Monitoring signals the Nanite /node/down service.
      • Nanite removes DOOM nodes in Chef.
      • Nanite boots new EC2 instances, with Chef Role + Attribute.
      • Provisions instances, EBS, Elastic IPs.
      • Chef configures nodes according to assigned ...
      • Chef updates the monitoring system.
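The recovery sequence in the "Lightning Strikes!" slides (monitoring signals a node down, the dead node is removed, a replacement is booted with the same role, and monitoring is updated) can be sketched as a small in-memory simulation. Everything here is a stand-in: `Inventory`, `handle_node_down`, and the node names are hypothetical, and the code does not call Nanite, Chef, or EC2.

```python
from dataclasses import dataclass, field


@dataclass
class Inventory:
    """Stand-in for the node list a configuration server would hold."""
    nodes: dict = field(default_factory=dict)     # node name -> role
    monitored: set = field(default_factory=set)   # node names under monitoring


def handle_node_down(inv: Inventory, dead: str, replacement: str) -> str:
    """Replace a failed node with a fresh one that carries the same role."""
    role = inv.nodes.pop(dead)        # remove the dead node from the inventory
    inv.monitored.discard(dead)       # stop monitoring the dead node
    inv.nodes[replacement] = role     # "boot" the replacement with the same role
    inv.monitored.add(replacement)    # update the monitoring system
    return role


if __name__ == "__main__":
    inv = Inventory(
        nodes={"web-1": "webserver", "db-1": "database"},
        monitored={"web-1", "db-1"},
    )
    role = handle_node_down(inv, dead="web-1", replacement="web-2")
    print(role, sorted(inv.nodes), sorted(inv.monitored))
```

The point of the sketch is that the role, not the individual machine, is the unit that survives a failure.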
    • A word about Scaling...
    • Typical Peak Load
      1. Bring on capacity as traffic ramps up
      2. Take down capacity as it ramps down
      3. 10-15 minutes on either side, fully unattended
      Graphs in this portion of the presentation taken from Theo Schlossnagle http://omniti.com/seeds/dissecting-todays-internet-traffic-spikes
    • Atypical Load: No way around Capacity Planning. However, you are still better off!
      1. Hope you know it is coming.
      2. Increase capacity in advance.
      3. Take down capacity as it ramps down.
      Graphs in this portion of the presentation taken from Theo Schlossnagle http://omniti.com/seeds/dissecting-todays-internet-traffic-spikes
    • Capacity Planning is king. http://www.flickr.com/photos/allspaw/2095439645/sizes/l/
    • Have a queue? Does it scale linearly with more resources? Congratulations - you can auto-scale!
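The queue test above can be sketched as a sizing function: if work scales linearly with workers, the number of workers needed is the backlog divided by what one worker clears in the target time. This is a minimal sketch under that linear-scaling assumption. The function name, parameters, and the default limits are hypothetical and do not come from the presentation.

```python
import math


def desired_workers(queue_depth: int,
                    jobs_per_worker_per_min: float,
                    target_drain_minutes: float,
                    min_workers: int = 1,
                    max_workers: int = 50) -> int:
    """Workers needed to drain the queue within the target time.

    Assumes throughput grows linearly with the number of workers.
    """
    if jobs_per_worker_per_min <= 0 or target_drain_minutes <= 0:
        raise ValueError("rate and target time must be positive")
    per_worker_capacity = jobs_per_worker_per_min * target_drain_minutes
    needed = math.ceil(queue_depth / per_worker_capacity)
    # Clamp to the limits so the pool never scales to zero or without bound.
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    print(desired_workers(1200, 20, 10))    # 1200 jobs / (20 * 10) per worker -> 6
    print(desired_workers(0, 20, 10))       # empty queue -> floor of 1
    print(desired_workers(100000, 20, 10))  # large backlog -> capped at 50
```

If the workload does not scale linearly (for example, every worker contends for one database), this calculation overestimates the benefit of adding workers.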
    • NoSQL http://www.flickr.com/photos/wingler/3429634150/sizes/l/
    • CAP Theorem (Pick Two) • Consistency • Availability • Partition Tolerance
    • Most SQL Databases • Choose Consistency over all • Availability comes distant second
    • Web Applications need... • Availability • Partition Tolerance
    • “Global temporal consistency is a fiction” Christopher Brown
    • Choosing Consistency for your Web App... Means failure is global
    • When you choose Partition Tolerance and Availability... You fail or succeed for a subset of users
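One way to picture "fail or succeed for a subset of users" is data partitioned by user across independent shards: when one shard is unreachable, only the users mapped to it see a failure. The sketch below is a toy model of that effect, not a description of any of the databases named in this talk. The shard names and the `shard_for` and `serve` functions are hypothetical.

```python
import zlib

# Hypothetical shard names; each shard is assumed to fail independently.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for(user_id: str) -> str:
    """Map a user to a shard with a stable hash."""
    return SHARDS[zlib.crc32(user_id.encode("utf-8")) % len(SHARDS)]


def serve(user_id: str, down_shards: set) -> bool:
    """Return True if the request can be served, False if the user's shard is down."""
    return shard_for(user_id) not in down_shards


if __name__ == "__main__":
    users = [f"user{i}" for i in range(50)]
    down = {shard_for(users[0])}  # take down whichever shard holds the first user
    served = sum(serve(u, down) for u in users)
    print(f"{served} of {len(users)} users served while {sorted(down)} is down")
```

Under a design that insists on global consistency, the same outage would instead block every request until the partition heals.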
    • Apologies • Apologize after the fact for failures • Better than nothing at all
    • NoSQL • Many different tools • They tweak CAP differently • CouchDB • Cassandra • Redis • MongoDB
    • Scaling in the Cloud. Speaker: Jesse Robbins, CEO ‣ jesse@opscode.com ‣ @jesserobbins ‣ www.opscode.com