Managing Virtual Sprawl
http://twitter.com/jhitchco

Managing Virtual Sprawl: Presentation Transcript

  • Managing Virtual Sprawl (How to not let this happen to you). Jeremy Hitchcock, jeremy@dyn.com
  • Why care?
      • What you have
      • What you want
      • What you got
  • Managing clouds the way you manage single systems increases "system" management work by 10x-20x
  • Clouds Promise
      • Greater efficiency
      • Faster deploys/less management
      • Little/no capital costs and no step functions
  • Sprawl Eats Potential
      • Greater efficiency
      • Faster deploys/less management
      • Little/no capital costs and no step functions
  • Just not good, yet [figure labels: 15 years, 3 years]
  • Don’t just change broken light bulbs
  • Wait until it gets dark, then change them all
  • Let’s get started
      1. Architectures
      2. Pain points
      3. Best practices
      4. What do we get?
  • 1: Architectures
      • Architecture changes
      • Decoupling
      • Geography/load balancing
      • Disaster recovery
  • 2004
  • 2007-2008
  • Opera dynamic resource pricing model
  • Decoupling
      • Apps and infrastructure mirror each other
      • Years of coupled development
      • Hard to retrofit, easier to do from the start
  • Decoupling
      • Old: Web, App, DB
      • New: Processing, Dispatcher, Storage
  • Decoupling is Hard
      • Logging/debugging
      • Common scratch
      • Images and provisioning
      • Configuration data (run/boot)
      • Job dispatch (async/sync)
  • Images and provisioning
      • Add __ new front ends
      • Publish new code
      • Even better if that is automatic
  • Configuration Data
      • Most config data is on each image
      • Instead, auto-populate it into source control
      • Config, image, controller re-architected
  • Job Dispatch (sync)
      1. Request for photo
      2. Read photo off disc
      3. Resize/reformat
      4. Log
      5. Return photo to user
  • Job Dispatch (async)
      1. Request for photo
      2. Read photo off disc
      3. Resize/reformat
      4. Log
      5. Return photo to user
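The two dispatch slides can be sketched in a few lines of Python. This is only an illustration under assumptions: `read_photo` and `resize` are hypothetical stand-ins for the disk read and the resize/reformat step, and a single worker thread with an in-process queue stands in for whatever dispatcher the talk had in mind.

```python
import queue
import threading


def read_photo(name: str) -> str:
    """Hypothetical stand-in for reading a photo off disk."""
    return f"raw:{name}"


def resize(photo: str) -> str:
    """Hypothetical stand-in for the resize/reformat step."""
    return photo.replace("raw:", "resized:", 1)


def handle_sync(name: str, log: list) -> str:
    """Synchronous dispatch: every step runs inside the request."""
    photo = resize(read_photo(name))
    log.append(f"served {name}")
    return photo


def worker(jobs: queue.Queue, results: dict, log: list) -> None:
    """Asynchronous worker: pulls jobs until it sees the None sentinel."""
    while True:
        name = jobs.get()
        try:
            if name is None:
                return
            results[name] = resize(read_photo(name))
            log.append(f"processed {name}")
        finally:
            jobs.task_done()


def handle_async(names: list, log: list) -> list:
    """The dispatcher only enqueues work; a worker does the heavy steps."""
    jobs = queue.Queue()
    results = {}
    thread = threading.Thread(target=worker, args=(jobs, results, log))
    thread.start()
    for name in names:
        jobs.put(name)
    jobs.put(None)   # sentinel: no more work
    jobs.join()      # wait until every job has been marked done
    thread.join()
    return [results[name] for name in names]
```

In the synchronous version the request handler is tied up for the whole pipeline; in the asynchronous version the handler and the processing are decoupled, so workers could in principle be added or moved without touching the front end.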
  • Geography/load balancing
      • Data centers do not house eyeballs
      • Intra/inter-site load balancing
      • Names to numbers (users think in names)
      • Between clouds/interoperability?
  • Disaster Recovery
      • Practice recoveries
      • Failovers should be automatic
      • DNS (quick DNS nit: use short TTLs)
      • Contingency plans
  • Case Study: Authorize.net
  • Case Study: Authorize.net (DNS answer for secure.authorize.net)
      ;; QUESTION SECTION:
      ;secure.authorize.net. IN A
      ;; ANSWER SECTION:
      secure.authorize.net. 86400 IN A 64.94.118.32
      secure.authorize.net. 86400 IN A 64.94.118.33
    GAH! (The TTL is 86400 seconds, i.e. 24 hours.)
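To make the TTL complaint concrete, here is a small sketch that parses an answer line in the format shown above and flags TTLs that are too long for a quick failover. The 300-second threshold and the function names are assumptions chosen for illustration, not a recommendation from the talk.

```python
# Assumed policy for this sketch: anything above 5 minutes is "too long".
MAX_TTL_SECONDS = 300


def parse_a_record(line: str) -> dict:
    """Split a 'name ttl class type address' answer line into fields."""
    name, ttl, rr_class, rr_type, address = line.split()
    return {
        "name": name,
        "ttl": int(ttl),
        "class": rr_class,
        "type": rr_type,
        "address": address,
    }


def worst_case_stale_hours(ttl_seconds: int) -> float:
    """Upper bound on how long a resolver may keep serving the old address."""
    return ttl_seconds / 3600


def ttl_too_long(record: dict, limit: int = MAX_TTL_SECONDS) -> bool:
    """True when the record's TTL exceeds the assumed failover budget."""
    return record["ttl"] > limit


record = parse_a_record("secure.authorize.net. 86400 IN A 64.94.118.32")
print(record["ttl"], worst_case_stale_hours(record["ttl"]), ttl_too_long(record))
```

With a TTL of 86400 seconds, a resolver that cached the record just before an outage may keep handing out the dead address for up to 24 hours, no matter how quickly the zone itself is updated.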
  • 2: Pain Points
      • Inventory
      • Delivery speed
      • Supply/demand
      • Configuration
      • Points of failure
  • “I can ping it but I don’t know where it is!”
  • Inventory
      • Does it matter?
      • Not an asset tag but provisioning scripts
      • Audit bills (operational costs)
  • Delivery Speed
      • May actually suffer (more pieces, not iron)
      • Be analytical about what can be slow
      • Limiting factor of what’s virtualized
      • Were you looking before?
  • Delivery Speed [graph from Gomez]
      • Where is the testing from?
      • Is this load dependent?
      • Do users notice/care?
      • Does it matter?
      • Cost to make it faster?
      • Savings to make it slower?
  • Supply/demand
      • Capital investments versus operating costs
      • Big architecture changes to constant tuning
      • Sampling time
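One way to picture "constant tuning" and "sampling time" is a sizing rule that averages load over a sampling window before deciding how many instances to run. The sketch below is an assumption-laden illustration: the function name, the averaging rule, and the per-instance capacity figure are all hypothetical.

```python
import math


def desired_instances(load_samples: list, capacity_per_instance: float,
                      minimum: int = 1) -> int:
    """Size the pool from the average load seen over one sampling window."""
    if not load_samples:
        return minimum
    average = sum(load_samples) / len(load_samples)
    return max(minimum, math.ceil(average / capacity_per_instance))


# Requests per second sampled over one window; 50 req/s per instance is assumed.
print(desired_instances([120, 130, 140], capacity_per_instance=50))
```

A short window reacts quickly but chases noise; a long window is stable but lags real demand, which is why the sampling time itself becomes something to tune.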
  • Configuration
      • Configuration in source control
      • Has to move to a centralized location
      • Patches, updates, revision images
      • Lot of hard work here (no return)
  • Points of Failure
      • It’s about risk
      • All in the name, DNS
      • 99.9% is different from 99.99%
      • Any page is better than nothing
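The gap between 99.9% and 99.99% is easier to see as allowed downtime. A minimal worked calculation, assuming a 365-day year:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99):
    print(f"{target}% allows about {downtime_minutes_per_year(target):.1f} minutes/year")
```

That works out to roughly 525.6 minutes (about 8.8 hours) per year at 99.9% versus roughly 52.6 minutes per year at 99.99%, a tenfold difference in the outage budget.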
  • 3: Best Practices
      • App rewrite
      • Controller (code, monitoring)
      • Configuration (Chef, Puppet, etc.)
      • Dev/staging/production (Django/Rails)
      • Security
      • Monitoring and verification
  • Dev/Staging/Production
      • This stuff works, use it
      • Clouds make this possible
      • ONLY exception is load testing (big exception)
      • Nothing is going to work out of the box
  • Security
      • No “behind the firewall”
      • Not an afterthought, a core feature
      • Something to test
      • Two hash encryption (private data)
      • Centralized management makes security easier (at least double or nothing)
  • Monitoring and Verification
      • What your user sees
      • What you monitor
      • Are they the same?
      • Test transactions
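A test transaction checks what the user would actually see rather than whether a host answers a ping. The sketch below is hypothetical: `check_transaction` and the injected `fetch` callable are illustrative names, and the URL is a placeholder. In practice `fetch` would wrap a real HTTP client; here a fake is used so the example runs without a network.

```python
from typing import Callable, Tuple


def check_transaction(fetch: Callable[[str], Tuple[int, str]],
                      url: str, expected_text: str) -> bool:
    """Pass only if the page loads AND contains what a user should see."""
    try:
        status, body = fetch(url)
    except Exception:
        # Any failure to complete the transaction counts as a failed check.
        return False
    return status == 200 and expected_text in body


def fake_fetch(url: str) -> Tuple[int, str]:
    """Stand-in for an HTTP client; returns (status code, body)."""
    return 200, "<html>Order confirmed</html>"


print(check_transaction(fake_fetch, "https://example.com/checkout", "Order confirmed"))
```

A check like this fails when the server returns 200 with an error page, which is exactly the gap between "what you monitor" and "what your user sees".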
  • 4: What do we get?
      • More choice on availability
      • Fewer step functions (capacity, cost)
      • Reduced marginal cost of computation
  • Final Remarks
      • Sprawl eats away at the promised benefits
      • Never truly decoupled; apps dictate architecture
      • Management tools are still lacking; most are homegrown
      • Make it all automatic, not easy
  • Questions? Jeremy Hitchcock, jeremy@dyn.com
      • DynDNS.com offers a suite of DNS, email, domain registration and virtual servers for the home and small business user.
      • The Dynect Platform provides the enterprise with external managed DNS and traffic management services.