Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Farms, Fabrics and Clouds


Published on

Trends in server-side, clustered, datacentred and cloud computing, including a look at what classic assumptions are no longer valid in these worlds.

Published in: Technology
  • Be the first to comment

Farms, Fabrics and Clouds

  1. Steve Loughran Julio Guijarro HP Laboratories, Bristol, UK November 2008 Farms, Fabrics and Clouds [email_address] [email_address]
  2. <ul><ul><li>Researcher at HP Laboratories </li></ul></ul><ul><ul><li>Area of interest: Deployment </li></ul></ul><ul><ul><li>Author of Ant in Action </li></ul></ul>Steve Loughran
  3. Julio Guijarro <ul><ul><li>Researcher at HP Laboratories </li></ul></ul><ul><ul><li>Area of interest: Deployment </li></ul></ul><ul><ul><li>In charge of OSS release </li></ul></ul><ul><ul><li> </li></ul></ul>
  4. <ul><li>How to host big applications across distributed resources </li></ul><ul><ul><li>Automatically </li></ul></ul><ul><ul><li>Repeatably </li></ul></ul><ul><ul><li>Dynamically </li></ul></ul><ul><ul><li>Correctly </li></ul></ul><ul><ul><li>Securely </li></ul></ul><ul><li>How to manage them from installation to removal </li></ul><ul><li>How to make dynamically allocated servers useful </li></ul>Our research - see
  5. Who had breakfast this morning? Question
  6. Who harvested wheat or corn, or killed an animal for that breakfast? Question
  7. Farms provide food. It is somebody else's problem
  8. Who is wearing clothes they wove or knitted themselves? Question
  9. Provisioning of clothing -fabrics- is outsourced It is somebody else's problem
  10. Future applications are on the Web <ul><li>Web Browser, AJAX clients </li></ul><ul><li>Richer: Flash, XUL, Silverlight </li></ul><ul><li>&quot;… as a Service &quot; </li></ul><ul><li>Lots of code running in the server </li></ul><ul><li>Unpredictable demand </li></ul><ul><li>Data mining/analysis problems </li></ul>
  11. Old world installation: single server Single web server, Single DB RAID filestore -SPOF -limitations of scale
  12. yesterday: clustering Multiple web servers, Replicated DB RAID Network filestore Load-balancing router -Cost -Complexity -Limitations of scale Maintains the illusion of a single server
  13. Now: server farms 500 web servers, Distributed filestore Rented storage & CPU Scales up No capital outlay Agile infrastructure
  14. Tomorrow? grid fabric. 50000 servers
  15. Application architectures and deployment problems change radically in this world
  16. Application architectures September 2008
  17. Application architectures <ul><li>ROA/REST </li></ul><ul><li>Virtualized </li></ul><ul><li>MapReduce </li></ul><ul><li>Shards </li></ul><ul><li>Tuple-spaces </li></ul><ul><li>XMPP </li></ul>
  18. Virtualization
  19. Why? <ul><li>Save on hardware (and power, space)‏ </li></ul><ul><li>Dynamically move running servers </li></ul><ul><li>Demand creation of new images </li></ul><ul><li>Testing complex system configurations </li></ul><ul><li>Redistributing entire machine image </li></ul><ul><li>'virtual appliance' </li></ul>
  20. Assumptions that are now invalid <ul><li>Systems have a long lifespan </li></ul><ul><li>It is slow/expensive to create a new system </li></ul><ul><li>It is expensive to duplicate one </li></ul><ul><li>Systems can/should be managed by hand </li></ul><ul><li>Clocks proceed at the same rate </li></ul><ul><li>Physical RAM doesn’t get swapped out </li></ul><ul><li>Running machines can't be moved/cloned </li></ul><ul><li>Virtualization is only for testing. </li></ul>
  21. Server Farms
  22. Assumptions that are now invalid <ul><li>System failure is an unusual event </li></ul><ul><li>100% availability can be achieved </li></ul><ul><li>Data is always near the server </li></ul><ul><li>You need physical access to the servers </li></ul><ul><li>Databases are the best form of storage </li></ul><ul><li>You need millions of $/£/€ to play </li></ul>
  23. Who has the servers? <ul><li>Yahoo!, Google, MSN, Amazon, eBay: services </li></ul><ul><li>MMORPG Game Vendors: World of Warcraft, Second Life </li></ul><ul><li>EU Grid: Scientists </li></ul><ul><li>HP, IBM, Sun: rent to companies (some resold) -focus on CPU performance for enterprise </li></ul><ul><li>Amazon: rent to anyone with an Amazon account -focus on startups </li></ul>
  24. Amazon EC2 <ul><li>Pay as you go Virtual Machine Hosting </li></ul><ul><li>No persistent storage other than S3 filestore -uses HTTP GET/PUT/DELETE operations </li></ul><ul><li>$0.10 per CPU/hour </li></ul><ul><li>Resold OS images for more (RedHat, Windows)‏ </li></ul><ul><li>Rent static IP addresses for failover/balancing </li></ul><ul><li>New: RAID-like storage </li></ul>
  25. Host Amazon EC2 S3 Storage AMI (Xen VM)‏ AMI (Xen VM)‏ /mnt Host AMI (Xen VM)‏ AMI (Xen VM)‏ Public Internet /mnt /mnt /mnt Fast (free) network free access; slow initial read time pay per GET; per megabyte $ $ $ $ $
  26. Demo
  27. EC2 Limitations <ul><li>Can't talk to peers using public IP addresses </li></ul><ul><li>Persistent file system is a premium extra </li></ul><ul><li>Most addresses are dynamic </li></ul><ul><li>No managed redundancy/restart </li></ul><ul><li>No multicast IP </li></ul><ul><li>No movement of VMs off high-traffic racks </li></ul>
  28. Amazon S3 <ul><li>Multiple geo-located data storage </li></ul><ul><li>No limits on size </li></ul><ul><li>Cost of write is high (guarantee of written remotely)‏ </li></ul><ul><li>Read is cheap; may be out of date </li></ul><ul><li>Cost: Low </li></ul><ul><li>S3 is a global file system that any project can afford </li></ul>
  29. Amazon S3 Charges <ul><li>S3 sets the limit on costs for reliable data storage over the network </li></ul><ul><li>For Amazon, indexing and writes are the big costs…small files are the enemy </li></ul>Storage $0.15/GB/month Upload $0.10 per GB - all data transfer in Download $0.18 per GB - first 10 TB / month data transfer out $0.16 per GB - next 40 TB / month data transfer out $0.13 per GB - data transfer out / month over 50 TB Requests $0.01 per 1,000 PUT or LIST $0.01 per 10,000 GET or HEAD $0 DELETE
  30. MapReduce Commodity data processing for commodity data
  31. Assumptions that are now invalid <ul><li>Terabyte datasets are hard to work with </li></ul><ul><li>Code runs on a single machine </li></ul><ul><li>Sequential code is better than parallel code </li></ul><ul><li>RAID hardware is the best way to store data </li></ul><ul><li>Databases are better than filesystems </li></ul><ul><li>Low-value data isn't worth collecting even if you don't have a use for it now </li></ul>
  32. Shards
  33. Assumptions that are now invalid <ul><li>A single farm needs to scale to infinity </li></ul><ul><li>You need to provide 100% availability to 100% of users </li></ul><ul><li>You have to roll out simultaneous updates to the application, changes to the DB schema, globally </li></ul>
  34. XMPP post extends GoogleChatClientWorkflow { to &quot;;; login &quot;smartfrog.two&quot;; password xmpp.password; message &quot;hello, world&quot;; }
  35. Assumptions that are now invalid <ul><li>You can't send message to a laptop that moves around behind a firewall. </li></ul><ul><li>You need to build your own monitoring infrastructure. </li></ul><ul><li>Blocking RPC is a good metaphor for long-haul communications. </li></ul><ul><li>You can't send messages to your server farm from your phone </li></ul><ul><li>IT doesn't have their eyes on your protocol </li></ul>
  36. Problems for us farmers <ul><li>Power management </li></ul><ul><li>Predictive disk failure management </li></ul><ul><li>Load balancing for availability, power </li></ul><ul><li>File management </li></ul><ul><li>Billing </li></ul><ul><li>Routing </li></ul><ul><li>Security/Isolation </li></ul><ul><li>How will this change server hardware? </li></ul><ul><li>Managing/Configuring Machine Images </li></ul><ul><li>Diagnostics when things go wrong </li></ul>
  37. “ Agile” Routers <ul><li>Handle hundreds to thousands of (concurrent) change requests/second </li></ul><ul><li>Integrate with billing </li></ul><ul><li>Managed throttling to specific hosts </li></ul><ul><li>Propagation of state to peer rooters </li></ul><ul><li>'agile' DHCP -short leases; mobile </li></ul><ul><li>Monitored bandwidth may trigger VM migration </li></ul>
  38. “ Agile” Operating Systems <ul><li>Design for VM-only use </li></ul><ul><li>Limited functionality </li></ul><ul><li>Limited lifespan </li></ul><ul><li>Fully configurable before initial boot </li></ul><ul><li>Adapt to changes in surrounding environment </li></ul><ul><li>Viable licensing model </li></ul>