Farms, Fabrics and Clouds. Steve Loughran, Julio Guijarro. HP Laboratories, Bristol, UK. November 2008.
Steve Loughran: Researcher at HP Laboratories. Area of interest: Deployment. Author of Ant in Action.
Julio Guijarro: Researcher at HP Laboratories. Area of interest: Deployment. In charge of OSS release: http://smartfrog.org/
Our research (see smartfrog.org)
  • How to host big applications across distributed resources: automatically, repeatably, dynamically, correctly, securely
  • How to manage them from installation to removal
  • How to make dynamically allocated servers useful
Question: Who had breakfast this morning?
Question: Who harvested wheat or corn, or killed an animal for that breakfast?
Farms provide food. It is somebody else's problem.
Question: Who is wearing clothes they wove or knitted themselves?
Provisioning of clothing (fabrics) is outsourced. It is somebody else's problem.
Future applications are on the Web
  • Web browser, AJAX clients
  • Richer: Flash, XUL, Silverlight
  • "… as a Service"
  • Lots of code running in the server
  • Unpredictable demand
  • Data mining/analysis problems
Old world installation: single server
  • Single web server, single DB
  • RAID filestore
  • Drawbacks: SPOF, limitations of scale
Yesterday: clustering
  • Multiple web servers, replicated DB
  • RAID network filestore
  • Load-balancing router
  • Drawbacks: cost, complexity, limitations of scale
  • Maintains the illusion of a single server
Now: server farms
  • 500 web servers, distributed filestore
  • Rented storage & CPU
  • Scales up
  • No capital outlay
  • Agile infrastructure
Tomorrow? Grid fabric: 50,000 servers
Application architectures and deployment problems change radically in this world.
Application architectures September 2008
Application architectures: ROA/REST, Virtualized, MapReduce, Shards, Tuple-spaces, XMPP
Virtualization
Why?
  • Save on hardware (and power, space)
  • Dynamically move running servers
  • Demand creation of new images
  • Testing complex system configurations
  • Redistributing entire machine image: 'virtual appliance'
Assumptions that are now invalid
  • Systems have a long lifespan
  • It is slow/expensive to create a new system
  • It is expensive to duplicate one
  • Systems can/should be managed by hand
  • Clocks proceed at the same rate
  • Physical RAM doesn’t get swapped out
  • Running machines can't be moved/cloned
  • Virtualization is only for testing
Server Farms http://www.linuxjournal.com/
Assumptions that are now invalid
  • System failure is an unusual event
  • 100% availability can be achieved
  • Data is always near the server
  • You need physical access to the servers
  • Databases are the best form of storage
  • You need millions of $/£/€ to play
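Why "system failure is an unusual event" stops holding at farm scale can be seen with a little arithmetic. The sketch below is illustrative only: it assumes servers fail independently, and the per-server daily failure probability is a made-up figure, not a number from the talk.

```python
def prob_any_failure(servers: int, p_fail: float) -> float:
    """Probability that at least one of `servers` machines fails,
    assuming each fails independently with probability `p_fail`."""
    return 1.0 - (1.0 - p_fail) ** servers


def expected_failures(servers: int, p_fail: float) -> float:
    """Expected number of failed machines under the same assumption."""
    return servers * p_fail


if __name__ == "__main__":
    P_FAIL_PER_DAY = 0.001  # hypothetical: 1 in 1,000 servers fails on a given day
    for n in (1, 500, 50_000):  # single server, server farm, grid fabric
        print(n, round(prob_any_failure(n, P_FAIL_PER_DAY), 4),
              round(expected_failures(n, P_FAIL_PER_DAY), 2))
```

Under these assumed numbers a single server rarely fails on any given day, while a 50,000-server fabric should expect dozens of failures daily, so failure handling has to be routine rather than exceptional.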
Who has the servers?
  • Yahoo!, Google, MSN, Amazon, eBay: services
  • MMORPG game vendors: World of Warcraft, Second Life
  • EU Grid: scientists
  • HP, IBM, Sun: rent to companies (some resold); focus on CPU performance for enterprise
  • Amazon: rent to anyone with an Amazon account; focus on startups
Amazon EC2
  • Pay-as-you-go virtual machine hosting
  • No persistent storage other than S3 filestore, which uses HTTP GET/PUT/DELETE operations
  • $0.10 per CPU/hour
  • Resold OS images for more (RedHat, Windows)
  • Rent static IP addresses for failover/balancing
  • New: RAID-like storage
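To make the pay-as-you-go model concrete, here is a minimal cost sketch using the $0.10 per CPU/hour figure from the slide. The function name and the burst scenario are hypothetical, and it assumes one CPU per rented instance and ignores storage and bandwidth charges.

```python
EC2_RATE_PER_CPU_HOUR = 0.10  # dollars, the 2008 figure quoted on the slide


def ec2_rental_cost(instances: int, hours: float,
                    rate: float = EC2_RATE_PER_CPU_HOUR) -> float:
    """Dollar cost of renting `instances` single-CPU VMs for `hours` each.

    Assumes one CPU per instance; storage and data transfer are not included.
    """
    return instances * hours * rate


if __name__ == "__main__":
    # Hypothetical burst: a 500-server farm rented for a single day.
    print(f"${ec2_rental_cost(500, 24):,.2f}")
```

Under these assumptions a 500-server farm for one day costs about $1,200, and nothing once the instances are released, which is the point of "no capital outlay".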
[Diagram: two hosts inside Amazon EC2, each running AMIs (Xen VMs) with local /mnt storage, connected to S3 Storage and to the Public Internet. Labels: "Fast (free) network"; "free access; slow initial read time"; "pay per GET; per megabyte".]
Demo
EC2 Limitations
  • Can't talk to peers using public IP addresses
  • Persistent file system is a premium extra
  • Most addresses are dynamic
  • No managed redundancy/restart
  • No multicast IP
  • No movement of VMs off high-traffic racks
Amazon S3
  • Multiple geo-located data storage
  • No limits on size
  • Cost of write is high (guarantee that data is written remotely)
  • Read is cheap; may be out of date
  • Cost: low
  • S3 is a global file system that any project can afford
Amazon S3 Charges
  • S3 sets the limit on costs for reliable data storage over the network
  • For Amazon, indexing and writes are the big costs… small files are the enemy
  • Storage: $0.15/GB/month
  • Upload: $0.10 per GB (all data transfer in)
  • Download: $0.18 per GB (first 10 TB/month data transfer out); $0.16 per GB (next 40 TB/month); $0.13 per GB (over 50 TB/month)
  • Requests: $0.01 per 1,000 PUT or LIST; $0.01 per 10,000 GET or HEAD; $0 DELETE
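The price list above can be turned into a small estimator. This is a sketch, not an official calculator: the prices are the 2008 figures from the slide, the function names are invented for illustration, and it assumes 1 TB = 1,000 GB for the download tiers.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes

# (tier size in GB, price per GB) for data transfer out, from the slide
DOWNLOAD_TIERS = [
    (10 * GB_PER_TB, 0.18),   # first 10 TB / month
    (40 * GB_PER_TB, 0.16),   # next 40 TB / month
    (float("inf"), 0.13),     # everything over 50 TB / month
]


def download_cost(gb: float) -> float:
    """Tiered cost in dollars of transferring `gb` gigabytes out in one month."""
    cost, remaining = 0.0, gb
    for tier_size, rate in DOWNLOAD_TIERS:
        used = min(remaining, tier_size)
        cost += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return cost


def s3_monthly_cost(stored_gb: float, uploaded_gb: float, downloaded_gb: float,
                    puts_or_lists: int, gets_or_heads: int) -> float:
    """Estimated monthly S3 bill in dollars; DELETE requests are free."""
    return (stored_gb * 0.15
            + uploaded_gb * 0.10
            + download_cost(downloaded_gb)
            + puts_or_lists / 1000 * 0.01
            + gets_or_heads / 10000 * 0.01)


if __name__ == "__main__":
    # Uploading 1 GB as one object versus as a million small objects.
    print(s3_monthly_cost(0, 1, 0, 1, 0))
    print(s3_monthly_cost(0, 1, 0, 1_000_000, 0))
```

With these prices, uploading 1 GB costs about $0.10 in transfer either way, but splitting it into a million small objects adds roughly $10 in PUT charges, which is how the pricing reflects "small files are the enemy".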
MapReduce: commodity data processing for commodity data
Assumptions that are now invalid
  • Terabyte datasets are hard to work with
  • Code runs on a single machine
  • Sequential code is better than parallel code
  • RAID hardware is the best way to store data
  • Databases are better than filesystems
  • Low-value data isn't worth collecting even if you don't have a use for it now
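For readers unfamiliar with the MapReduce model, the following is a toy, single-process word count that shows the three phases (map, shuffle, reduce). It is only a sketch of the programming model: a real framework would run the map and reduce calls in parallel across many machines, and all function names here are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce sequentially over all documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["farms fabrics clouds", "clouds and farms"]))
```

Because each map call sees only one document and each reduce call sees only one key, the work can be spread over many machines without shared state.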
Shards
Assumptions that are now invalid
  • A single farm needs to scale to infinity
  • You need to provide 100% availability to 100% of users
  • You have to roll out simultaneous updates to the application, changes to the DB schema, globally
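One way to picture sharding is sketched below: users are mapped to a shard by hashing, and schema changes are rolled out one shard at a time instead of globally. The hashing scheme, the `Shard` class and the upgrade helper are all hypothetical illustrations, not something described in the talk.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Shard:
    """A hypothetical self-contained slice of the application and its data."""
    name: str
    schema_version: int = 1


def shard_for(user_id: str, shard_count: int) -> int:
    """Deterministically map a user to a shard index via a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def rolling_upgrade(shards: List[Shard], new_version: int) -> Iterator[Shard]:
    """Upgrade shards one at a time, yielding after each so the caller
    can check health (or stop) before touching the next shard."""
    for shard in shards:
        shard.schema_version = new_version
        yield shard


if __name__ == "__main__":
    shards = [Shard(f"shard-{i}") for i in range(4)]
    print(shards[shard_for("alice", len(shards))].name)
    for upgraded in rolling_upgrade(shards, new_version=2):
        print("upgraded", upgraded.name)
```

A failure or a bad upgrade then affects only the users on one shard, which is why neither global simultaneous rollouts nor 100% availability for 100% of users are required.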
XMPP
post extends GoogleChatClientWorkflow {
    to "smartfrog.two@gmail.com";
    login "smartfrog.two";
    password xmpp.password;
    message "hello, world";
}
Assumptions that are now invalid
  • You can't send a message to a laptop that moves around behind a firewall
  • You need to build your own monitoring infrastructure
  • Blocking RPC is a good metaphor for long-haul communications
  • You can't send messages to your server farm from your phone
  • IT doesn't have their eyes on your protocol
Problems for us farmers
  • Power management
  • Predictive disk failure management
  • Load balancing for availability, power
  • File management
  • Billing
  • Routing
  • Security/Isolation
  • How will this change server hardware?
  • Managing/Configuring Machine Images
  • Diagnostics when things go wrong
“Agile” Routers
  • Handle hundreds to thousands of (concurrent) change requests/second
  • Integrate with billing
  • Managed throttling to specific hosts
  • Propagation of state to peer routers
  • 'Agile' DHCP: short leases; mobile
  • Monitored bandwidth may trigger VM migration
“Agile” Operating Systems
  • Design for VM-only use
  • Limited functionality
  • Limited lifespan
  • Fully configurable before initial boot
  • Adapt to changes in surrounding environment
  • Viable licensing model

Editor's Notes

  • #2 1/14/2004 This is a presentation initially given at Bristol University; it's about trends in server-side computing.