My goal in this session is to give you a broad perspective on this space. For full disclosure, I do work for Microsoft. However, my goal today is not to convince you that Windows Azure is exactly what you need in every situation. Rather, it is to give you some perspective on the things that made the cloud possible today. I will talk about how the cloud differs from traditional hosting, which all of you are familiar with, and about a few scenarios that the cloud enables or makes very easy. Then I’ll walk through the differences between IaaS and PaaS and talk about a number of cloud vendors and their offerings, good and bad.
So what changes when you move from traditional hosting to cloud computing?
Bandwidth
• Cheaper
• More of it
• You can do far more on a server and deliver that to clients rather than relying on the client’s processing power alone (or at all)
In 1998 the average price paid by content owners to deliver video on the web was around $0.15 per MB delivered. That’s per bit delivered, not sustained. In 1998, if you wanted to stream a movie at 3 Mbps it would cost you $270. Today Netflix pays about 5 cents to stream the same movie.
Source: http://blog.streamingmedia.com/the_business_of_online_vi/2010/01/bandwidth-pricing-trends-cost-to-stream-a-movie-today-five-cents-cost-in-1998-270.html
Numbers: http://drpeering.net/white-papers/Internet-Transit-Pricing-Historical-And-Projected.php
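As a rough sanity check on those streaming figures, here is a small Python sketch. It only uses the numbers quoted above; the implied movie size and running time are my own back-of-envelope derivation, not figures from the source.

```python
# Figures quoted in the notes above.
PRICE_PER_MB_1998 = 0.15   # dollars per MB delivered in 1998
MOVIE_COST_1998 = 270.0    # dollars to stream one movie in 1998
MOVIE_COST_TODAY = 0.05    # dollars to stream the same movie today
BITRATE_MBPS = 3.0         # stream bitrate in megabits per second


def implied_movie_size_mb(cost, price_per_mb):
    """Size in megabytes that the quoted cost pays for at the quoted price."""
    return cost / price_per_mb


def implied_runtime_minutes(size_mb, bitrate_mbps):
    """Playing time of a stream of that size at the given bitrate."""
    megabits = size_mb * 8          # 8 bits per byte
    return megabits / bitrate_mbps / 60


size_mb = implied_movie_size_mb(MOVIE_COST_1998, PRICE_PER_MB_1998)
runtime_min = implied_runtime_minutes(size_mb, BITRATE_MBPS)
price_drop = MOVIE_COST_1998 / MOVIE_COST_TODAY

print(f"Implied movie size: {size_mb:.0f} MB")
print(f"Implied running time at 3 Mbps: {runtime_min:.0f} minutes")
print(f"Price drop: about {price_drop:.0f}x")
```

The numbers hang together: $270 at $0.15 per MB is about 1,800 MB, which is roughly an 80-minute movie at 3 Mbps, and the cost per movie has fallen by a factor of about 5,400.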
Back then you couldn’t even stream a 3 Mbps movie; the best you could get would have been 37 kbps. Fast forward to today, where companies like Netflix are encoding their content at a bitrate that is 90x what it was in 1998.
Cheaper & Better Hardware
• Processors
• Storage
• RAM
Moore’s Law is still running strong.
Source: http://ns1758.ca/winch/winchest.html
In 2000 the 250 GB hard drive on this Macbook Air would have cost $5000. That’s more than double what the entire machine costs today. Not only that, but this Macbook Air has a solid state drive. The figures here only apply to regular spinning storage media, not SSDs.
Today memory is cheaper and faster too. Back in 2000 the 4 GB of RAM on this Macbook Air would have cost over $4000.
Virtualization technology has improved greatly
• Quick, cheap and easy to spin up VMs
• Advances by VMWare, Xen, Microsoft
Adoption of REST greatly simplified cloud management and access to resources such as storage
• Cloud management & provisioning via browser
• Or provisioning within your application itself
REST is the final piece: it makes provisioning and managing all this cheap hardware completely self-service.
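To make "provisioning within your application itself" concrete, here is a minimal Python sketch of an app building a REST call that asks for a new server. The endpoint, the JSON fields and the token are all hypothetical and made up for illustration; every real vendor defines its own API. The request is only constructed, never sent.

```python
import json
import urllib.request

# Hypothetical endpoint, invented for this sketch. Not a real vendor API.
API_URL = "https://cloud.example.com/v1/servers"


def build_provision_request(name, size, image, token):
    """Build (but do not send) an HTTP POST asking for a new virtual machine."""
    # The field names below are assumptions for illustration only.
    body = json.dumps({"name": name, "size": size, "image": image}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
    )


request = build_provision_request("encoder-01", "large", "linux-base", "TOKEN")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Passing the request to urllib.request.urlopen would send it; that step is
# left out here because the endpoint is fictional.
```

The point is that a server becomes just another resource you create with an HTTP call, from a browser-based portal or from your own code.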
Instagram is a classic example of a service that could not exist without the cloud. They grew by 10 million new users in just a few days’ time when they released their Android application earlier this year.
Source: http://instagram-engineering.tumblr.com/
Instagram is a tiny company. They have only three engineers who manage all of their infrastructure. Yet with just three people they were able to manage the incredible growth of their service with no loss in performance.
I know we’d all like to be building the next Instagram so we can get bought by Facebook, but let’s look at a scenario that’s more down to earth. Companies that build software have to build test labs to run unit, integration and load tests before they push a release live. If you think about what the load looks like for this workload, it would look something like this: load your app, run your tests, tests pass, push your updates live. Done. Then your servers sit idle until your next release.
If I run my own test lab, how many servers do I need? If I need 100 servers to run the job in 12 hours, then I need to buy 100 servers.
Note this doesn’t include ANYTHING else. People to maintain it, etc.
The concept is simple. If you want to scale vertically, deploy your app on a bigger server. In some cases you can simply throw larger or more VMs at a solution and scale. But in reality, for real scale you need to design your apps for scale and for the cloud.
Let’s look back at that YouTube example. For instance, what if I had an app that consisted of a website with the ability to upload videos, encode them, then make them available for people to view? If you had a single app like this one sitting on a single server and tried to scale it by replicating it across many different VMs, it wouldn’t work. You can’t scale an app of this type by simply throwing more hardware at it. Performance is still going to suck.
You need to design for the cloud. Let’s look back at that cat YouTube example. For instance, what if I had an app that consisted of a website with the ability to upload videos, encode them, then make them available for people to view? You need to separate the various computation pieces of your site into discrete and separate apps or services. This includes:
• A website that your users visit to view videos and to upload their own videos.
• A service that ingests new videos being uploaded.
• A service that then encodes them.
• A cache to stage them for quick retrieval by other users coming to watch videos of cats.
With each logical part of the app you can enable monitoring so that it scales independently of the other components, for more efficiency and higher performance.
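The split described above can be sketched in a few lines of Python. This is a toy, in-process stand-in: a real deployment would use a hosted queue service and separate machines, and the names here (`run_pipeline`, `encoder_count`, the fake "encoding" step) are hypothetical. What it shows is that the upload side only drops work on a queue, and the number of encoders can change without touching anything else.

```python
import queue
import threading


def run_pipeline(uploads, encoder_count=2):
    """Push uploads through an ingest queue to a pool of encoder workers."""
    ingest_queue = queue.Queue()   # stands in for a hosted message queue
    cache = {}                     # stands in for the cache tier
    cache_lock = threading.Lock()

    def encoder():
        while True:
            video = ingest_queue.get()
            if video is None:              # sentinel: no more work for this worker
                ingest_queue.task_done()
                break
            encoded = video.upper()        # placeholder for real video encoding
            with cache_lock:
                cache[video] = encoded     # stage the result for viewers
            ingest_queue.task_done()

    workers = [threading.Thread(target=encoder) for _ in range(encoder_count)]
    for worker in workers:
        worker.start()

    # The "website" only enqueues uploads; it never waits on an encoder.
    for video in uploads:
        ingest_queue.put(video)
    for _ in workers:
        ingest_queue.put(None)

    ingest_queue.join()
    for worker in workers:
        worker.join()
    return cache


if __name__ == "__main__":
    print(run_pipeline(["cat1.mov", "cat2.mov", "cat3.mov"], encoder_count=3))
```

Raising `encoder_count` is the in-process analogue of adding encoding servers: the web tier and the cache do not need to know how many there are.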
You can deal with issues of inconsistency in various ways. For instance, if your service has a workflow for new users to sign up, then you’ll have a consistency issue when the web server goes back to the slave DB for user data. One way of getting around this is to have the system send an email when the replication is complete. Many large sites and services use this exact technique.
Good blog post on this: http://blog.maxindelicato.com/2008/12/scalability-strategies-primer-database-sharding.html
You use SaaS services today!!! How many of you use Gmail? Well, if you use Gmail or any online email service, you are a user of SaaS applications. Many companies, especially smaller, newer ones, are also big fans of SalesForce. Installing and maintaining CRM software is very resource intensive, both in dollars to purchase it and in manpower to maintain it. Providing CRM software over the browser means zero installation, configuration or maintenance. It also makes licensing easier since you only pay for what you use. So the appeal of SaaS is that it lowers the barrier to entry down to the user level. But since we are all developers, I’m not going to spend too much time talking about SaaS. What I will say is that, as an application developer, it is a very appealing way to deliver what are normally complex applications to install or maintain. If you build software that requires some sort of on-premise installation, this is definitely a good option to consider.
Since we are developers, I want to spend the bulk of my time talking about non-SaaS providers. IaaS, or Infrastructure as a Service, is basically the ability to self-provision virtual machines on demand.
Far fewer options than Amazon. The load balancer supports round robin, weighted round robin, least connections, weighted least connections and random.
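For anyone unfamiliar with those load-balancing policies, here is a minimal Python sketch of three of them. This is my own illustration of the general techniques, not how any vendor implements them, and the function names and server names are made up. The weighted version is the naive kind that simply repeats a server according to its weight; production balancers usually interleave more smoothly.

```python
import itertools


def round_robin(servers):
    """Hand out servers in order, wrapping around forever."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Naive weighted round robin: repeat each server `weight` times per cycle.

    `weights` maps server name -> positive integer weight.
    """
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active):
    """Pick the server with the fewest open connections right now.

    `active` maps server name -> current connection count.
    """
    return min(active, key=active.get)


if __name__ == "__main__":
    rr = round_robin(["web1", "web2", "web3"])
    print([next(rr) for _ in range(4)])

    wrr = weighted_round_robin({"big": 2, "small": 1})
    print([next(wrr) for _ in range(6)])

    print(least_connections({"web1": 5, "web2": 2, "web3": 7}))
```

Round robin ignores load entirely, the weighted form lets bigger servers take proportionally more requests, and least connections reacts to what each server is actually handling at the moment.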
Engine Yard is similar in many ways to Heroku. It also runs on EC2. By the way, they also have a version designed for enterprises that runs on Terremark, which is an Operations Cloud Provider.
Moving Cloud from Hype To Reality
Mark Brown
Community Manager, Windows Azure
Microsoft Corporation
@markjbrown
• What is the cloud/characteristics
• Differences between cloud and hosting
• What makes the cloud possible today?
• Considerations for building cloud apps
• IaaS Offerings
• PaaS Offerings
What is the Cloud?
• Place to run apps, store data, and more
• Offers self-service provisioning of resources
• Provides granular, elastic allocation of resources
• Charged for the resources you use
Cloud Characteristics
• Provides high reliability
• Runs on commodity hardware
• Should allow for elasticity
• Should provide economies of scale
[Chart axes: Resources, Cost]
From Hosting to Cloud Computing
Capability     Hosting      Cloud
Capacity       Reserved     On Demand
Payment        Per Month    Per Hour
Provisioning   Managed      Self-Service
Want to stream a movie in 1998?
[Chart: US Internet transit prices per Mbps, 1998 to 2012; callouts: $270 and 5 cents]
There’s more bandwidth too
[Chart: average bitrates for streaming media, 1998 to 2011; callout: 90x]
Virtualization
• 2000 – FreeBSD jails released
• 2001 – VMWare: first server virtualization
• 2003 – Xen: first open source x86 hypervisor
  – Microsoft releases Virtual PC for Windows
• 2006 – VMWare Server offered for free
  – Microsoft: Virtual PC offered for free
  – Microsoft – Xen hypervisor compatibility
REST/Services
• Simplified managing resources
• Provisioning:
  – via a browser
  – inside your app
• Monitoring and management
• Makes self-service possible
• Final piece that ties it all together
Before the Cloud. Could I…
• Create an iPhone app that could support exponential user growth?
• Do big data or scientific computing without investing in a massive data center?
• Create the next hot game on Facebook?
YouTube Blog: Today 72 hours of video are uploaded to the site every minute
Could NetFlix or YouTube even exist?
Insane user growth
[Chart: Instagram users, Oct 2010 to May 2012; vertical axis 0 to 45,000,000]
3 engineers + Cloud
• 100+ instances running:
  – 3 nginx load balancers
  – 25 Django app servers
  – 12 hot Postgres, 12 replica
  – Redis, memcached, solr, workers, monitor, etc.
• Not possible with traditional hosting or on-premise!!!
Test Lab Utilization
[Chart: test lab utilization by month, Jan through the following Jan; vertical axis 0 to 120]
How Many Servers do I buy?
[Chart: utilization by month, Jan through the following Jan; vertical axis 0 to 120; callout: This Many!!!]
Some quick math
• Comparing just the servers themselves
• Assume: test needs to run in 12 hr once a month
• Requires: 100 servers, 8 core, x64, 68 GB RAM
• If you bought servers yourself: $5000 * 100 = $500,000
• Cloud Spot Instances: 100 quad, XL High CPU, 12 hr/mo = $2200
• $2200/month * 3 years = $80K
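The arithmetic on this slide can be reproduced with a short Python sketch. All the dollar figures come from the slide; the 30-day month used for the utilization estimate is my own simplifying assumption.

```python
# Figures from the slide.
SERVER_PRICE = 5000            # dollars per purchased server
SERVER_COUNT = 100             # servers needed to finish the run in 12 hours
CLOUD_COST_PER_MONTH = 2200    # dollars for 100 spot instances, 12 hr once a month
YEARS = 3
RUN_HOURS_PER_MONTH = 12


def purchase_cost(price, count):
    """Up-front cost of buying the servers (hardware only)."""
    return price * count


def cloud_cost(monthly, years):
    """Total cloud spend over the same period."""
    return monthly * 12 * years


def utilization(run_hours, days_in_month=30):
    """Fraction of the month the owned servers are busy (assumes a 30-day month)."""
    return run_hours / (days_in_month * 24)


bought = purchase_cost(SERVER_PRICE, SERVER_COUNT)
rented = cloud_cost(CLOUD_COST_PER_MONTH, YEARS)

print(f"Buy servers:     ${bought:,}")
print(f"Cloud, {YEARS} years:  ${rented:,}")
print(f"Owned servers are busy {utilization(RUN_HOURS_PER_MONTH):.1%} of the month")
```

The three-year cloud figure comes out at $79,200, which the slide rounds to $80K, against $500,000 for hardware that would sit idle for all but about 12 hours a month.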
Design for the Cloud
• Build loosely coupled apps
• Service enabled
• Use Messaging & Queues
Other Considerations:
• Database replication
• Database sharding
• noSQL for high performance storage
Design for the Cloud
[Diagram labels: Cat Video; Load Balancers; Web Servers; My Site (serves videos, ingests, encodes, caches); Cache]
Design for the Cloud
• Loosely Coupled
• Service Enabled
• Messaging/Queues
• Scales independently
• No blocking
[Diagram labels: Cat Video; Load Balancers; Web Servers; Cache; Ingest Servers; Encoding Servers]
Database Replication
• Master-Slave replication
• Writes on Master
• Reads on Slaves
• Heavy read, light write
• Not always consistent (lag)
[Diagram: writes go to one Master; reads go to three Slaves]
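The "not always consistent (lag)" point is easy to see in a toy model. The Python sketch below is an in-memory illustration only; the `Master` and `Replica` classes and the `catch_up` method are hypothetical names, not any database's API. Writes land on the master right away, and a replica only sees them once it has replayed the master's log.

```python
class Master:
    """Accepts all writes and keeps an ordered log for replicas to replay."""

    def __init__(self):
        self.data = {}
        self.log = []

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Serves reads from its own copy, which may trail the master."""

    def __init__(self, master):
        self.master = master
        self.data = {}
        self.applied = 0        # number of log entries replayed so far

    def read(self, key):
        return self.data.get(key)

    def catch_up(self):
        """Replay any log entries this replica has not applied yet."""
        for key, value in self.master.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.master.log)


if __name__ == "__main__":
    master = Master()
    replica = Replica(master)

    master.write("user:42", "Marco")    # new user signs up
    print(replica.read("user:42"))      # replica has not replayed the write yet
    replica.catch_up()                  # replication completes
    print(replica.read("user:42"))      # now the new user is visible
```

Anything that reads from a replica straight after a write, such as a sign-up flow, has to tolerate or work around that window.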
Database Sharding
• Horizontal partitioning
• Country/Continent
• Postal Codes
• Date Range
• Consistent hashing

Name    State        Zipcode
Marco   Ontario      M4B 2E1
Cal     Tennessee    37201
Keith   Texas        73301
Mark    Washington   98034
Database Sharding
• Shard Index: Shard Id, Partitioned Data, Connection String
• Consistent w/lazy writes or batch
• Lots of support:
  – Redis
  – MongoDB
  – Win Azure DB
[Diagram labels: Marco; Shard Index; Shards]
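Consistent hashing, mentioned on the previous sharding slide, is one way to decide which shard a record lives on. Here is a minimal Python sketch of the general technique. The `HashRing` class, the shard names and the choice of 100 virtual points per shard are my own assumptions for illustration, not taken from any of the products listed.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring mapping keys to shard names."""

    def __init__(self, shards, points_per_shard=100):
        # Each shard is placed at many pseudo-random points on the ring so
        # that keys spread fairly evenly across shards.
        ring = []
        for shard in shards:
            for i in range(points_per_shard):
                ring.append((self._hash(f"{shard}#{i}"), shard))
        ring.sort()
        self._ring = ring
        self._positions = [position for position, _ in ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def shard_for(self, key):
        """Walk clockwise from the key's position to the next shard point."""
        index = bisect.bisect(self._positions, self._hash(key)) % len(self._ring)
        return self._ring[index][1]


if __name__ == "__main__":
    names = ["Marco", "Cal", "Keith", "Mark"]
    three = HashRing(["shard-a", "shard-b", "shard-c"])
    four = HashRing(["shard-a", "shard-b", "shard-c", "shard-d"])
    for name in names:
        print(name, three.shard_for(name), "->", four.shard_for(name))
```

The useful property is what happens when you add a shard: the only keys that move are the ones that land on the new shard, so most of the data stays where it is. A plain `hash(key) % shard_count` scheme would reshuffle almost everything.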
noSQL
• Non-relational data store
• Super fast
• Good partitioning support
• Types:
  – Wide column: Hadoop, Cassandra
  – Document store: mongoDB, CouchDB, ravenDB
  – Key/value: membase, redis, DynamoDB
IaaS – Infrastructure as a Service
PaaS – Platform as a Service
SaaS – Software as a Service

IaaS = Bring your own OS
PaaS = Bring your own App
SaaS = Bring your own Customers
SaaS
• You use this today
• Your mom might use it too
• Also very popular for mom
IaaS
• Virtualized OS Images
• Storage, Load Balancing, Databases, CDN, identity, caching, messaging + other value-add services
IaaS Strengths/Weaknesses
Strengths:
• Familiar technologies
• Supports many scenarios
• Limited code lock-in
• Can control and configure environment
Weaknesses:
• Must control and configure environment
• Requires administrative skills to use
Amazon Web Services
• Select an OS from massive library of images
  – Many flavors of Linux, Windows
• SQS: Queue hosted as a service on AWS
  – Unlimited queues & messages
  – 64k message size, 14 day TTL
• Elastic Load Balancing:
  – Distributes traffic across your EC2 instances
  – Detects/removes unhealthy instances & reroutes
  – Works in a single AZ or across Availability Zones
Amazon Web Services
• Auto Scaling:
  – Monitor instance health
  – Provision/de-provision EC2 instances
  – Use with ELB to monitor latency to ensure adequate compute to handle demand
• Content Delivery Network:
  – Edge cache site and resources
  – Supports dynamic (per user) content
  – Pay for only what you use
  – Push everything, pay for what’s accessed
Amazon Web Services
• RDS: MySQL, Oracle, MSSQL
• DynamoDB: noSQL, SSD, auto-scale, auto-replicated
• ElastiCache: Memcached compliant
• S3: Raw data storage, unlimited objects, 1k-5TB
• Elastic BeanStalk (PaaS): VM-based
  – Java: Tomcat, AWS Toolkit for Eclipse, WAR, CLI
  – PHP: Apache, git
  – .NET: IIS, AWS Toolkit for VS or WebDeploy 2.0
RackSpace
• Far fewer services than AWS
• Number of services building on OpenStack (in beta)
• Load Balancing:
  – Supports: RR, WRR, LC, WLC, random
  – Supports SSL Termination
  – Detects/removes unhealthy instances & reroutes
  – Connection throttling
  – Available in UK for geo-scale
• Database: MySQL (beta)
RackSpace
• Files:
  – Built on OpenStack (beta)
  – RackSpace hosted and Akamai option as well
• Monitoring: (also beta)
• Web Sites:
  – PHP: Apache, MySQL, Linux
  – .NET: IIS, MSSQL, Windows
PaaS
• Virtualized OS or multi-tenant instance
• Load Balancing, DB, CDN, Storage (sometimes)
• Varies widely from vendor to vendor
The Benefits of PaaS
• Less Complex
  – There’s less work for developers to do
  – Go from idea to availability more quickly
• Less Expensive
  – Less admin work to do
  – Hire fewer ops support
• Less Risky
  – PaaS platforms do more
  – Have to design for the PaaS
  – Fewer opportunities for designing apps incorrectly
The Drawbacks of PaaS
• Learning curve
  – You have to learn their PaaS platform
• Less control
  – Platform does more, you do less, thus less control
• Vendor lock-in
  – Doesn’t match your on-premise test servers
• Migration
  – Moving existing apps is difficult
Heroku
• Built on AWS (XL High Mem Instances)
• Nginx reverse proxy (SSL Term), extra $ for SSL
• Varnish cache
• Languages: Ruby, Node.js, Clojure, Java, Python, PHP*
• Dynamically spins up and shuts down apps (dynos)
• Dyno gets 512MB memory
• Deploy: git->(app, gems)->slug->compiled->deploy
• Database: PostgreSQL
• Addons: Memcached, SSL
• Read only FS
Engine Yard
• Also built on AWS
• Languages: Ruby, node.js, PHP (Orchestra)
• Free: shared multi-tenant. Paid: VMs
• Nginx: reverse proxy, cache & web server (no htaccess)
• Deploy: git & SVN
• Read only FS, use S3
• Addons: couchDB, memcached, MongoDB, solr, ZeroMQ…
• Frameworks: Zend, Lithium, Symfony/2, FRAPI, Solar
PHPCloud
• Dev environment in the cloud (different than others)
• Apps hosted in Zend App Fabric
• Choice of clouds to deploy production app to
  – AWS, RackSpace, IBM, HP
• Snapshot – cool for sharing environments
• Lots of features from Zend Server available:
  – Monitoring
  – Autoscaling
  – CodeTracing
  – Queues
Windows Azure
• Compute:
  – Web Role, Worker Role
  – Non-persistent VM
• Database: SQL Azure
  – Access from cloud or on-premise
  – OLAP
  – Sharding via Federation
  – Data Sync (uni/bi-directional)
Windows Azure
• Storage:
  – Blob: Unstructured/binary (video, images, etc.)
  – Table: key/value (noSQL)
  – Drives: blob mounted as single volume NTFS VHD
  – 3x auto-replication within DC
  – Blobs & Tables geo-replicated automatically
  – Simple CDN integration
Windows Azure
• Service Bus:
  – Queues
  – Topics/Subscriptions: pub/sub
• Caching, CDN
• Traffic Manager: (Load Balancer)
  – Same or different DCs, geo-scale
  – Auto-failover
• SDKs: .NET, Java, Node.js, PHP
The Event
Join us at Madrone Studios in San Francisco on June 7th. You’ll connect with senior technologists from Microsoft and the Bay Area who will introduce you to Windows Azure. Come see the unveiling of the latest in Microsoft’s cloud-based technology and be part of a dynamic live experience.
http://meetwindowsazure.com