Architecting Cloudy Applications


Deck presented at the 2010 SOA & Cloud Symposium

  • Microsoft's Windows Azure platform is a virtualized and abstracted application platform that can be used to build highly scalable and reliable applications, with Java. The environment consists of a set of services such as NoSQL table storage, blob storage, queues, relational database service, internet service bus, access control, and more. Java applications can be built using these services via Web services APIs, and your own Java Virtual Machine, without worrying about the underlying server OS and infrastructure. Highlights of this session will include: an overview of the Windows Azure environment; how to develop and deploy Java applications in Windows Azure; how to architect horizontally scalable applications in Windows Azure.
  • Picture source: http://en.wikipedia.org/wiki/Amdahl%27s_law
  • To build for big scale – use more of the same pieces, not bigger pieces; though a different approach may be needed
  • Source: http://danga.com/words/2007_06_usenix/usenix.pdf
  • Source: http://highscalability.com/blog/2007/11/13/flickr-architecture.html
  • Source: http://www.slideshare.net/jboutelle/scalable-web-architectures-w-ruby-and-amazon-s3
  • Source: http://www.slideshare.net/netik/billions-of-hits-scaling-twitter
  • Source: http://highscalability.com/blog/2009/6/27/scaling-twitter-making-twitter-10000-percent-faster.html
  • Source: http://highscalability.com/blog/2009/10/12/high-performance-at-massive-scale-lessons-learned-at-faceboo-1.html
  • Picture source: http://pdp.protopak.net/Belltheous90/DeathStarII.gif
  • Architecting Cloudy Applications

    1. Architecting Cloudy Applications
       David Chou, david.chou@microsoft.com, blogs.msdn.com/dachou
    2. > Introduction: Size matters
         • Facebook (2009): +200B pageviews /month; >3.9T feed actions /day; +300M active users; >1B chat messages /day; 100M search queries /day; >6B minutes spent /day (ranked #2 on Internet); +20B photos, +2B/month growth; 600,000 photos served /sec; 25TB log data /day processed through Scribe; 120M queries /sec on memcache
         • Twitter (2009): 600 requests /sec; avg 200-300 connections /sec, peak at 800; MySQL handles 2,400 requests /sec; 30+ processes for handling odd jobs; process a request in 200 milliseconds in Rails; average time spent in the database is 50-100 milliseconds; +16 GB of memcached
         • Google (2007): +20 petabytes of data processed /day by +100K MapReduce jobs; 1 petabyte sort took ~6 hours on ~4K servers replicated onto ~48K disks; +200 GFS clusters, each at 1-5K nodes, handling +5 petabytes of storage; ~40 GB /sec aggregate read/write throughput across the cluster; +500 servers for each search query < 500ms; >1B views /day on YouTube (2009)
         • MySpace (2007): 115B pageviews /month; 5M concurrent users @ peak; +3B images, mp3, videos; +10M new images /day; 160 Gbit/sec peak bandwidth
         • Flickr (2007): +4B queries /day; +2B photos served; ~35M photos in squid cache; ~2M photos in squid's RAM; 38k req/sec to memcached (12M objects); 2 PB raw storage; +400K photos added /day
       Source: multiple articles, High Scalability, http://highscalability.com/
    3. > Introduction: Cloud levels the playing field
         • 2007: founded by 6 people
         • 2008: $29M funding from VC
         • 2009: revenue of $270M; $180M funding from Digital Sky Technologies
         • 2010: 1,000+ employees; $300M funding from Google and Softbank
         • Active unique players: 75M monthly; 60M daily; 1M daily 4 days after launch; 10M after 60 days
         • Hosted in Amazon Web Services: 12,000 EC2 nodes; 3 Gigabits/sec of traffic between FarmVille and Facebook (at peak); caching cluster serves another 1.5 Gigabits/sec to the application
       Source: “How FarmVille Scales to Harvest 75 Million Players a Month”, 2010.02.08, Todd Hoff, http://highscalability.com/blog/2010/2/8/how-farmville-scales-to-harvest-75-million-players-a-month.html
    4. > Introduction: Cloud computing
         • Characteristics: on-demand self-service; broad network access; resource pooling; rapid elasticity; measured service
         • Service models: Software as a service; Platform as a service; Infrastructure as a service
         • Deployment models: private cloud; community cloud; public cloud; hybrid cloud
       “Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models.”
       Source: The NIST Definition of Cloud Computing, Version 15, 2009.10.07, Peter Mell and Tim Grance, http://csrc.nist.gov/groups/SNS/cloud-computing/cloud-def-v15.doc
    5. > Introduction: Service delivery models
       Stack layers, top to bottom: Applications, Data, Runtime, Middleware, O/S, Virtualization, Servers, Storage, Networking
         • (On-Premise): you manage every layer
         • Infrastructure (as a Service): you manage Applications, Data, Runtime, Middleware, O/S; managed by vendor: Virtualization, Servers, Storage, Networking
         • Platform (as a Service): you manage Applications, Data; managed by vendor: Runtime, Middleware, O/S, Virtualization, Servers, Storage, Networking
         • Software (as a Service): every layer is managed by vendor
    6. > Architecting for Scale > Vertical Scaling: Traditional scale-up architecture
       Common characteristics: synchronous processes; sequential units of work; tight coupling; stateful; pessimistic concurrency; clustering for HA; vertical scaling
       [Diagram: units of work passing through web, app server, and data store tiers]
    7. > Architecting for Scale > Vertical Scaling: Traditional scale-up architecture
       To scale, get bigger servers: expensive; has scaling limits; inefficient use of resources
       [Diagram: web, app server, and data store tiers]
    8. > Architecting for Scale > Vertical Scaling: Traditional scale-up architecture
       When problems occur: bigger failure impact
       [Diagram: web, app server, and data store tiers]
    9. > Architecting for Scale > Vertical Scaling: Traditional scale-up architecture
       When problems occur: bigger failure impact; more complex recovery
       [Diagram: web, app server, and data store tiers]
    10. > Architecting for Scale > Fundamental Concepts: CAP (Consistency, Availability, Partition) Theorem
        At most two of these properties for any shared-data system
          • Consistency + Availability: high data integrity; single site, cluster database, LDAP, xFS file system, etc.; 2-phase commit, data replication, etc.
          • Consistency + Partition: distributed database, distributed locking, etc.; pessimistic locking, minority partition unavailable, etc.
          • Availability + Partition: high scalability; distributed cache, DNS, etc.; optimistic locking, expiration/leases, etc.
        [Diagram: pairings of C, A, and P]
        Source: “Towards Robust Distributed Systems”, Dr. Eric A. Brewer, UC Berkeley
    16. > Architecting for Scale > Horizontal scaling: Use more pieces, not bigger pieces
          • LEGO 7778 Midi-scale Millennium Falcon: 9.3 x 6.7 x 3.2 inches (L/W/H); 356 pieces
          • LEGO 10179 Ultimate Collector's Millennium Falcon: 33 x 22 x 8.3 inches (L/W/H); 5,195 pieces
    18. > Architecting for Scale > Horizontal scaling: Scale-out architecture
        Common characteristics: small logical units of work; loosely-coupled processes; stateless; event-driven design; optimistic concurrency; partitioned data; redundancy fault-tolerance; re-try-based recoverability
        [Diagram: web, app server, and data store nodes]
    19. > Architecting for Scale > Horizontal scaling: Scale-out architecture
        To scale, add more servers, not bigger servers
        [Diagram: many web, app server, and data store nodes side by side]
    20. > Architecting for Scale > Horizontal scaling: Scale-out architecture
        When problems occur: smaller failure impact; higher perceived availability
        [Diagram: many web, app server, and data store nodes]
    21. > Architecting for Scale > Horizontal scaling: Scale-out architecture
        When problems occur: smaller failure impact; higher perceived availability; simpler recovery
        [Diagram: many web, app server, and data store nodes]
    22. > Architecting for Scale > Horizontal scaling: Scale-out architecture + distributed computing
        Scalable performance at extreme scale: asynchronous processes; parallelization; smaller footprint; optimized resource usage; reduced response time; improved throughput
        [Diagram: web, app server, and data store nodes, labeled with parallel tasks, async tasks, and perceived response time]
    23. > Architecting for Scale > Horizontal scaling: Scale-out architecture + distributed computing
        When problems occur: smaller units of work; decoupling shields impact
        [Diagram: many web, app server, and data store nodes]
    24. > Architecting for Scale > Horizontal scaling: Scale-out architecture + distributed computing
        When problems occur: smaller units of work; decoupling shields impact; even simpler recovery
        [Diagram: many web, app server, and data store nodes]
    25. > Architecting for Scale > Cloud Architecture Patterns: Live Journal (from Brad Fitzpatrick, then Founder at Live Journal, 2007)
        [Diagram components: Web Frontend; Apps & Services; Partitioned Data; Distributed Cache; Distributed Storage]
    26. > Architecting for Scale > Cloud Architecture Patterns: Flickr (from Cal Henderson, then Director of Engineering at Yahoo, 2007)
        [Diagram components: Web Frontend; Apps & Services; Distributed Storage; Distributed Cache; Partitioned Data]
    27. > Architecting for Scale > Cloud Architecture Patterns: SlideShare (from John Boutelle, CTO at Slideshare, 2008)
        [Diagram components: Web Frontend; Apps & Services; Distributed Cache; Partitioned Data; Distributed Storage]
    28. > Architecting for Scale > Cloud Architecture Patterns: Twitter (from John Adams, Ops Engineer at Twitter, 2010)
        [Diagram components: Web Frontend; Apps & Services; Partitioned Data; Queues; Async Processes; Distributed Cache; Distributed Storage]
    29. > Architecting for Scale > Cloud Architecture Patterns: Facebook (from Jeff Rothschild, VP Technology at Facebook, 2009)
        [Diagram components: Web Frontend; Apps & Services; Distributed Cache; Parallel Processes; Partitioned Data; Async Processes; Distributed Storage]
        2010 stats (Source: http://www.facebook.com/press/info.php?statistics)
          • People: +500M active users; 50% of active users log on in any given day; people spend +700B minutes /month
          • Activity on Facebook: +900M objects that people interact with; +30B pieces of content shared /month
          • Global Reach: +70 translations available on the site; ~70% of users outside the US; +300K users helped translate the site through the translations application
          • Platform: +1M developers from +180 countries; +70% of users engage with applications /month; +550K active applications; +1M websites have integrated with Facebook Platform; +150M people engage with Facebook on external websites /month
    30. > Architecting for Scale: Fundamental concepts
        Vertical scaling still works
    31. > Architecting for Scale: Fundamental concepts
        Horizontal scaling for cloud computing
          • Small pieces, loosely coupled
          • Distributed computing best practices: asynchronous processes (event-driven design); parallelization; idempotent operations (handle duplicates); de-normalized, partitioned data (sharding); shared nothing architecture; optimistic concurrency; fault-tolerance by redundancy and replication; etc.
    32. > Architecting for Scale > Fundamental Concepts: Asynchronous processes & parallelization
          • Defer work as late as possible: return to user as quickly as possible; event-driven design (instead of request-driven)
          • Cloud computing friendly: distributes work to more servers (divide & conquer); smaller resource usage/footprint; smaller failure surface; decouples process dependencies
          • Windows Azure platform services: Queue Service; AppFabric Service Bus; inter-node communication
        [Diagram components: Web Roles; Queues; Service Bus; Worker Roles]
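
The "defer work, return to the user quickly" idea on the asynchronous processes slide can be sketched in a few lines. This is only an illustration under stated assumptions: an in-process `queue.Queue` and threads stand in for the Queue Service and for Web/Worker Roles, and the names `handle_request`, `worker_loop`, and `run_demo` are hypothetical, not part of any Azure API.

```python
import queue
import threading


def handle_request(work_queue, order_id):
    """Web-role stand-in: enqueue the slow work and answer immediately."""
    work_queue.put(order_id)
    return {"order_id": order_id, "status": "accepted"}


def worker_loop(work_queue, results, lock):
    """Worker-role stand-in: process messages until a None sentinel arrives."""
    while True:
        order_id = work_queue.get()
        try:
            if order_id is None:
                return
            processed = f"processed-{order_id}"  # placeholder for the slow work
            with lock:
                results.append(processed)
        finally:
            work_queue.task_done()


def run_demo(num_workers=3, num_requests=10):
    work_queue = queue.Queue()
    results = []
    lock = threading.Lock()
    workers = [
        threading.Thread(target=worker_loop, args=(work_queue, results, lock))
        for _ in range(num_workers)
    ]
    for worker in workers:
        worker.start()
    # The "web role" returns before any of the work has necessarily run.
    responses = [handle_request(work_queue, i) for i in range(num_requests)]
    work_queue.join()  # wait until every queued message has been handled
    for _ in workers:
        work_queue.put(None)  # one shutdown sentinel per worker
    for worker in workers:
        worker.join()
    return responses, sorted(results)
```

Adding workers raises throughput without touching the request path, which is the decoupling the slide describes.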
    33. > Architecting for Scale > Fundamental Concepts: Partitioned data
          • Shared nothing architecture: transaction locality (partition based on an entity that is the “atomic” target of the majority of transactional processing); loosened referential integrity (avoid distributed transactions across shard and entity boundaries); design for dynamic redistribution and growth of data (elasticity)
          • Cloud computing friendly: divide & conquer; size growth with virtually no limits; smaller failure surface
          • Windows Azure platform services: Table Storage Service; SQL Azure
        [Diagram components: Web Roles; Queues; Worker Role; three Relational Databases; read and write paths]
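
As a rough sketch of the partitioned data slide, the snippet below routes every row for one entity to the same shard so that work on that entity stays local. It assumes an in-memory list of dictionaries in place of real databases; `ShardedStore` and its methods are hypothetical names. The `partition_key`/`row_key` split loosely mirrors the PartitionKey/RowKey idea in Table Storage.

```python
import hashlib


class ShardedStore:
    """Toy shared-nothing store: each shard is an independent dict."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def shard_for(self, partition_key):
        # A stable hash keeps the mapping identical across processes and runs.
        digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, partition_key, row_key, value):
        shard = self.shards[self.shard_for(partition_key)]
        shard.setdefault(partition_key, {})[row_key] = value

    def get_partition(self, partition_key):
        # All rows for one entity live on one shard, so this touches one node.
        shard = self.shards[self.shard_for(partition_key)]
        return dict(shard.get(partition_key, {}))
```

Plain modulo hashing moves most keys when the shard count changes, so a design that needs the dynamic redistribution mentioned on the slide would more likely use consistent hashing or a lookup directory instead.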
    34. > Architecting for Scale > Fundamental Concepts: Idempotent operations
          • Repeatable processes: allow duplicates (additive); allow re-tries (overwrite); reject duplicates (optimistic locking); stateless design
          • Cloud computing friendly: resiliency
          • Windows Azure platform services: Queue Service; AppFabric Service Bus
        [Diagram components: Service Bus; Worker Roles]
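
Two of the repeatable-process strategies on the idempotent operations slide (re-tries by overwrite, duplicate rejection) plus a version check for optimistic locking might look like the sketch below. It is an illustration only: `Account`, `VersionConflict`, and the message-ID scheme are assumptions made for the example, not something the deck specifies.

```python
class VersionConflict(Exception):
    """Raised when a writer's view of the record is stale."""


class Account:
    def __init__(self):
        self.balance = 0
        self.version = 0
        self.seen_message_ids = set()

    def set_balance(self, value):
        """Overwrite: replaying the same call leaves the same state."""
        self.balance = value

    def apply_deposit(self, message_id, amount):
        """Reject duplicates: a redelivered message is ignored."""
        if message_id in self.seen_message_ids:
            return False
        self.seen_message_ids.add(message_id)
        self.balance += amount
        self.version += 1
        return True

    def update_if_version(self, expected_version, new_balance):
        """Optimistic locking: write only if nobody changed the record."""
        if self.version != expected_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {self.version}"
            )
        self.balance = new_balance
        self.version += 1
```

With operations shaped like these, a worker that crashes midway can simply re-read the message and try again, which is what makes re-try-based recovery safe.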
    35. > Architecting for Scale > Fundamental Concepts: CAP (Consistency, Availability, Partition) Theorem
        At most two of these properties for any shared-data system
          • Consistency + Availability: high data integrity; single site, cluster database, LDAP, xFS file system, etc.; 2-phase commit, data replication, etc.
          • Consistency + Partition: distributed database, distributed locking, etc.; pessimistic locking, minority partition unavailable, etc.
          • Availability + Partition: high scalability; distributed cache, DNS, etc.; optimistic locking, expiration/leases, etc.
        [Diagram: pairings of C, A, and P]
        Source: “Towards Robust Distributed Systems”, Dr. Eric A. Brewer, UC Berkeley
    41. > Architecting for Scale > Fundamental Concepts: Hybrid architectures
          • Scale-out (horizontal): BASE (Basically Available, Soft state, Eventually consistent); availability first, best effort; aggressive (optimistic); shared nothing; favor extreme size; e.g., user requests, data collection & processing, etc.
          • Scale-up (vertical): ACID (Atomicity, Consistency, Isolation, Durability); focus on “commit”; conservative (pessimistic); transactional; favor accuracy/consistency; e.g., BI & analytics, financial processing, etc.
        Most distributed systems employ both approaches
    42. Thank you!
        David Chou, david.chou@microsoft.com, blogs.msdn.com/dachou
        © 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
        The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.