Architecting Cloud Applications - the essential checklist

Anna Liu - Associate Professor in Services Engineering, School of Computer Science and Engineering, University of NSW. Keynote presentation at the Australian Architecture Forum 2009.


  1. 1. Architecting Cloud Applications - the essential checklist - Anna Liu, Associate Professor in Services Engineering, School of Computer Science and Engineering, University of New South Wales, annaliu@cse.unsw.edu.au
  2. 2. Architect's Checklist 1. Remember the 'Why' 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don't ignore the network layer 7. Performance attributes = application profile + platform availability + network latency 8. Plan for monitoring and management 9. Understand interoperability and standards 10. Believe that Cloud Computing is not just for the long tail
  4. 4. Why Cloud Computing • Economies of scale • Pay per usage • Handling Big Data • Service Delivery platform • Innovative, engaging user experience • Realising Green IT initiatives
  6. 6. Cloud Platform Architecture (layered diagram, top to bottom)
     • Cloud Applications (SaaS): web apps, data-intensive, CDNs, social, CRM, etc.
     • PaaS
       – Programming runtime, frameworks, application services
       – Storage, compute, Map-Reduce; workflow, Web 2.0, collaboration, mashups
       – Deploy, scheduling, fault management, monitoring, allocation, security
       – Automatic scale, selection, coordination, messaging
       – Data organisation techniques, replication, load balancing
     • IaaS: virtualisation, resource management, routing
     • Datacentres
     • Alongside the layers: Monitoring/Management Tools; Design and Development Tools
  8. 8. Different Platforms with Different Target Audiences
     • Google App Engine
       – Caters for web applications
       – < 30 sec compute time
       – PaaS shields you from lots of infrastructure complexity
     • Microsoft Azure
       – More general purpose
       – Optimised for .NET
       – Software-plus-services strategy caters to enterprise scenarios
     • Amazon EC2/S3/SimpleDB
       – Virtual compute, storage on demand
       – IaaS provides you with lots of flexibility
       – Third-party innovation on top to enhance the application development experience (e.g. Red Hat/JBoss, MySQL, IBM WebSphere, Appistry)
  10. 10. Auto scaling behind the scenes
      • Amazon EC2
        – CloudWatch: view into VM instance server utilisation details, operational performance, disk reads and writes, network
        – Elastic Load Balancer: distributes apps across EC2 instances, controls request load-balancing across single or multiple cloud sites, makes provisioning-related decisions based on dynamic monitoring data reported by CloudWatch
        – Developers specify preconditions, e.g. average CPU utilisation
      • Microsoft Azure
        – Azure Fabric Controller (FC): monitors, maintains and provisions machines to host applications
        – Web roles, worker roles, instance-count configuration parameters
  11. 11. Auto scaling behind the scenes
      • Google App Engine
        – Handles auto scaling and load balancing of application services based on web traffic
        – Requests/task execution limited to 30 seconds
        – Moved from Tomcat to Jetty to reduce memory footprint (no need for session handler)
        – Fault tolerance and persistence of stored data through distributed replication
        – GAE serves static web content, hence no additional implementation to handle checkpointing and replication to re-instantiate execution state of processes
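The precondition-based scaling mentioned for EC2 (for example, average CPU utilisation) can be illustrated with a minimal sketch. The function name, thresholds and instance limits below are hypothetical and do not correspond to any provider's API; the point is only that the rule is a plain threshold check over monitoring data, not magic.

```python
from statistics import mean


def desired_instances(cpu_samples, current, scale_out_at=70.0, scale_in_at=30.0,
                      min_instances=1, max_instances=10):
    """Return the instance count a simple threshold rule would ask for.

    cpu_samples: recent CPU utilisation readings in percent (assumed input).
    All thresholds and limits are illustrative defaults, not provider values.
    """
    avg = mean(cpu_samples)
    if avg > scale_out_at:
        target = current + 1      # sustained high load: add one instance
    elif avg < scale_in_at:
        target = current - 1      # sustained low load: remove one instance
    else:
        target = current          # inside the comfort band: do nothing
    # Never go below the floor or above the ceiling.
    return max(min_instances, min(max_instances, target))
```

A rule like this reacts only after the average has moved, so there is always a lag between a traffic spike and new capacity being available.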
  13. 13. ACID no more? “Eventual Consistency: Amazon SimpleDB keeps multiple copies of each domain. When data is written or updated (using PutAttributes, DeleteAttributes, CreateDomain or DeleteDomain) and Success is returned, all copies of the data are updated. However, it takes time for the update to propagate to all storage locations. The data will eventually be consistent, but an immediate read might not show the change. Consistency is usually reached within seconds, but a high system load or network partition might increase this time. Repeating a read after a short time should return the updated data.” - Amazon Developer Guide, 2007-11-07
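The advice in the quote ("repeating a read after a short time should return the updated data") can be sketched as a bounded re-read loop. The store below is a toy in-memory simulation written for this example, not SimpleDB; its `lag` parameter (the number of stale reads before a write becomes visible) is an assumption made purely so the behaviour can be demonstrated.

```python
import time


class EventuallyConsistentStore:
    """Toy store: a write becomes visible only after `lag` stale reads."""

    def __init__(self, lag=2):
        self._visible = {}
        self._pending = {}   # key -> (value, stale reads remaining)
        self._lag = lag

    def put(self, key, value):
        self._pending[key] = (value, self._lag)

    def get(self, key):
        if key in self._pending:
            value, remaining = self._pending[key]
            if remaining <= 0:
                # Propagation finished: the write is now visible.
                self._visible[key] = value
                del self._pending[key]
            else:
                self._pending[key] = (value, remaining - 1)
        return self._visible.get(key)


def read_until(store, key, expected, attempts=5, delay=0.0):
    """Re-read until the expected value shows up; return the attempt number or None."""
    for attempt in range(1, attempts + 1):
        if store.get(key) == expected:
            return attempt
        time.sleep(delay)
    return None
```

Bounding the number of attempts matters: under load or a network partition the caller needs a fallback rather than an endless loop.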
  14. 14. CAP Theorem
      • Three properties of shared-data systems
        – Consistency: once an update is made, all observers see it
        – Availability: all database transactions should be processed accurately and promptly
        – Partition tolerance: the system tolerates network partitions
      • CAP Theorem
        – Only two of the three properties can be achieved at any time
        – Network partitions are a given in distributed systems
        – So you have to pick between consistency and availability
  15. 15. Relational no more?
      • Google App Engine's datastore
        – Select can be performed on one table only
        – Intentionally does not support Join (inefficient when queries span across machines)
        – Allows disks to fail without the system failing
        – Cannot easily port over an existing enterprise relational DB
      • Microsoft Azure
        – Retiring the previous SSDS (no transactional support then)
        – Azure SQL Services to replace SSDS with relational features and transactions (Tx)
      • Amazon
        – S3 for big storage scenarios
        – Have your own relational DB in the cloud!
        – Interesting to investigate failover/scalability features here...
  16. 16. What does this mean? • Data reorganisation/restructuring required • Understand the design trade-offs (scalability versus portability/interoperability at the data layer) • Shopping carts, reference data, vs transactional data/updates, ACID vs BASE • Data portability might be tough for a while • I'm revising my university lecture notes! So you had better re-architect your app and data!
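One common form of the data restructuring called for here, given a datastore without joins, is to join in application code or to denormalise at write time. The sketch below uses made-up order and customer dictionaries; the entity shapes and function names are illustrative assumptions, not any platform's data model.

```python
def join_in_app(orders, customers_by_id):
    """Attach the customer name to each order via a keyed lookup instead of a JOIN."""
    return [
        {**order, "customer_name": customers_by_id[order["customer_id"]]["name"]}
        for order in orders
    ]


def denormalise(order, customer):
    """Copy the fields a read needs onto the order, so a read touches one entity."""
    return {**order, "customer_name": customer["name"]}
```

The second approach trades storage and update complexity for cheaper reads: a renamed customer must be rewritten on every order that carries the copy, which is exactly the kind of trade-off between scalability and portability the slide points at.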
  18. 18. Experiment Setup
      Diagram: a Client Testing Application calls three cloud-hosted services (Azure Web Services, Amazon Web Services, Google App Engine) over HTTP using REST/SOAP. Each service publishes a WSDL, and the client is built against a WSDL as well.
      Interface:
        public Result InstantResponse(String value){
            // Echo the received value back to the client
            // Test net response time
        }
        public Result Read(String value){
            // Retrieve data from DB based on the given value
            // Test DB read performance
        }
        public Result Create(String content){
            // Persist given content into DB
            // Test DB write performance
        }
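A client-side harness for an experiment of this shape can be sketched as a small timing wrapper. The three functions below are local stand-ins named after the operations on the slide; in the real setup they would be remote calls, and nothing here reflects the actual test client used in the talk.

```python
import time


def time_call(operation, *args, repeats=5):
    """Call `operation` repeatedly; return (last result, list of elapsed seconds)."""
    timings = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = operation(*args)
        timings.append(time.perf_counter() - start)
    return result, timings


# Hypothetical local stand-ins for the three remote operations.
_db = {}


def instant_response(value):
    """Echo the value back (measures pure round-trip time when remote)."""
    return value


def create(content):
    """Persist content and return its key (measures write performance)."""
    key = len(_db)
    _db[key] = content
    return key


def read(key):
    """Fetch content by key (measures read performance)."""
    return _db.get(key)
```

Subtracting the echo timings from the read and write timings gives a rough split between network latency and datastore time.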
  19. 19. Network Conditions Affect User Experience
  20. 20. Questions to ponder • This is a rather obvious conclusion • My Gmail sometimes tells me “reconnecting in 5 sec...” and it's OK for me! • Is the user base happy enough? • Will our network improve? • The situation is particularly bad for us Aussies... • NBN discussion: is a population of 20 million not enough for vendors to invest? • Is it a matter of just dropping a container here? • Is there a business case for Telstra?
  21. 21. Architect's Checklist 1. Remember the 'Why' 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don't ignore the network layer 7. Performance attributes = application profile + platform characteristics + network latency 8. Plan for monitoring and management 9. Understand interoperability and standards 10. Believe that Cloud Computing is not just for the long tail
  22. 22. Types of Applications
      Application Types
      • Enterprise, Web applications
        – Business apps with a web front end to maximise user reach
      • Highly connected apps
        – Web 2.0, CDN, social networking, sensor networks
      • Data intensive
        – Massively parallel, Hadoop/Map-Reduce
        – Analysis yields potentially surprising results
      • Compute intensive
        – Financial risk calculations
        – Compare to HPC?
      Decision Dimensions
      • Application profile
        – Constraints and requirements on cloud platform, resource models
      • Resource model -> cost
      • Your business model (how you make money out of the app you deploy on the cloud)
        – Saving cost or speeding up versus the ability to connect, build a shared pool of metadata, discover surprising results
  24. 24. Wide Area Distributed Systems – the reality • Scalability seems OK • Relatively constant individual response time despite larger request volume • Availability is more of an issue? • Design for occasional unavailability • Plan for it • Try/catch, retry logic and idempotent operations are all still good!
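The closing point (try/catch, retry logic, idempotent operations) can be made concrete with a short sketch. `TransientError`, `call_with_retry` and `IdempotentStore` are names invented for this example; exponential backoff is one reasonable retry policy among several, not something the slides prescribe.

```python
import time


class TransientError(Exception):
    """Stand-in for a temporary platform failure (e.g. a 500 or a timeout)."""


def call_with_retry(operation, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Run `operation`, retrying on TransientError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise                          # out of attempts: surface the failure
            sleep(base_delay * 2 ** attempt)   # wait 1x, 2x, 4x ... the base delay


class IdempotentStore:
    """Writes are keyed by a caller-supplied request id, so a retried write applies once."""

    def __init__(self):
        self.rows = {}

    def create(self, request_id, content):
        # setdefault leaves an existing row untouched, making the retry harmless.
        self.rows.setdefault(request_id, content)
        return request_id
```

Retries are only safe when the operation is idempotent: if a write succeeded but its response was lost, the retry must not create a second row, which is why the store keys on a request id chosen by the caller.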
  25. 25. Pressure Tests – App Engine
      App Engine Storage Create Error Rate in Pressure Test (1024 Byte)

      Round    Type         1:30   4:30   7:30  10:30  13:30   Average  All Req.  Avg. Rate
      Round 0  DB Err.         0      1      0      0      2       0.6
               Sent Req.     900    857    891    900    900     889.6       900     98.84%
      Round 1  DB Err.         0      4      0      0      0       0.8
               Sent Req.    2699   2134   2242   2700   2700      2495      2700     92.41%
      Round 2  DB Err.         0      0      4      0      8       2.4
               Sent Req.    4500   4180   3873   4500   4032      4217      4500     93.71%
      Round 3  DB Err.         3      0      0      8      3       2.8
               Sent Req.    5403   5173   5681   5792   6065    5622.8      6300     89.25%
      Round 4  DB Err.         0      0      0      6      3       1.8
               Sent Req.    5572   8100   6611   4287   7111    6336.2      8100     78.22%
      Round 5  DB Err.         2      3      0      4      1         2
               Sent Req.    9235   9279   5561   9112   8275    8292.4      9900     83.76%
      Overall  DB Err.         5      8      4     18     17      10.4
               Sent Req.   28309  29723  24859  27291  29083     27853     32400     85.97%
               Err. Rate   0.02%  0.03%  0.02%  0.07%  0.06%     0.04%

      google.appengine.api.datastore_errors:TransactionFailedError : Too much contention on these datastore entities. 500 Server Error
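The slide does not define its derived columns. Assuming that "Err. Rate" is DB errors divided by requests sent, and "Avg. Rate" is the average number of requests sent divided by the planned total ("All Req."), the figures in the App Engine table can be reproduced with the sketch below. Both function names are invented here.

```python
def error_rate_pct(db_errors, sent_requests):
    """DB errors as a percentage of requests actually sent, to two decimals."""
    return round(100.0 * db_errors / sent_requests, 2)


def delivered_pct(avg_sent, planned):
    """Average requests sent as a percentage of the planned request count."""
    return round(100.0 * avg_sent / planned, 2)
```

For instance, the overall 1:30 column has 5 errors in 28309 requests, which gives 0.02%, and Round 0 averages 889.6 sent out of 900 planned, which gives 98.84%. Under these assumed definitions the error rate stays tiny while the delivered percentage falls as load rises, so most of the shortfall comes from requests that were never sent rather than from datastore errors.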
  26. 26. What's happening here? • Throttling? • Denial-of-service attack protection mechanism? • Should end-user developers have access to a configurable parameter for setting such a limit?
  27. 27. Pressure Test – Amazon SimpleDB
      Amazon SimpleDB Create Error Rate in Pressure Test (1024 Byte)

      Round    Type         3:00   6:00   9:00  12:00   Average  All Req.  Avg. Rate
      Round 0  DB Err.         0      0      0      0         0
               Sent Req.     900    898    900    900     899.5       900     99.94%
      Round 1  DB Err.        20     10      9     15      13.5
               Sent Req.    2696   2700   2700   2699   2698.75      2700     99.95%
      Round 2  DB Err.         4      7      7      7      6.25
               Sent Req.    4367   4497   4485   3879      4307      4500     95.71%
      Round 3  DB Err.        17      6      7     13     10.75
               Sent Req.    5740   6193   6226   5795    5988.5      6300     95.06%
      Round 4  DB Err.        13      2      3     13      7.75
               Sent Req.    7081   8005   7896   7106      7522      8100     92.86%
      Round 5  DB Err.        19      9     33     16     19.25
               Sent Req.    8926   9694   7857   8195      8668      9900     87.56%
      Overall  DB Err.        73     34     59     64      57.5
               Sent Req.   29710  31987  30064  28574  30083.75     32400     92.85%
               Err. Rate   0.25%  0.11%  0.20%  0.22%     0.19%

      Amazon SimpleDB are currently unavailable
  29. 29. Monitoring and Management • Could be a lot better! • We had to build a lot of monitoring code on our own • Some cloud system status is available, but no view into your application's health status • Service Level Agreement issues • Existing support caters for techies and developers • Need a dashboard view into business metrics • Real-time view into how the application is running in the cloud • Data points for the commercial conversation with platform vendors • Integration with existing enterprise monitoring capabilities?
  31. 31. Standards and Interoperability • Cloud Computing Interoperability Forum (CCIF), OMG effort, The Open Group, Open Cloud Manifesto... • Are standards THE solution? • Competing standards? Timing? Design by committee? • In fact, do they make sense when cloud platform architectures vary significantly? • Individual services are already surfaced on the internet • Still want to orchestrate services within a long-running workflow, across/from different clouds
  32. 32. Internet Service Bus
      • REST on .NET Service Bus
        – Simple to implement for interop across different languages
        – Fewer overhead packets
      • SOAP on .NET Service Bus
        – Only available for .NET Framework communications at the moment
        – Other languages are not fully supported (Java can only pass Access Control on .NET Service)
        – More overhead packets when communicating between C# and Java than from C# to C#
  33. 33. Architect's Checklist 1. Remember the 'Why' 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don't ignore the network layer 7. Performance attributes = application profile + platform availability + network latency 8. Plan for monitoring and management 9. Understand interoperability and standards 10. Is Cloud Computing just for the long tail?
  34. 34. Impediments to Enterprise Adoption of Cloud • Security, privacy law • Ownership of data, data retention • Portability, fear of vendor lock-in • Migration, integration with existing IT assets • Value for startups does not necessarily apply to enterprises • Cost of initial capital investment is already spent • Pay per use is not necessarily a business benefit
  35. 35. Some Existing Efforts and Solution Patterns • Analyse risk profiles for your application portfolio • Private cloud (trade off economies of scale?) • 'De-value data', 'partitioning', 'segregation' • Enable user choice, 'trust' • Integration/interoperability solutions • Security – lots of technical solutions • Cloud Security Alliance (CSA) for some guidance on security issues • Upcoming research collaboration with SEI CMU/US DoD
  37. 37. Architect's Checklist 1. Remember the 'Why' 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don't ignore the network layer 7. Performance attributes = application profile + platform availability + network latency 8. Plan for monitoring and management 9. Understand interoperability and standards 10. Believe that Cloud Computing is not just for the long tail
  38. 38. An Engineering Analogy... SS Great Britain, I K Brunel
  39. 39. Getting Involved • Collaboration with UNSW • We are recruiting Research Fellows! • Research residential for Architects • Open House Lab • Short term contract research, advisory services • longer term linkage programs (ARC, NICTA, CRC) • Blogs.unsw.edu.au/annaliu
  40. 40. Standing on the shoulders of Giants • UNSW Team • Dr Helen Paik • Mr Liang Zhao • Mr Xiaomin Wu • Mr Fei Teng • Mr Jae Choi • NICTA Team • Dr Jenny Liu, Markus Lachat • Dr Mark Staples • Industry Advisory Team • Mr Kevin Francis (Object Consulting) • Dr Rajiv Ranjan (Smart Service CRC) • Milinda Kotelawele (Longscale)
  41. 41. THANK YOU!
