Architecting Cloud Applications
   - the essential checklist -
                                             Anna Liu
           Associate Professor in Services Engineering
          School of Computer Science and Engineering
                        University of New South Wales
                            annaliu@cse.unsw.edu.au
Architect's Checklist
1.  Remember the 'Why'
2.  Know the platform architecture
3.  Appreciate differences across cloud platforms
4.  Acknowledge auto-scaling is not all magic
5.  Design for eventual consistency
6.  Don't ignore the network layer
7.  Performance attributes = application profile +
    platform availability + network latency
8.  Plan for monitoring and management
9.  Understand interoperability and standards
10. Believe that cloud computing is not just for the long tail
Why Cloud Computing
•   Economies of scale
•   Pay per usage
•   Handling Big Data
•   Service Delivery platform
•   Innovative, engaging user experience
•   Realising Green IT initiatives
Cloud Platform Architecture
(layered diagram, top to bottom; Design and Development Tools and
Monitoring/Management Tools run alongside all layers)
• Cloud Applications
• SaaS
      • Web app, data-intensive, CDNs, social, CRM, etc.
• PaaS
      • Programming runtime, frameworks, application services
      • Storage, compute, Map-Reduce; workflow, Web 2.0, collaboration, mashups
      • Deploy, scheduling, fault management, monitoring, allocation, security
      • Automatic scale, selection, coordination, messaging
      • Data organization techniques, replication, load balancing
• IaaS
      • Virtualisation, resource management, routing
      • Datacentres
Different Platforms with
     Different Target Audiences
• Google App Engine
      • Caters for web applications
      • < 30 sec compute time per request
      • PaaS shields you from lots of infrastructure complexity
• Microsoft Azure
      • More general purpose
      • Optimised for .NET
      • Software-plus-services strategy caters to enterprise scenarios
• Amazon EC2/S3/SimpleDB
      • Virtual compute and storage on demand
      • IaaS provides you with lots of flexibility
      • Third-party innovation on top to enhance the application development
        experience (e.g. Red Hat/JBoss, MySQL, IBM WebSphere, Appistry,
        etc.)
Auto scaling behind the scenes
• Amazon EC2
     • CloudWatch – view into VM instance utilisation details,
       operational performance, disk reads and writes, network
     • Elastic Load Balancer – distributes application traffic across EC2
       instances, controls request load balancing across single or
       multiple cloud sites, performs provisioning-related decisions
       based on dynamic monitoring data reported by CloudWatch
     • Developers specify preconditions, e.g. average CPU utilisation
• Microsoft Azure
     • Azure Fabric Controller (FC) – monitors, maintains and
       provisions machines to host applications
     • Web roles, worker roles, instance-count configuration
       parameters
Auto scaling behind the scenes (continued)
• Google App Engine
    • Handles auto scaling and load balancing of
      application services based on web traffic
    • Request/task execution limited to 30 seconds
    • Moved from Tomcat to Jetty to reduce memory
      footprint (no need for session handler)
    • Fault tolerance and persistence of stored data
      through distributed replication
    • GAE serves static web content, hence no
      additional implementation is needed to handle checkpointing
      and replication to re-instantiate the execution state of
      processes
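The "developers specify preconditions" bullet can be made concrete with a small threshold rule. This is a minimal sketch only: the `ScalingPolicy` class, its thresholds and the one-instance step are illustrative assumptions, not the API or behaviour of any of the platforms above.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy: scale on average CPU utilisation (percent)."""
    scale_out_above: float = 70.0
    scale_in_below: float = 30.0
    min_instances: int = 1
    max_instances: int = 10


def desired_instances(current, cpu_samples, policy):
    """Return the instance count the policy asks for, given recent CPU samples."""
    if not cpu_samples:
        return current  # no monitoring data: change nothing
    average = sum(cpu_samples) / len(cpu_samples)
    if average > policy.scale_out_above:
        target = current + 1
    elif average < policy.scale_in_below:
        target = current - 1
    else:
        target = current
    # Clamp to the configured bounds.
    return max(policy.min_instances, min(policy.max_instances, target))


policy = ScalingPolicy()
print(desired_instances(2, [85.0, 90.0, 78.0], policy))  # busy: asks for 3
print(desired_instances(2, [10.0, 15.0, 20.0], policy))  # idle: asks for 1
print(desired_instances(1, [5.0], policy))               # at the minimum: stays 1
```

Even this toy rule shows why auto-scaling is not magic: someone has to choose the metric, the thresholds and the bounds.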
ACID no more?
“Eventual Consistency
Amazon SimpleDB keeps multiple copies of each domain.
When data is written or updated (using PutAttributes,
DeleteAttributes, CreateDomain or DeleteDomain) and
Success is returned, all copies of the data are updated.
However, it takes time for the update to propagate to all
storage locations. The data will eventually be consistent, but
an immediate read might not show the change.
Consistency is usually reached within seconds, but a high
system load or network partition might increase this time.
Repeating a read after a short time should return the updated
data.”
- Amazon Developer Guide, 2007-11-07
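The "repeat the read after a short time" advice in the quote can be sketched against a toy store. Everything here is an assumption made for illustration: `EventuallyConsistentStore` is an in-memory stand-in with an artificial lag, not the SimpleDB client API.

```python
import time


class EventuallyConsistentStore:
    """Toy stand-in: a write becomes visible to reads only after `lag` seconds."""

    def __init__(self, lag):
        self.lag = lag
        self._visible = {}
        self._pending = []  # (visible_at, key, value)

    def put(self, key, value):
        self._pending.append((time.monotonic() + self.lag, key, value))

    def get(self, key):
        now = time.monotonic()
        still_pending = []
        for visible_at, k, v in self._pending:
            if visible_at <= now:
                self._visible[k] = v  # the update has "propagated"
            else:
                still_pending.append((visible_at, k, v))
        self._pending = still_pending
        return self._visible.get(key)


def read_until(store, key, expected, attempts=10, delay=0.05):
    """Repeat the read until it shows `expected`; return reads used, or None."""
    for attempt in range(1, attempts + 1):
        if store.get(key) == expected:
            return attempt
        time.sleep(delay)
    return None


store = EventuallyConsistentStore(lag=0.1)
store.put("cart", "book")
first_read = store.get("cart")               # immediate read: change not visible yet
reads_used = read_until(store, "cart", "book")
print(first_read, reads_used)
```

The bounded number of attempts matters: under high load the window can grow, so the caller needs a plan for the case where `read_until` gives up.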
CAP Theorem
• Three properties of shared-data systems
    • Consistency: once an update is made, all observers
      see it
    • Availability: all requests should be
      processed accurately and promptly
    • Partition tolerance: the system tolerates network partitions
• CAP Theorem
    • Only two of the three properties can be guaranteed at any one time
    • Network partitions are a given in distributed systems
    • So you have to pick between consistency and
      availability
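The choice between consistency and availability during a partition can be illustrated with two in-memory replicas. This is a deliberately simplified sketch: `TwoReplicaStore` and its `prefer_consistency` flag are hypothetical names, and real systems offer many points between these two extremes.

```python
class Unavailable(Exception):
    """Raised when the consistency-first store refuses a request."""


class TwoReplicaStore:
    def __init__(self, prefer_consistency):
        self.prefer_consistency = prefer_consistency
        self.partitioned = False
        self.replicas = [{}, {}]

    def write(self, replica, key, value):
        if self.partitioned:
            if self.prefer_consistency:
                # Consistency first: refuse rather than let the copies diverge.
                raise Unavailable("cannot reach the other replica")
            # Availability first: accept locally; the copies now differ.
            self.replicas[replica][key] = value
        else:
            for copy in self.replicas:
                copy[key] = value

    def read(self, replica, key):
        return self.replicas[replica].get(key)


cp = TwoReplicaStore(prefer_consistency=True)
cp.write(0, "x", 1)
cp.partitioned = True
try:
    cp.write(0, "x", 2)
    cp_result = "accepted"
except Unavailable:
    cp_result = "refused"

ap = TwoReplicaStore(prefer_consistency=False)
ap.write(0, "x", 1)
ap.partitioned = True
ap.write(0, "x", 2)
print(cp_result, ap.read(0, "x"), ap.read(1, "x"))  # refused 2 1
```

The consistency-first store stays correct but turns requests away; the availability-first store keeps answering, and replica 1 serves a stale value until the partition heals.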
Relational no more?
• Google App Engine's datastore:
      •   Select can be performed on one table only
      •   Intentionally does not support Join
      •   Joins would be inefficient when queries span machines
      •   Allows disks to fail without the system failing
      •   Existing enterprise relational DBs cannot easily be ported over

• Microsoft Azure:
      • Retiring the previous SSDS (no transactional support then)
      • Azure SQL Services to replace SSDS with relational features and Tx

• Amazon
      • S3 for big storage scenario
      • Have your own relational DB in the cloud!
      • Interesting to investigate failover/scalability features here...
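Without Join, the join moves into application code. The sketch below assumes two hypothetical entity kinds held in plain Python lists and a single-kind `query` helper; it is not the App Engine datastore API, only the shape of the workaround.

```python
from collections import defaultdict

# Two hypothetical entity kinds, each queryable only on its own.
customers = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
]
orders = [
    {"id": 10, "customer_id": 1, "total": 40},
    {"id": 11, "customer_id": 2, "total": 15},
    {"id": 12, "customer_id": 1, "total": 25},
]


def query(kind, **equals):
    """Single-kind equality filter: the only query shape this sketch allows."""
    return [row for row in kind if all(row[f] == v for f, v in equals.items())]


def order_totals_by_customer_name():
    """Join in application code: one query per kind, then merge in memory."""
    names = {c["id"]: c["name"] for c in query(customers)}
    joined = defaultdict(list)
    for order in query(orders):
        joined[names[order["customer_id"]]].append(order["total"])
    return dict(joined)


print(order_totals_by_customer_name())  # {'Ada': [40, 25], 'Grace': [15]}
```

The other common route is to denormalise, for example by copying the customer name onto each order at write time, which trades storage and update cost for single-kind reads.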
What does this mean?
• Data reorganisation/restructuring required
• Understand the design trade-offs
  (scalability versus portability/interoperability at
  the data layer)
• Shopping carts and reference data vs transactional
  data/updates; ACID vs BASE
• Data portability might be tough for a while
• I'm revising my University lecture notes! So you
  had better re-architect your app and data!
Experiment Setup
• Client Testing Application (WSDL) calls three hosted test services
  over HTTP, using SOAP/REST:
      • Amazon Web Services (WSDL)
      • Google App Engine (WSDL)
      • Azure Web Services (WSDL)
• Interface:

      public Result InstantResponse(String value) {
        // Echo the received value back to the client
        // Test network response time
      }
      public Result Read(String value) {
        // Retrieve data from DB based on the given value
        // Test DB read performance
      }
      public Result Create(String content) {
        // Persist given content into DB
        // Test DB write performance
      }
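A client-side timing harness for the three operations might look like the sketch below. It is an assumption-laden illustration: `StubService` is a local in-memory stand-in so the example runs without any network, and `time_calls` is a hypothetical helper, not the team's actual test client. The 1024-byte payload matches the size quoted in the pressure test tables.

```python
import time
from statistics import mean


class StubService:
    """Local stand-in for one deployed test service (no network involved)."""

    def __init__(self):
        self._db = {}

    def instant_response(self, value):
        return value  # echo back to the caller

    def create(self, content):
        key = str(len(self._db))
        self._db[key] = content
        return key

    def read(self, key):
        return self._db.get(key)


def time_calls(operation, argument, repeats=100):
    """Call `operation(argument)` repeatedly; return mean latency in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation(argument)
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


service = StubService()
payload = "x" * 1024
results = {
    "InstantResponse": time_calls(service.instant_response, payload),
    "Create": time_calls(service.create, payload),
    "Read": time_calls(service.read, "0"),
}
for name, millis in results.items():
    print(f"{name}: {millis:.4f} ms")
```

Against real services, the gap between InstantResponse and Read/Create separates network time from storage time, which is the point of having an echo operation in the interface.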
Network Conditions Affect User
         Experience
Questions to ponder
• This is a rather obvious conclusion
• My Gmail sometimes tells me
      “reconnecting in 5 sec...” and it's OK for me!
• Is the user base happy enough?
• Will our network improve?
• Situation particularly bad for us Aussies...
      • NBN discussion: is a population of 20 million not enough for vendors
        to invest?
      • Is it a matter of just dropping a container here?
      • Is there a business case for Telstra?
Types of Applications
Application Types
• Enterprise, Web applications
       • business apps with web front end to maximise user reach
• Highly connected apps
       • Web 2.0, CDN, social networking, sensor network
• Data intensive
       • massively parallel, Hadoop/Map-Reduce
       • Analysis yields potentially surprising results
• Compute intensive
       • Financial risk calculations
       • Compare to HPC?
Decision Dimensions
• Application profile
       • Constraints and requirements on cloud platform, resource models
• Resource model -> cost
       • Your business model (how you make money out of the app you
         deploy on the cloud)
       • saving cost or speed up versus ability to connect, build shared
         pool of meta-data, discover surprising results
Wide Area Distributed Systems
        – the reality
• Scalability seems ok
     • Relatively constant individual response time
       despite larger request volume
• Availability is more of an issue?
     • Design for occasional unavailability
     • Plan for it
     • Try/catch, retry logic and idempotent operations are
       all still good!
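Retry logic and idempotent operations work as a pair, as the sketch below tries to show. The names are assumptions for illustration: `TransientError`, `with_retry` and `FlakyStore` are hypothetical, and the store simulates the awkward case where a write is applied but its response is lost.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def with_retry(operation, attempts=4, base_delay=0.01):
    """Run `operation`; on TransientError wait (doubling each time) and retry."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure
            time.sleep(base_delay * 2 ** attempt)


class FlakyStore:
    """Toy store whose first `failures` writes fail after being applied."""

    def __init__(self, failures):
        self.failures = failures
        self.rows = {}

    def put(self, request_id, content):
        # Keyed by a caller-supplied id, so replaying the same write is harmless.
        self.rows[request_id] = content
        if self.failures > 0:
            self.failures -= 1
            raise TransientError("response lost")
        return request_id


store = FlakyStore(failures=2)
with_retry(lambda: store.put("req-1", "hello"))
print(store.rows)  # {'req-1': 'hello'}
```

The write was sent three times yet left one row. Had `put` generated a fresh key per call, the same retries would have created duplicates, which is why retry is only safe on idempotent operations.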
Pressure Tests – App Engine
App Engine Storage Create Error Rate in Pressure Test (1024 Byte)
Round    Type          1:30    4:30    7:30   10:30   13:30  Average  All Req.  Avg. Rate
Round 0  DB Err.          0       1       0       0       2      0.6
         Sent Req.      900     857     891     900     900    889.6       900     98.84%
Round 1  DB Err.          0       4       0       0       0      0.8
         Sent Req.     2699    2134    2242    2700    2700     2495      2700     92.41%
Round 2  DB Err.          0       0       4       0       8      2.4
         Sent Req.     4500    4180    3873    4500    4032     4217      4500     93.71%
Round 3  DB Err.          3       0       0       8       3      2.8
         Sent Req.     5403    5173    5681    5792    6065   5622.8      6300     89.25%
Round 4  DB Err.          0       0       0       6       3      1.8
         Sent Req.     5572    8100    6611    4287    7111   6336.2      8100     78.22%
Round 5  DB Err.          2       3       0       4       1        2
         Sent Req.     9235    9279    5561    9112    8275   8292.4      9900     83.76%
Overall  DB Err.          5       8       4      18      17     10.4
         Sent Req.    28309   29723   24859   27291   29083    27853     32400     85.97%
         Err. Rate    0.02%   0.03%   0.02%   0.07%   0.06%    0.04%
Errors returned:
• google.appengine.api.datastore_errors:TransactionFailedError :
  Too much contention on these datastore entities.
• 500 Server Error
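The derived columns in the table can be reproduced from the raw counts. The reading of the columns is an assumption inferred from the numbers: Avg. Rate appears to be the average number of requests actually sent divided by the planned total (All Req.), and Err. Rate appears to be DB errors divided by requests sent.

```python
def percent(numerator, denominator):
    """Ratio as a percentage, rounded to two decimals as in the table."""
    return round(100.0 * numerator / denominator, 2)


# Round 0, sent requests per run (1:30, 4:30, 7:30, 10:30, 13:30).
round0_sent = [900, 857, 891, 900, 900]
round0_planned = 900

average_sent = sum(round0_sent) / len(round0_sent)
avg_rate = percent(average_sent, round0_planned)

# Overall row for the 1:30 run: 5 DB errors out of 28309 requests sent.
err_rate = percent(5, 28309)

print(average_sent, avg_rate, err_rate)  # 889.6 98.84 0.02
```

Read this way, the datastore error rate stays tiny while the share of planned requests that actually get sent drops as load rises.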
What's happening here?
• Throttling?
• Denial-of-service attack protection
  mechanism?
• Should end-user developers have access
  to a configurable parameter for setting such
  a limit?
Pressure Test – Amazon SimpleDB
Amazon SimpleDB Create Error Rate in Pressure Test (1024 Byte)
Round    Type          3:00    6:00    9:00   12:00   Average  All Req.  Avg. Rate
Round 0  DB Err.          0       0       0       0         0
         Sent Req.      900     898     900     900     899.5       900     99.94%
Round 1  DB Err.         20      10       9      15      13.5
         Sent Req.     2696    2700    2700    2699   2698.75      2700     99.95%
Round 2  DB Err.          4       7       7       7      6.25
         Sent Req.     4367    4497    4485    3879      4307      4500     95.71%
Round 3  DB Err.         17       6       7      13     10.75
         Sent Req.     5740    6193    6226    5795    5988.5      6300     95.06%
Round 4  DB Err.         13       2       3      13      7.75
         Sent Req.     7081    8005    7896    7106      7522      8100     92.86%
Round 5  DB Err.         19       9      33      16     19.25
         Sent Req.     8926    9694    7857    8195      8668      9900     87.56%
Overall  DB Err.         73      34      59      64      57.5
         Sent Req.    29710   31987   30064   28574  30083.75     32400     92.85%
         Err. Rate    0.25%   0.11%   0.20%   0.22%     0.19%
Error returned: Amazon SimpleDB are currently unavailable
Monitoring and Management
• Could be a lot better!
      • We had to build a lot of monitoring code on our own
      • Some cloud system status is available, but no view into your application's
        health status
• Service Level Agreement issues
      •   Existing support caters for techies and developers
      •   Need a dashboard view into business metrics
      •   Need a real-time view into how the application is running in the cloud
      •   Need data points to have the commercial conversation with platform vendors
• Integration with existing enterprise monitoring capabilities?
Standards and Interoperability
• Cloud Computing Interoperability Forum
  (CCIF), OMG effort, The Open Group,
  Open Cloud Manifesto...
• Are standards THE solution?
    • Competing standards? Timing? Design by
      committee?
    • In fact, do they make sense when cloud platform
      architectures vary significantly?
    • Individual services are already surfaced on the internet
    • Still want to orchestrate services within a long-
      running workflow, across/from different clouds
Internet Service Bus
•   REST on .NET Service Bus
    – Simple to implement for interop across different languages
    – Fewer overhead packages
•   SOAP on .NET Service Bus
    – Only available for .NET Framework communications at the moment
    – Other languages are not fully supported (Java can only
      pass Access Control on .NET Service)
    – More overhead packages when communicating between C#
      and Java than C# to C#
Impediments to Enterprise
           Adoption of Cloud
•   Security, privacy law
•   Ownership of data, data retention
•   Portability, fear of vendor lock-in
•   Migration, integration with existing IT assets
•   Value for startups does not necessarily apply to
    enterprises
       • Initial capital investment has already been spent
       • Pay per use is not necessarily a business benefit
Some Existing Efforts and
        Solution Patterns
• Analyse risk profiles for your application portfolio
• Private cloud (trade off economies of scale?)
• 'De-value data', 'partitioning', 'segregation'
• Enable user choice, 'trust'
• Integration/interoperability solutions
• Security – lots of technical solutions
• Cloud Security Alliance (CSA) for some guidance on
  security issues
• Upcoming Research Collaboration with SEI CMU/US
  DoD
An Engineering Analogy...
SS Great Britain, I K Brunel
Getting Involved
• Collaboration with UNSW
    •   We are recruiting Research Fellows!
    •   Research residential for Architects
    •   Open House Lab
    •   Short term contract research, advisory services
    •   longer term linkage programs (ARC, NICTA, CRC)
    •   Blogs.unsw.edu.au/annaliu
Standing on the shoulders of
            Giants
• UNSW Team
     •   Dr Helen Paik
     •   Mr Liang Zhao
     •   Mr Xiaomin Wu
     •   Mr Fei Teng
     •   Mr Jae Choi
• NICTA Team
     • Dr Jenny Liu, Markus Lachat
     • Dr Mark Staples
• Industry Advisory Team
     • Mr Kevin Francis (Object Consulting)
     • Dr Rajiv Ranjan (Smart Service CRC)
     • Milinda Kotelawele (Longscale)
THANK YOU!

Architecting Cloud Applications - the essential checklist

  • 1.
    Architecting Cloud Applications - the essential checklist - Anna Liu Associate Professor in Services Engineering School of Computer Science and Engineering University of New South Wales annaliu@cse.unsw.edu.au
  • 2.
    Architect‟s Checklist 1. Remember the „Why‟ 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don‟t ignore the network layer 7. Performance attributes = application profile + platform availability + network latency 8. Plan for Monitoring and management 9. Understand Interoperability and standards 10. Believe in Cloud Computing is not just for the longtail
  • 3.
    Architect‟s Checklist 1. Remember the „Why‟ 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don‟t ignore the network layer 7. Performance attributes = application profile + platform availability + network latency 8. Plan for Monitoring and management 9. Understand Interoperability and standards 10. Believe in Cloud Computing is not just for the longtail
  • 4.
    Why Cloud Computing • Economies of scale • Pay per usage • Handling Big Data • Service Delivery platform • Innovative, engaging user experience • Realising Green IT initiatives
  • 5.
    Architect‟s Checklist 1. Remember the „Why‟ 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don‟t ignore the network layer 7. Performance attributes = application profile + platform availability + network latency 8. Plan for Monitoring and management 9. Understand Interoperability and standards 10. Believe in Cloud Computing is not just for the longtail
  • 6.
    Cloud Platform Architecture Cloud Applications Monitoring/Management Tools Development Tools Design and SaaS web app, data-intensive, CDNs, Social, CRM, etc Programming runtime, frameworks, application services Storage, compute, Map-Reduce; workflow, Web 2.0, collaboration, mashups Deploy, Scheduling, Fault-Management, Monitoring, Allocation, Security PaaS Automatic scale, Selection, Coordination, Messaging Data organization techniques, Replication, Load balancing Virtualisation, Resource Management, Routing IaaS Datacentres Datacentres
  • 7.
    Architect‟s Checklist 1. Remember the „Why‟ 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don‟t ignore the network layer 7. Performance attributes = application profile + platform availability + network latency 8. Plan for Monitoring and management 9. Understand Interoperability and standards 10. Believe in Cloud Computing is not just for the longtail
  • 8.
    Different Platforms with Different Target Audience • Google App Engine • Caters for web applications • < 30 sec compute time • PaaS shields you from lots of infrastructure complexity • Microsoft Azure • More general purpose • optimised for .NET • software plus services strategy caters to enterprise scenarios • Amazon EC2/S3/SimpleDB • Virtual compute, storage on demand, • IaaS provides you with lots of flexibility • Third party innovation on top to enhance application development experience (eg. Red Hat/JBoss, MySQL, IBM Websphere, Appistry etc)
  • 9.
    Architect‟s Checklist 1. Remember the „Why‟ 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don‟t ignore the network layer 7. Performance attributes = application profile + platform availability + network latency 8. Plan for Monitoring and management 9. Understand Interoperability and standards 10. Believe in Cloud Computing is not just for the longtail
  • 10.
    Auto scaling behindthe scene • Amazon EC2 • CloudWatch – view into VM instance server utilization details, operational performance, disk reads and writes, network • Elastic Load Balancer – distributes apps across EC2 instances, control request load-balancing across single or multiple cloud sites, performs provisioning-related decisions based on dynamic monitoring data reported by CloudWatch • developers specify preconditions eg. average CPU utilisation • Microsoft Azure • Azure Fabric Controller (FC) – monitors, maintains and provisions machines to host applications • Web role, worker roles, instance number configurations parameters
  • 11.
    Auto scaling behindthe scene • Google App Engine • Handles auto scaling and load balancing of application services based on web traffic • requests/task execution limited to 30 seconds • Moved from Tomcat to Jetty to reduce memory footprint (no need for session handler) • Fault tolerance and persistence of stored data through distributed replication • GAE serves static web content, hence no additional implementation to handle checkpointing and replication to re-instantiate execution state of processes
  • 12.
    Architect‟s Checklist 1. Remember the „Why‟ 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don‟t ignore the network layer 7. Performance attributes = application profile + platform availability + network latency 8. Plan for Monitoring and management 9. Understand Interoperability and standards 10. Believe in Cloud Computing is not just for the longtail
  • 13.
    ACID no more? “EventualConsistency Amazon SimpleDB keeps multiple copies of each domain. When data is written or updated (using PutAttributes, DeleteAttributes, CreateDomain or DeleteDomain) and Success is returned, all copies of the data are updated. However, it takes time for the update to propagate to all storage locations. The data will eventually be consistent, but an immediate read might not show the change. Consistency is usually reached within seconds, but a high system load or network partition might increase this time. Repeating a read after a short time should return the updated data. “ - Amazon Developer Guide, 2007-11-07
  • 14.
    CAP Theorem • Threeproperties of shared-data systems • Consistency: one update is made, all observers are updated • Availability: all database transactions should be processed accurately and promptly • Tolerance: tolerant to network Partitions • CAP Theorem • Only two properties can be achieved at any time • Network partitions is given in distribute systems • Have to pick one between consistency and availability
  • 15.
    Relational no more? •Google App Engine‟s datastore: • Select can be performed on one table only • Intentionlly does not support Join • Inefficient when queries span across machines • Allows disks to fail without system failing • Cannot easily port over existing enterprise relational DB • Microsoft Azure: • Retiring the previous SSDS (no transactional support then) • Azure SQL Services to replace SSDS with relational features and Tx • Amazon • S3 for big storage scenario • Have your own relational DB in the cloud! • Interesting to investigate failover/scalability features here...
  • 16.
    What does thismean? • Data reorganisation/restructuring required • Understand trade offs between design (scalability versus portability/interoperability at data layer) • Shopping carts, reference data, vs transactional data/updates, ACID vs BASE • Data portability might be tough for a while • I‟m revising my University lecture notes! So you better re-architect your app and data!
  • 17.
    Architect‟s Checklist 1. Remember the „Why‟ 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don‟t ignore the network layer 7. Performance attributes = application profile + platform availability + network latency 8. Plan for Monitoring and management 9. Understand Interoperability and standards 10. Believe in Cloud Computing is not just for the longtail
  • 18.
    Experiment Setup Azure Web Amazon Web Google App Services Services WSDL WSDL WSDL Interface : HTTP public Result InstantResponse(String value){ ST T RE ES // Echo the receiving value back to client /R P/ AP // Test net response time A SO SO } public Result Read(String value){ // Retrieve data from DB based on the given value WSDL // Test DB read performance } public Result Create(String content){ // Persist given content into DB Client Testing Application // Test DB write performance }
  • 19.
  • 20.
    Questions to ponderabout • This is a rather obvious conclusion • My gmail sometimes tells me “reconnecting in 5 sec...” and it‟s ok for me! • Are the user base happy enough? • Will our network improve? • Situation particular bad for us Aussies... • NBN discussion, population of 20mil not enough for vendors to invest? • Is it a matter of just dropping a container here? • Is there a business case for Telstra?
  • 21.
    Architect‟s Checklist 1. Remember the „Why‟ 2. Know the platform architecture 3. Appreciate differences across cloud platforms 4. Acknowledge auto-scaling is not all magic 5. Design for eventual consistency 6. Don‟t ignore the network layer 7. Performance attributes = application profile + platform characteristics + network latency 8. Plan for Monitoring and management 9. Understand Interoperability and standards 10. Believe in Cloud Computing is not just for the longtail
  • 22.
    Types of Applications
    Application Types
    • Enterprise, Web applications
      • business apps with web front end to maximise user reach
    • Highly connected apps
      • Web 2.0, CDN, social networking, sensor network
    • Data intensive
      • massively parallel, Hadoop/Map-Reduce
      • Analysis yields potentially surprising results
    • Compute Intensive
      • Financial risk calculations
      • Compare to HPC?
    Decision Dimensions
    • Application profile
    • Constraints and requirements on cloud platform, resource models
    • Resource model -> cost
    • Your business model (how you make money out of the app you deploy on the cloud)
    • saving cost or speed up versus ability to connect, build shared pool of metadata, discover surprising results
  • 24.
    Wide Area Distributed Systems – the reality
    • Scalability seems ok
    • Relatively constant individual response time despite larger request volume
    • Availability is more of an issue?
    • Design for occasional unavailability
    • Plan for it
    • Try/catch, retry logic, idempotent operations are all still good!
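The try/catch, retry and idempotency advice above can be sketched as below. It is a minimal illustration under stated assumptions: `Retry`, `IdempotentStore`, the request key and the doubling pause are all hypothetical choices, not part of any cloud SDK. The store applies a write and then "loses" the reply, which is the case where a blind retry would otherwise duplicate data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical retry helper: retries on any RuntimeException with a doubling pause.
class Retry {
    static <T> T withRetry(Supplier<T> call, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delay);
                    } catch (InterruptedException interrupted) {
                        Thread.currentThread().interrupt();
                        throw e; // stop retrying if the thread is interrupted
                    }
                    delay *= 2;
                }
            }
        }
        throw last; // every attempt failed
    }
}

// Hypothetical in-memory store standing in for a cloud datastore.
// create() is idempotent: repeating a request with the same key has no extra effect.
class IdempotentStore {
    private final Map<String, String> rows = new HashMap<>();
    int lostResponses = 0; // number of replies to "lose" after the write is applied

    String create(String requestKey, String content) {
        rows.putIfAbsent(requestKey, content); // the write happens first
        if (lostResponses > 0) {               // then the reply is lost
            lostResponses--;
            throw new IllegalStateException("service temporarily unavailable");
        }
        return rows.get(requestKey);
    }

    int size() {
        return rows.size();
    }
}
```

Calling `Retry.withRetry(() -> store.create("req-1", "hello"), 5, 10L)` against a store with two lost responses still leaves exactly one row, because the retried write reuses the same key.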
  • 25.
    Pressure Tests – App Engine
    App Engine Storage: Create Error Rate in Pressure Test (1024 Byte)

    Round    Type         1:30   4:30   7:30  10:30  13:30   Average  All Req.  Avg. Rate
    Round 0  DB Err.         0      1      0      0      2       0.6
             Sent Req.     900    857    891    900    900     889.6       900     98.84%
    Round 1  DB Err.         0      4      0      0      0       0.8
             Sent Req.    2699   2134   2242   2700   2700      2495      2700     92.41%
    Round 2  DB Err.         0      0      4      0      8       2.4
             Sent Req.    4500   4180   3873   4500   4032      4217      4500     93.71%
    Round 3  DB Err.         3      0      0      8      3       2.8
             Sent Req.    5403   5173   5681   5792   6065    5622.8      6300     89.25%
    Round 4  DB Err.         0      0      0      6      3       1.8
             Sent Req.    5572   8100   6611   4287   7111    6336.2      8100     78.22%
    Round 5  DB Err.         2      3      0      4      1         2
             Sent Req.    9235   9279   5561   9112   8275    8292.4      9900     83.76%
    Overall  DB Err.         5      8      4     18     17      10.4
             Sent Req.   28309  29723  24859  27291  29083     27853     32400     85.97%
             Err. Rate   0.02%  0.03%  0.02%  0.07%  0.06%     0.04%

    Errors observed:
    google.appengine.api.datastore_errors:TransactionFailedError : Too much contention on these datastore entities.
    500 Server Error
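The derived columns appear to follow simple arithmetic: "Average" is the mean across the five runs, "Avg. Rate" is the average number of requests sent divided by the number intended ("All Req."), and "Err. Rate" is errors divided by requests sent. That reading is inferred from the figures rather than stated on the slide. A small check using the Round 0 and Overall rows (`PressureStats` is a hypothetical helper name):

```java
import java.util.Locale;

// Hypothetical helper that recomputes the derived columns of the pressure test table.
class PressureStats {

    /** Arithmetic mean of the per-run counts. */
    static double mean(int[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    /** `part` as a percentage of `whole`. */
    static double percent(double part, double whole) {
        return 100.0 * part / whole;
    }

    public static void main(String[] args) {
        // Round 0: requests actually sent in each of the five runs, out of 900 intended.
        int[] round0Sent = {900, 857, 891, 900, 900};
        double avgSent = mean(round0Sent); // 889.6
        System.out.println(String.format(Locale.ROOT, "%.2f%%", percent(avgSent, 900))); // 98.84%

        // Overall row: DB errors and requests sent per run.
        int[] overallErrors = {5, 8, 4, 18, 17};
        int[] overallSent = {28309, 29723, 24859, 27291, 29083};
        System.out.println(String.format(Locale.ROOT, "%.2f%%",
                percent(mean(overallErrors), mean(overallSent)))); // 0.04%
    }
}
```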
  • 26.
    What's happening here?
    • Throttling?
    • Denial of service attack protection mechanism?
    • Should end-user developers have access to a configurable parameter for setting such a limit?
  • 27.
    Pressure Test – Amazon SimpleDB
    Amazon SimpleDB: Create Error Rate in Pressure Test (1024 Byte)

    Round    Type         3:00   6:00   9:00  12:00   Average  All Req.  Avg. Rate
    Round 0  DB Err.         0      0      0      0         0
             Sent Req.     900    898    900    900     899.5       900     99.94%
    Round 1  DB Err.        20     10      9     15      13.5
             Sent Req.    2696   2700   2700   2699   2698.75      2700     99.95%
    Round 2  DB Err.         4      7      7      7      6.25
             Sent Req.    4367   4497   4485   3879      4307      4500     95.71%
    Round 3  DB Err.        17      6      7     13     10.75
             Sent Req.    5740   6193   6226   5795    5988.5      6300     95.06%
    Round 4  DB Err.        13      2      3     13      7.75
             Sent Req.    7081   8005   7896   7106      7522      8100     92.86%
    Round 5  DB Err.        19      9     33     16     19.25
             Sent Req.    8926   9694   7857   8195      8668      9900     87.56%
    Overall  DB Err.        73     34     59     64      57.5
             Sent Req.   29710  31987  30064  28574  30083.75     32400     92.85%
             Err. Rate   0.25%  0.11%  0.20%  0.22%     0.19%

    Error observed:
    Amazon SimpleDB are currently unavailable
  • 35.
    Monitoring and Management
    • Could be a lot better!
    • We had to build a lot of monitoring code on our own
    • Some cloud system status available, but no view into your application health status
    • Service Level Agreement issues
    • Existing support caters for techies, developers
    • Need dashboard view into business metrics
    • Real-time view into how the application is running in the cloud
    • Data points to support the commercial conversation with platform vendors
    • Integration with existing enterprise monitoring capabilities?
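As a rough idea of the kind of monitoring code an application team ends up writing themselves, here is a minimal in-process health recorder. It is a sketch only: `AppHealth` and its methods are hypothetical names, and a real deployment would export these figures to a dashboard or an existing enterprise monitoring tool.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical application-level health recorder; thread-safe counters only.
class AppHealth {
    private final AtomicLong requests = new AtomicLong();
    private final AtomicLong errors = new AtomicLong();
    private final AtomicLong totalLatencyMillis = new AtomicLong();

    /** Records one completed request with its latency and outcome. */
    void record(long latencyMillis, boolean failed) {
        requests.incrementAndGet();
        totalLatencyMillis.addAndGet(latencyMillis);
        if (failed) {
            errors.incrementAndGet();
        }
    }

    /** Failed requests as a percentage of all recorded requests. */
    double errorRatePercent() {
        long n = requests.get();
        return n == 0 ? 0.0 : 100.0 * errors.get() / n;
    }

    /** Mean latency in milliseconds over all recorded requests. */
    double meanLatencyMillis() {
        long n = requests.get();
        return n == 0 ? 0.0 : (double) totalLatencyMillis.get() / n;
    }

    /** One-line summary suitable for a log line or a simple dashboard feed. */
    String summary() {
        return String.format(Locale.ROOT, "requests=%d errorRate=%.2f%% meanLatencyMs=%.1f",
                requests.get(), errorRatePercent(), meanLatencyMillis());
    }
}
```

Figures like these, collected from inside the application, are the sort of data points that support a conversation with a platform vendor about service levels.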
  • 37.
    Standards and Interoperability
    • Cloud Computing Interoperability Forum (CCIF), OMG effort, The Open Group, Open Cloud Manifesto...
    • Are standards THE solution?
    • Competing standards? Timing? Design by committee?
    • In fact, does it make sense when cloud platform architecture varies significantly?
    • Individual services already surfaced on the internet
    • Still want to orchestrate services within a long-running workflow, across/from different clouds
  • 38.
    Internet Service Bus
    • REST on .NET Service Bus
      – Simple to implement for interop across different languages
      – Less overhead packages
    • SOAP on .NET Service Bus
      – Only available for .NET Framework communications at the moment
      – Other languages are not fully supported (Java can only pass Access Control on .NET Service)
      – More overhead packages when communicating between C# and Java than C# to C#
  • 40.
    Impedance to Enterprise Adoption of Cloud
    • Security, privacy law
    • Ownership of data, data retention
    • Portability, fear of vendor lock-in
    • Migration, integration with existing IT assets
    • Value for startups does not necessarily apply to the enterprise
    • Cost of initial capital investment is already spent
    • Pay per use is not necessarily a business benefit
  • 41.
    Some Existing Efforts and Solution Patterns
    • Analyse risk profiles for your application portfolio
    • Private cloud (trade off economies of scale?)
    • 'de-value data', 'partitioning', 'segregation'
    • Enable user choice, 'trust'
    • Integration/interoperability solutions
    • Security – lots of technical solutions
    • Cloud Security Alliance (CSA) for some guidance on security issues
    • Upcoming Research Collaboration with SEI CMU/US DoD
  • 44.
    An Engineering Analogy...
    SS Great Britain, I K Brunel
  • 45.
    Getting Involved
    • Collaboration with UNSW
    • We are recruiting Research Fellows!
    • Research residential for Architects
    • Open House Lab
    • Short term contract research, advisory services
    • Longer term linkage programs (ARC, NICTA, CRC)
    • Blogs.unsw.edu.au/annaliu
  • 46.
    Standing on the shoulders of Giants
    • UNSW Team
      • Dr Helen Paik
      • Mr Liang Zhao
      • Mr Xiaomin Wu
      • Mr Fei Teng
      • Mr Jae Choi
    • NICTA Team
      • Dr Jenny Liu, Markus Lachat
      • Dr Mark Staples
    • Industry Advisory Team
      • Mr Kevin Francis (Object Consulting)
      • Dr Rajiv Ranjan (Smart Service CRC)
      • Milinda Kotelawele (Longscale)
  • 47.