Managing Performance in the Cloud

How do you manage performance in the cloud, in particular in "Platform as a Service" (PaaS) environments like Windows Azure or Heroku, where you don't have a "virtual machine" to manage?

Even in "Infrastructure as a Service" (IaaS) environments like Amazon EC2 there are limits on the tools you can deploy to assist with performance management, troubleshooting, etc. (e.g. you can't deploy promiscuous-mode network sniffing tools in EC2).

James Smith from Adactus will give us an overview of Cloud Services as a whole, and then drill down into some of the issues they have experienced in deploying their "Pulse" Claims Management Solution into the Azure cloud (http://www.pulseclaims.com/home).

Beyond just looking at page speed performance, he'll talk about the challenges involved in managing SLAs, cloud "support" (or lack of it!), performance troubleshooting and the whole "performance lifecycle".

Published in: Technology, Business
Notes
  • The cloud is the perfect (the natural) environment for distributed applications and the idea of service orientation. Amazon played a key role in the development of cloud computing by modernising their data centers, which were using as little as 10% of their capacity at any one time, just to leave room for occasional spikes.
  • Source: Intel
  • Source: Horn Group
  • Source: Gartner
  • Azure = .NET; Heroku = Ruby, Java, Python, Scala, Clojure, and Node.js; Google = Java; Force.com = Java
  • Source: Compuware
  • Source: Appdynamics
  • Are you troubleshooting your problem or someone else's?
  • Source: http://www.cloudslueth.net (data is sampled for regional backbones)
  • http://www.hostingreview.com/2012/03/01/massive-data-needs-the-server-requirements-and-costs-for-the-webs-biggest-websites/
  • Centralised logging and reporting. Over 60% are still using log files (Source: CloudFoundry). http://www.virtualizationpractice.com/rackspace-buys-cloudkick-implications-for-iaas-performance-management-8697/
  • Source: Appdynamics
  • Source: http://www.cloudslueth.net
  • Source: http://www.cloudslueth.net
  • Source: Redgate Diagnostics Manager (Cerebrata)
  • Source: Gibraltar (http://www.gibraltarsoftware.com/)
  • Cloud sharding: managed infrastructure, elastic provisioning
  • http://www.computerworld.com/s/article/9223117/Bandwidth_bottlenecks_loom_large_in_the_cloud
  • Providers – tell customers, save phone calls, support and bad PR
  • http://esj.com/blogs/enterprise-insights/2011/03/costs-of-poor-application-performance.aspx
  • Managing Performance in the Cloud

    1. 1. Managing Performance in the Cloud TheDevMgr
    2. 2. BACKGROUND Cloud History
    3. 3. In the nineties… • Desktop internet computing • Shift from local to centralised computing • Software was cheap and hardware was expensive.
    4. 4. The carefree noughty days • Shift from desktop to mobile • The cloud is born • Bezos and his book company start to shape the future.
    5. 5. The twenty-tens • Shift from centralised to distributed computing • Commoditisation of computing (PAYG) • Anything-as-a-Service (XaaS).
    6. 6. THE CLOUD What is it?
    7. 7. Service Models • XaaS (Anything) • SaaS (Software) • PaaS (Platform) • IaaS (Infrastructure)
    8. 8. Infrastructure (IaaS) • Outsource hardware to support operations – Storage, servers, networking components • Service provider owns and hosts equipment • Service provider responsible for management & maintenance.
    9. 9. Platform (PaaS) • Paradigm for delivering operating systems and associated services over the Internet • No downloads or installation • Google App Engine, Microsoft Windows Azure, Heroku & Force.com.
    10. 10. Software (SaaS) • Software distribution model in which applications are hosted by a vendor or service provider • Made available to customers over the Internet • Salesforce.com and many, many more.
    11. 11. Deployment Models • Private • Public • Hybrid
    12. 12. Private Cloud • “Virtualised” infrastructure operated for a single organisation (single tenant) • Hosted internally or externally • Managed internally or by a third-party • Can be secured to meet compliance • More expensive, less flexible.
    13. 13. Public Cloud • Service provider makes resources available to the general public over the Internet – Compute, Storage, O/S, Applications • May be free or pay-per-usage model • Fast deployment, short commitments • Shared services, less control.
    14. 14. Hybrid • Core platform on private cloud • Burstable capability into public cloud • Brings best of both private and public • Brings problems of both private and public.
    15. 15. THE COST OF POOR CLOUD PERFORMANCE Financial and customer satisfaction
    16. 16. Cost • Compuware survey suggests large business losses can exceed £500k due to poor cloud performance • 57% of European IT Directors believe that they can’t manage cloud application performance • You still have to deliver 2 second response times.
    17. 17. Performance • 50% of ops teams have suffered more than one P-1 performance issue in the cloud • 33% experience a P-1 issue every month • 60% of incidents took more than 2 hours to resolve • Good luck webops (cloudops). Source: AppDynamics
    18. 18. COMMON PERFORMANCE CHALLENGES Traditional and new problems
    19. 19. Performance Challenges • Traditional • Connectivity – Bandwidth / Latency • Bottlenecks – CPU, IO, Database • Contemporary • Bigger scale – More stuff • Shared infrastructure – Not your stuff (entirely).
    20. 20. Traditional • Connectivity • Latency, jitter & Packet loss • Bandwidth limitations • Users demand fast access to data • Bottlenecks • Will still occur! • Virtualised hardware – Host Contention – Storage.
    21. 21. Contemporary • Bigger Scale • 10s, 100s, 1,000s, 10,000s of servers – VM Sprawl • Dynamically allocated physical resource • Over-provisioning • Hidden billing costs • Shared Resources • Room for one more? • Deal with other people’s problems – DDOS, general stupidity? – Mi casa es tu casa.
    22. 22. Paradigm Shift • Elasticity – Planned (scheduled/controlled scaling) – Unplanned (auto-scaling) • Global distribution – Data Centres – Data • Less Control.
    23. 23. Data location still matters!
    24. 24. CLOUD EXPERIENCES Stories from the trenches
    25. 25. INFRASTRUCTURE-AS-A-SERVICE IaaS
    26. 26. Application • Adactus Food Ordering Platform • Transacts – > 7 million orders & > $100M USD a year – 30% of daily orders taken in 1 hour • Adopted as eCommerce platform for Pizza Hut and KFC globally.
    27. 27. Platform • Private • Global instances all deployed on private clouds • VMWare ESX Hosts – V-Web’s • Dedicated / Non-Virtualised SQL • Public • Rackspace public cloud • On-Demand – Load Balancers – Web Servers – SQL Servers • High-scale, high-volume.
    28. 28. Challenges • Big Scale – A lot more to manage • Virtual Platform – Contention • End-to-End Application Performance Management.
    29. 29. Solutions • Cloud-centric APM – AppDynamics – CloudKick (now Rackspace APM) – Rightscale • Automated Operations – Chef, Puppet (SysOps) – CloudFoundry, OpenShift (App LifeCycle) – Heroku, AppFog (NoOps?)
    30. 30. PLATFORM-AS-A-SERVICE PaaS
    31. 31. Application • Adactus Pulse • Claims management solution for the insurance industry delivered as SaaS • Processed over a million claims • Deployed for ISS and Aviva.
    32. 32. Platform • Deployed into Windows Azure Platform – Web Roles – Worker Roles – SQL Azure – SQL Azure Reporting Services • Upgrade of traditional ASP.NET application • Continuous Deployment Process.
    33. 33. Challenges • Disproving the “shared resource” impact – Is it the infrastructure? • Database performance is a black-box – Limitations and more limitations • Getting performance data is hard work – Not easy to access, dispersed everywhere • Baseline performance is not linear.
    34. 34. Baseline Performance Large variances in baseline performance.
    35. 35. Windows Azure is more consistent.
    36. 36. Solutions • Instrumentation is king – Aspect Orientation (AOP) • Gibraltar – Does your provider offer a Performance API? • Dedicated Cloud (Azure) Tools • Dynatrace • Cerebrata • You must automate – Deployment (and everything else!) – Consistency is key.
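The "instrumentation is king" point on slide 36 can be illustrated with a small aspect-style timing wrapper. This is only a sketch in Python (the Pulse application itself is ASP.NET, and tools such as Gibraltar work differently); the names `timed`, `TIMINGS` and `load_claim` are hypothetical and exist purely for illustration.

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-memory store. A real system would ship these samples
# to centralised logging or an APM tool instead.
TIMINGS = defaultdict(list)


def timed(func):
    """Record the wall-clock duration of every call to func, even if it raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            TIMINGS[func.__qualname__].append(time.perf_counter() - start)
    return wrapper


@timed
def load_claim(claim_id):
    # Stand-in for a real data-access call.
    time.sleep(0.01)
    return {"id": claim_id}


if __name__ == "__main__":
    load_claim(42)
    for name, samples in TIMINGS.items():
        print(f"{name}: {len(samples)} call(s), slowest {max(samples) * 1000:.1f} ms")
```

Because the timing lives in a decorator rather than in each function body, it can be applied consistently across a code base, which is the appeal of the aspect-oriented approach when the platform itself exposes little performance data.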
    37. 37. DATABASE-AS-A-SERVICE DaaS
    38. 38. Overview • Service provider takes responsibility for installing and maintaining the database. • Amazon (MySQL) • Microsoft SQL Azure • Google App Engine Datastore • CouchDB, MongoDB.
    39. 39. Challenges • Most service providers are having performance issues (even Google!) • Database is a (performance) black-box – You will find limitations • Need to handle transient connections – Your database will be there, but not always.
    40. 40. Solutions • Do as much tuning outside of the cloud as possible • Instrument your data access • DB sharding becomes viable (and easy) • Build connection resiliency into your data framework.
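One common way to build the connection resiliency mentioned on slide 40 is to retry transient failures with exponential backoff. The sketch below is an assumption about how that might look, not the approach used in the talk; `TransientError` and `with_retries` are hypothetical names, and a real data layer would map its driver's specific connection and throttling errors onto the retryable case.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (dropped connection, throttling)."""


def with_retries(operation, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on TransientError wait and retry, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff plus a little jitter so clients don't retry in lockstep.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_query():
        # Fails twice, then succeeds, to imitate a briefly unavailable database.
        calls["n"] += 1
        if calls["n"] < 3:
            raise TransientError("connection dropped")
        return "row data"

    print(with_retries(flaky_query, base_delay=0.01), "after", calls["n"], "attempts")
```

Only errors known to be transient are retried; anything else propagates immediately so that genuine bugs are not hidden behind retries.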
    41. 41. Caution • On-premise databases – Are you sure? • You might be about to create your own data storm? – Too much on-premise data – Too little bandwidth.
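The "too much data, too little bandwidth" warning on slide 41 is easy to quantify with back-of-the-envelope arithmetic. The sketch below is illustrative only: `transfer_hours` is a hypothetical helper, and the 80% `efficiency` factor is an assumed allowance for protocol overhead and contention, not a measured figure.

```python
def transfer_hours(data_gb, link_mbps, efficiency=0.8):
    """Hours needed to move data_gb gigabytes over a link_mbps link.

    efficiency is an assumed fraction of the nominal bandwidth actually achieved.
    """
    bits = data_gb * 8 * 1000**3              # decimal gigabytes to bits
    usable_bps = link_mbps * 1000**2 * efficiency
    return bits / usable_bps / 3600


if __name__ == "__main__":
    for size_gb in (10, 500, 5000):
        print(f"{size_gb} GB over 100 Mbit/s: {transfer_hours(size_gb, 100):.1f} h")
```

Under these assumptions, 500 GB over a 100 Mbit/s link takes roughly 14 hours, which is why keeping the database on-premise while the application runs in the cloud deserves a second look.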
    42. 42. SOFTWARE-AS-A-SERVICE SaaS
    43. 43. Overview • Adactus Pulse – Delivered on a SaaS Model • We consume SaaS (heavily) – CRM, Performance, Google Apps, WIKI, Bug Tracking, Testing, Accounting, Planning & Forecasting, Document Management, CMS, Exception Handling, Business Intelligence, Deployment, APM, Collaboration, HRM, ERP and more.
    44. 44. Challenges • Consumer • Good news – Performance is out of your control! • Bad news – Performance is out of your control! • Provider • Expectations are high! – Response times • Performance is still king! – Competitors – Repeat use.
    45. 45. Real User Monitoring • Consumer • It’s your new best friend • Get to know your SLA – It’s your new best friend • Simple rules – Be the first to know – Get your money back • Provider • It’s your new best friend • You will live & die by your SLAs • Simple rules – Be the first to know – Tell your customers.
    46. 46. Monitoring XaaS SaaS PaaS IaaS RUM Instrumentation APM
    47. 47. BEYOND PERFORMANCE Stories from the trenches
    48. 48. Service-Level-Agreements • Critical element for both provider and consumer • Don’t waste time on detailed numerical service level agreements • SLAs need to be based on end-user experience.
    49. 49. Service-Level-Agreements 1. Establish system availability 2. Establish system response time 3. Establish error resolution time 4. Establish a failover window for disaster recovery 5. Ensure that you can get your data back.
    50. 50. Service-Level-Agreements • IaaS – The O/S is your responsibility • Managed Cloud Platforms are available • PaaS – SLAs stop at the O/S • Your application still remains your responsibility • SaaS – Know your SLA inside out. It’s your responsibility.
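Slide 48 argues that SLAs should be based on end-user experience, and slide 16 mentions a 2 second response-time expectation. A minimal sketch of checking measured response times against such a target is shown below; `percentile` and `sla_report` are hypothetical helpers, and the choice of the 95th percentile and the nearest-rank method are assumptions made for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def sla_report(response_times_s, failed_requests, target_s=2.0, pct=95):
    """Summarise measured user experience against a response-time target."""
    total = len(response_times_s) + failed_requests
    observed = percentile(response_times_s, pct)
    return {
        "availability_pct": 100 * len(response_times_s) / total,
        "percentile_s": observed,
        "meets_target": observed <= target_s,
    }


if __name__ == "__main__":
    # Made-up sample data: ten successful requests and one failure.
    samples = [0.4, 0.6, 0.9, 1.1, 1.3, 1.5, 1.7, 1.8, 1.9, 3.2]
    print(sla_report(samples, failed_requests=1))
```

Using a high percentile rather than the average keeps a handful of very slow requests from being hidden by many fast ones, which is closer to what real users experience.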
    51. 51. Disaster Recovery • It’s hard in the cloud • DR strategies are still emerging • Bandwidth & network capacity limits • Security is still a concern.
    52. 52. Disaster Recovery • There isn’t a single blueprint • Identify critical resources and recovery methods • Architect for redundancy • Back up to/from and restore to/from the cloud • Most cloud SLAs > 99.5% availability – about 3 hours, 36 minutes downtime in a 30-day month.
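The downtime that an availability figure permits is simple arithmetic, sketched below. `allowed_downtime_minutes` is a hypothetical helper, and the 30-day period is an assumption; providers define their measurement periods differently, so the exact figure depends on the contract.

```python
def allowed_downtime_minutes(availability_pct, period_days=30):
    """Minutes of downtime a given availability percentage permits over period_days."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (100 - availability_pct) / 100


if __name__ == "__main__":
    for pct in (99.0, 99.5, 99.9, 99.99):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% -> {int(minutes // 60)} h {minutes % 60:.0f} min per 30 days")
```

For example, 99.5% over 30 days allows 216 minutes (3 hours 36 minutes) of downtime, while 99.9% allows only about 43 minutes.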
    53. 53. THANK YOU. QUESTIONS? That’s all folks!
