Optimizing Your Cloud Applications in RightScale

RightScale Webinar: Performance tuning applications in the public cloud is both easier and harder than on your own server hardware. It's much easier to scale up and scale out in the cloud but you generally don't have much (if any) control over the hardware. With public cloud, you take the building blocks offered by the cloud infrastructure and design the application architecture to scale based on the capacity planning requirements and scalability testing results. In this session, we'll talk through our experiences scaling and performance tuning the RightScale platform in the cloud and share tips for sizing, auto-scaling, monitoring, and troubleshooting large-scale cloud deployments.

Published in: Technology
  • The cluster monitoring is very powerful in that it provides different types of views into the operation of large clusters of servers
  • The architecture behind the cluster monitoring is rather extensive. Customer (i.e. your) servers send monitoring data every 20 seconds to our servers. The data points are cached in memory on those servers and flushed to disk periodically. Cluster monitoring graphs are produced on separate front-end servers, which pull the data from over 100 monitoring storage servers. The graphs are produced using rrdtool and auto-refresh.
  • Walk through of how it works: in any deployment, go to the monitoring tab, select servers, then select a metric to plot. Familiar controls switch the time period and graph size. The view displays one graph per server, here core1.rightscale.com through core8.rightscale.com. In this example the graphs show CPU utilization for the past week, where blue is busy time and green is idle.
  • Individual graphs only work for so many servers, and they don’t show what is happening in aggregate. Stacked graphs stack the contribution of each server on top of one another. Walk through what the graph shows.
  • Stacked graphs are great to see the aggregate, but it is often difficult to see abnormal server behavior. Heat maps show many servers on one graph by plotting one horizontal bar per server. The time axis is the same for all servers and is shown at the bottom of the graph. The color of the bar shows the value of the metric for the server. Walk through the graph. It’s easy to see that there are 6 servers sharing the load, and two servers that are different.
  • At scale, this is how it all looks and comes together. This example is real: it shows an incident we had with our monitoring cluster a few months ago. This heat map shows 100 servers out of one of our monitoring clusters (we want to be vague here…). When there are more than 100 servers, the heat map shows a sampling of 100. Describe the sampling: most recently launched, longest running, some of each server template, rest random. Story: this heat map plots I/O wait for our monitoring servers on a day when we suddenly received a number of alerts for a few servers. The heat map shows these servers clearly as red bands starting between 7am and 8am. So we could clearly see that something was going on with a small number of servers and that it started at more or less the same time on all of them. To see what happened in aggregate, we can switch graph type…
  • This shows the same incident as on the previous slide, but with a timescale of a week. It shows the number of servers handled by each monitoring server, i.e. each color bar shows one server. It is easy to see that some customer launched a large number of servers right at the time the overload began. Further investigation showed that, due to a bug, these servers were allocated unevenly across the cluster, causing the overload.
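
The sampling order mentioned in the heat map note (most recently launched, longest running, some of each server template, rest random) could be sketched roughly as follows. This is only an illustration: the function name, the record fields (`name`, `launched_at`, `template`) and the group size of 10 are assumptions, not details of the RightScale implementation.

```python
import random


def sample_servers(servers, limit=100, per_group=10, seed=None):
    """Pick at most `limit` servers for a heat map.

    `servers` is a list of dicts with hypothetical keys 'name',
    'launched_at' (any sortable value) and 'template'.
    """
    if len(servers) <= limit:
        return list(servers)

    rng = random.Random(seed)
    by_age = sorted(servers, key=lambda s: s["launched_at"])
    chosen = {}  # name -> server, keeps insertion order and avoids duplicates

    def take(candidates):
        for server in candidates:
            if len(chosen) >= limit:
                return
            chosen.setdefault(server["name"], server)

    take(by_age[-per_group:])  # most recently launched
    take(by_age[:per_group])   # longest running

    # At least one server of each server template.
    first_per_template = {}
    for server in servers:
        first_per_template.setdefault(server["template"], server)
    take(first_per_template.values())

    # Fill the remaining slots at random.
    rest = [s for s in servers if s["name"] not in chosen]
    rng.shuffle(rest)
    take(rest)
    return list(chosen.values())
```

With fewer servers than the limit, every server is returned unchanged, which matches the note that sampling only applies above 100 servers.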
  • Transcript

    • 1. Optimizing Your Cloud Applications in RightScale
      October 13, 2011
      Watch the video of this webinar
    • 2. Your Panel Today
      Presenting
      • Rafael H. Saavedra, VP Engineering, RightScale
      • 3. Raphael Simon, Sr. Systems Architect, RightScale
      Q&A
      Jordan Evans, Account Manager, RightScale
      Please use the “Questions” window to ask questions any time!
    • 4. Agenda
      Introduction
      3-tier application architecture
      Vertical & horizontal scaling
      RightScale monitoring and cluster graphs
      New Relic RPM
      Support for optimizing DB performance
      Load testing
      Please use the “Questions” window to ask questions any time!
    • 5. Multi-tenancy
      Shared resource pooling
      Geo-distribution and ubiquitous network access
      Service oriented
      Dynamic resource provisioning
      Self-organizing
      Utility based pricing
      Cloud computing characteristics
    • 6. No upfront investment
      Lowering operating costs
      Highly scalable
      Easy access
      Reduces business risk and maintenance costs
      Enables process automation
      Cloud computing advantages
    • 7. 3-tier application architecture
      Load balancers
      An array of application servers
      Master-slave
    • 8. Optimizing Your Cloud Applications in RightScale
      Vertical & Horizontal Scaling
    • 9. Instance size (vertical scaling)
      Instance autoscaling (horizontal scaling)
      Server arrays
      RightScale support for performance optimization
      ServerTemplates are configured to capture performance data
      Collectd RightScripts
      Hardware & OS monitoring data
      Specialized plugins – MySQL, HAProxy, Apache, Nginx, IIS, etc
      Monitoring graphs: individual, cluster, stacked, heat maps
      Alerts & escalations
      New Relic RPM
      Cloud performance optimization
    • 10. Compute units vs memory
      Scaling up – spectrum of instance sizes
    • 11. Server arrays provide horizontal scaling
    • 12. The array scales up or down based on performance votes
      Tags allow scaling on an arbitrary decision set
      Decision threshold controls reaction time
      Sleep time allows new resources to have an impact
      Scaling can be time dependent
      Detailed setup instructions: http://bit.ly/c1oLr2
      Fast response to changes in load conditions using alerts
      Allocation of servers to availability zones based on weights
      Deployment-based so configuration is consistent
      Arrays can be pre-scaled to support anticipated demand
      Server arrays provide horizontal scaling
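
The vote-based resizing described on this slide can be sketched as below. It is a minimal model under stated assumptions: the class and function names, the default threshold of 51%, the resize step and the vote labels `"grow"` and `"shrink"` are all hypothetical and are not taken from the RightScale API.

```python
from dataclasses import dataclass


@dataclass
class ArrayPolicy:
    # All defaults are illustrative assumptions.
    decision_threshold: float = 0.51  # fraction of servers that must agree
    resize_by: int = 2                # servers added or removed per action
    min_count: int = 2
    max_count: int = 20
    sleep_time_s: int = 300           # wait after a resize before acting again


def next_size(votes, current, since_last_resize_s, policy):
    """Return the new array size given one vote per running server.

    Each vote is "grow", "shrink" or None (no opinion).
    """
    # Sleep time: give the previous resize a chance to have an impact.
    if not votes or since_last_resize_s < policy.sleep_time_s:
        return current

    grow = votes.count("grow") / len(votes)
    shrink = votes.count("shrink") / len(votes)

    if grow >= policy.decision_threshold:
        return min(policy.max_count, current + policy.resize_by)
    if shrink >= policy.decision_threshold:
        return max(policy.min_count, current - policy.resize_by)
    return current
```

A lower threshold makes the array react to fewer votes and therefore sooner, while a longer sleep time reduces oscillation after each resize.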
    • 13. Optimizing Your Cloud Applications in RightScale
      Monitoring & Cluster Graphs
      with RightScale
    • 14. Server monitoring graphs
    • 15. Cluster monitoring
      Individual graphs
      Good for a dozen servers
      Displays all standard graphs with full detail
      Stacked graphs
      Displays the contribution of many servers to a total
      Great to see the sum and variability of activity in a cluster
      Difficult to make out individual servers
      Examples: requests/sec, cpu busy cycles, I/O bytes/sec
      Heat maps
      Displays a bar for each server
      Great to see uneven distribution across servers
      Great to quickly spot performance problems across many servers
      Difficult to read absolute values or see the total cluster activity
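
The trade-off between stacked graphs and heat maps can be illustrated on a tiny data set. The sketch below is not how the graphs are actually rendered (the notes say rrdtool is used); the function names, the character scale and the sample values are assumptions chosen only to show the idea.

```python
def stacked_totals(series):
    """Sum all servers at each time step: the top edge of a stacked graph."""
    return [sum(column) for column in zip(*series.values())]


def heat_rows(series, levels=" .:*#", max_value=100.0):
    """Render one text row per server; denser characters mean hotter values."""
    rows = {}
    for name, values in series.items():
        chars = []
        for value in values:
            index = min(len(levels) - 1, int(value / max_value * len(levels)))
            chars.append(levels[index])
        rows[name] = "".join(chars)
    return rows


# Hypothetical CPU busy percentages for two servers over three intervals.
cpu_busy = {"app1": [30, 35, 32], "app2": [31, 34, 95]}
totals = stacked_totals(cpu_busy)
rows = heat_rows(cpu_busy)
```

The totals show that cluster activity jumped in the last interval, while the per-server rows show which server caused it.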
    • 16. Cluster monitoring architecture
      Architecture
      Monitoring front-end servers pull data from storage servers
      Up to 100 servers on one graph (to be increased)
      [Diagram labels: monitoring storage servers, monitoring front-end servers, your servers]
    • 17. Cluster monitoring
      Current cluster monitoring: one graph per server
    • 18. Stacked graphs
      Each color band shows contribution of one server
      Servers are stacked on top of one another
    • 19. Heat maps
      Each horizontal strip shows one server
      The color shows how “hot” the server is running
    • 20. Heat map with 100 servers
    • 21. Stacked graph of the same 100 servers
    • 22. Optimizing Your Cloud Applications in RightScale
      Application Performance Analytics with New Relic
    • 23. Real-Time App Performance Analytics
      Supports Ruby, PHP, Java & .Net
      SQL & NoSQL performance
      Web transaction tracing
      Performance notifications
      Availability monitoring
      Scalability analysis
      New Relic RPM
    • 24. New Relic RPM
      Direct access from RightScale dashboard
    • 25. New Relic RPM
      Historical statistics over a period of time
    • 26. New Relic RPM
      Distribution of the most time consuming requests
    • 27. New Relic RPM
      Statistics about response times from different countries
    • 28. New Relic RPM
      Detailed response times by browser
    • 29. An expensive query
      The N+1 query problem
      New Relic RPM – 2 Examples
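
The N+1 query problem named on this slide can be reproduced in a few lines. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the application database; the schema, the data and the function names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1'), (4, 3, 'c1');
""")

statements = []
conn.set_trace_callback(statements.append)  # record every statement executed


def titles_n_plus_1():
    """One query for the authors, then one more query per author."""
    result = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def titles_single_query():
    """The same data fetched with a single JOIN."""
    result = {}
    rows = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    ).fetchall()
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


def count_selects(fn):
    """Run fn and return (result, number of SELECT statements it issued)."""
    statements.clear()
    result = fn()
    selects = [s for s in statements if s.lstrip().upper().startswith("SELECT")]
    return result, len(selects)
```

With three authors the first version issues four SELECT statements and the second issues one; the gap grows linearly with the number of parent rows.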
    • 30. Optimizing Your Cloud Applications in RightScale
      Optimizing Database Performance
    • 31. Optimizing DB performance
      RightScale MySQL ServerTemplates
      Configuration files tailored to instance size
      innodb_buffer_pool_size
      key_buffer_size
      thread_size
      sort_buffer_size
      The never ending task of identifying current bottlenecks
      Disk seeks
      Performance of disk operations
      Scale up when working set cannot fit in memory – avoid active swapping
      Constant monitoring of performance graphs, logs and queries
      Schema considerations
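
One way to picture "configuration files tailored to instance size" is to derive the memory-related settings from the instance's RAM. The percentages below are illustrative assumptions, not the values used by the RightScale ServerTemplates, and only two of the listed parameters are covered.

```python
# Assumed share of instance memory per setting, in percent (illustrative only).
ASSUMED_PERCENT = {
    "innodb_buffer_pool_size": 60,
    "key_buffer_size": 5,
}


def mysql_settings(memory_mb):
    """Return settings sized for an instance with `memory_mb` MB of RAM."""
    return {
        name: f"{memory_mb * percent // 100}M"
        for name, percent in ASSUMED_PERCENT.items()
    }


def render_my_cnf(memory_mb):
    """Render the settings as a [mysqld] section of a my.cnf file."""
    lines = ["[mysqld]"]
    lines += [f"{name} = {value}" for name, value in mysql_settings(memory_mb).items()]
    return "\n".join(lines)
```

For example, `render_my_cnf(7680)` produces a section for an instance with 7.5 GB of memory. Per-connection buffers such as `sort_buffer_size` are left out because they scale with connection count rather than with total RAM.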
    • 32. Schema considerations
      Lookups need to be indexed
      Sorting requires an index
      Joins need to be done on indices
      Become slower as tables grow
      Compounded indices should be used consistently
      Do not abuse indices
      Each index requires a disk write
      Compact tables if they become fragmented
      Deleted rows do not remove the corresponding index entries
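
The effect of indexing a lookup column can be checked by comparing query plans before and after the index exists. The sketch below uses Python's built-in `sqlite3` and its `EXPLAIN QUERY PLAN` statement as a stand-in; MySQL's `EXPLAIN` output looks different, and the table, column and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO users (email, created_at) VALUES (?, ?)",
    [(f"user{i}@example.com", f"2011-10-{i % 28 + 1:02d}") for i in range(1000)],
)


def plan(sql, params=()):
    """Return the human-readable plan detail for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


lookup = "SELECT id FROM users WHERE email = ?"
before = plan(lookup, ("user42@example.com",))  # full table scan

conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(lookup, ("user42@example.com",))   # search using the new index

print("before:", before)
print("after: ", after)
```

Every index added this way also has to be updated on each write, which is the cost behind the slide's advice not to abuse indices.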
    • 33. Monitoring DB performance
      Standard collectd statistics
      User vs wait time (disk operations)
      Performance of disk operations
      Scale up when working set cannot fit in memory
      MySQL collectd plugin
      Monitor INSERT, SELECT, UPDATE operations
      The breakdown of read operations can indicate missing indices
      Monitoring /var/log/mysqlslow.log file
      Identify slow queries
      Use MySQL EXPLAIN command to identify query plan
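
Finding the slow queries worth running through EXPLAIN can be partly automated by parsing the slow query log. The parser below is a rough sketch that assumes the common `# Query_time: ... Lock_time: ... Rows_sent: ... Rows_examined: ...` header line; the sample log entries are invented for illustration and real logs may carry extra fields.

```python
import re

HEADER = re.compile(
    r"^# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: [\d.]+\s+"
    r"Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)


def parse_slow_log(text):
    """Return one dict per logged query with its timing and SQL text."""
    entries, current = [], None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = {
                "query_time": float(match["query_time"]),
                "rows_sent": int(match["rows_sent"]),
                "rows_examined": int(match["rows_examined"]),
                "sql": "",
            }
            entries.append(current)
        elif line.startswith("#"):
            current = None  # another comment line starts the next entry
        elif current is not None and line.strip() and not line.startswith("SET timestamp"):
            current["sql"] = (current["sql"] + " " + line.strip()).strip()
    return entries


# Made-up sample in the assumed format.
SAMPLE_LOG = """\
# Time: 111013 10:15:01
# User@Host: app[app] @ localhost []
# Query_time: 3.200000  Lock_time: 0.000100 Rows_sent: 10  Rows_examined: 250000
SET timestamp=1318500901;
SELECT * FROM orders WHERE status = 'open';
# Time: 111013 10:16:40
# User@Host: app[app] @ localhost []
# Query_time: 12.500000  Lock_time: 0.000200 Rows_sent: 1  Rows_examined: 900000
SET timestamp=1318501000;
SELECT COUNT(*) FROM events
WHERE user_id = 7;
"""

slowest = max(parse_slow_log(SAMPLE_LOG), key=lambda entry: entry["query_time"])
```

A large gap between rows examined and rows sent, as in both sample entries, is a typical hint that an index is missing.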
    • 34. MySQL Collectd Plugin
      Uses MySQL SHOW STATUS command to collect statistics
      A large set of counters that are divided into 10 categories
      Connections
      IO Requests
      Select Rates
      Read Rates
      Key Rates
      Commands Rates
      Query Cache
      Tables
      Memory
      Misc.
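
SHOW STATUS reports cumulative counters, so a rate graph needs the difference between two samples divided by the sampling interval. The sketch below shows that step only; it is not the plugin's code. The function name and sample values are assumptions, and the 20 second interval is borrowed from the speaker notes.

```python
def counter_rates(previous, current, interval_s):
    """Convert two snapshots of cumulative counters into per-second rates.

    Counters that went backwards (for example after a server restart) are
    skipped rather than reported as negative rates.
    """
    rates = {}
    for name, value in current.items():
        if name in previous and value >= previous[name]:
            rates[name] = (value - previous[name]) / interval_s
    return rates


# Hypothetical snapshots taken 20 seconds apart.
earlier = {"Com_select": 1000, "Com_insert": 200, "Com_update": 50}
later = {"Com_select": 1600, "Com_insert": 260, "Com_update": 40}
rates = counter_rates(earlier, later, interval_s=20)
```

Here `Com_update` is dropped because its counter decreased between the two samples.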
    • 35. MySQL Collectd Plugin
      Uses MySQL SHOW STATUS command to collect statistics
    • 36. mysqlslow.log & EXPLAIN command
    • 37. MySQL performance depends on locality
      Wait time should be minimum when working set fits in memory
      Performance degrades once wait time is significant
      wait time insignificant
      user time dominates
    • 38. MySQL reads graphs
      Read-random-next represents a table scan
      Read-next represents an index scan
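
Assuming the graph labels correspond to MySQL's `Handler_read_rnd_next` and `Handler_read_next` status counters, the share of reads coming from table scans can be computed from the read rates. The function name and the sample numbers are hypothetical.

```python
def table_scan_fraction(read_rates):
    """Fraction of row reads that come from table scans.

    `read_rates` maps Handler_read_* counter names to reads per second.
    """
    total = sum(read_rates.values())
    if total == 0:
        return 0.0
    return read_rates.get("Handler_read_rnd_next", 0.0) / total


# Hypothetical per-second read rates.
sample_rates = {
    "Handler_read_rnd_next": 900.0,  # table scan
    "Handler_read_next": 80.0,       # index scan
    "Handler_read_key": 20.0,        # index lookup
}
fraction = table_scan_fraction(sample_rates)
```

A fraction that stays high for a table that is queried by key suggests a missing index.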
    • 39. Optimizing Your Cloud Applications in RightScale
      Load Testing
    • 40. Load testing using httperf
      RightScale provides ServerTemplates in the marketplace
      https://my.rightscale.com/library/server_templates/Httperf-Load-Tester/24714
      Tutorial on httperf setup and configuration
      http://support.rightscale.com/03-Tutorials/02-AWS/E2E_Examples/E2E_Gaming_Deployment/Adding_Httperf_Load_Tester
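
A load test is usually run as a series of steps at increasing request rates. The helper below only builds the `httperf` command lines and does not run them; the host name, the rate steps and the 30 second step duration are placeholder assumptions.

```python
import shlex


def httperf_command(server, uri="/", port=80, rate=50, num_conns=1000, timeout=5):
    """Build the argument list for one httperf run."""
    return [
        "httperf",
        "--server", server,
        "--port", str(port),
        "--uri", uri,
        "--rate", str(rate),            # new connections per second
        "--num-conns", str(num_conns),  # total connections for this run
        "--timeout", str(timeout),      # seconds before a request counts as failed
    ]


def ramp(server, rates, seconds_per_step=30):
    """One command per rate step, each sized to last about seconds_per_step."""
    return [
        httperf_command(server, rate=rate, num_conns=rate * seconds_per_step)
        for rate in rates
    ]


for command in ramp("www.example.com", [50, 100, 200]):
    print(" ".join(shlex.quote(arg) for arg in command))
```

Each list can be passed to `subprocess.run` on a load generator once the target and rates have been agreed with whoever owns the system under test.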
    • 41. Getting Started and Q&A
      Contact RightScale:
      Next up in the “I’m in the Cloud – Now What?” series:
      October 20
      Automating Servers in the Cloud
      • Darryl Eaton, Dir. Product Management, RightScale
      www.RightScale.com/now-what
      (866) 720-0208
      sales@rightscale.com
      www.rightscale.com
      More Info
      Webinar archive: RightScale.com/webinars
      Whitepapers: RightScale.com/whitepapers
      Free Edition: RightScale.com/free
      RightScale Conference
      Nov 8-9 in Santa Clara, CA
      www.RightScale.com/Conference
      • Attend technical breakout sessions
      • 42. Talk with RightScale customers
      • 43. Ask questions at the Genius Bar
