Performance and Scalability Tuning

Published in: Technology

  1. Scalable Performance
     • Building enterprise-scale web applications that perform
  2. Scalability vs. Performance
     Scalability:
     • Ratio of the increase in throughput to an increase in resources
     • Support additional users at the least incremental cost
     • Predictability of application behavior as users are added
     Performance:
     • Serving a single request in the shortest amount of time
     • Inverse of latency
  3. Scalable Performance
     • Number of requests that can be concurrently served (throughput) while meeting a minimum level of service (response time)
     • Measuring:
       o Resource utilization
       o Throughput
       o Response time
  4. Horizontal vs. Vertical Scaling
     Horizontal:
     • Increase the number of hardware resources (add more servers)
     • Separate types of processing into tiers
     • Commodity hardware is a predictable cost per user
     • Limitations to scaling are dictated by application architecture
     • Can degrade performance
     Vertical:
     • Improve hardware capabilities (CPU, RAM, storage, etc.)
     • No network bottleneck
     • Becomes increasingly expensive
     • Practical and financial limitations to the ability to scale
     • Typically increases performance
  5. Horizontal vs. Vertical Scaling
  6. General Observations
     • Performance decreases in each later tier (LB > web > app > DB) due to increasing complexity
       o Service requests in the earliest possible tier
     • The cost of scaling increases in each later tier (LB < web < app < DB)
       o Architect bottlenecks into the earliest possible tier
     • Scalable performance is ultimately limited by the operations that do not scale linearly
     • Ideally, each request that makes it to the database tier would have its own connection
       o Realistically, this means serving requests in earlier tiers because of constraints on DB scaling
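The point about operations that do not scale linearly can be quantified with Amdahl's law. The sketch below is an addition for illustration, not from the deck: if some fraction of each request is serialized on a shared resource (for example, the database), adding nodes yields diminishing returns.

```java
// Sketch (not from the deck): Amdahl's law caps the throughput gain from
// horizontal scaling by the fraction of work that does not scale linearly.
public class AmdahlSketch {
    // serialFraction: portion of each request that cannot scale out
    // nodes: identical servers added horizontally
    public static double speedup(double serialFraction, int nodes) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / nodes);
    }

    public static void main(String[] args) {
        // With 10% serialized work, throughput can never exceed 10x,
        // no matter how many nodes are added.
        System.out.printf("8 nodes:   %.2fx%n", speedup(0.10, 8));    // ~4.71x
        System.out.printf("64 nodes:  %.2fx%n", speedup(0.10, 64));   // ~8.77x
        System.out.printf("512 nodes: %.2fx%n", speedup(0.10, 512));  // ~9.83x
    }
}
```

The jump from 64 to 512 nodes buys barely one extra multiple of throughput, which is why the deck recommends attacking the non-linear operations (usually the database) rather than adding hardware.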
  7. JEE Request Processing
  8. Application Bottlenecks
     • Thread starvation
     • Thread contention
     • IO contention
     • IO performance
     • Memory limitations
     • Data access
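Thread starvation, the first bottleneck above, is easy to reproduce with a deliberately small pool; the class and task names below are illustrative, not from the deck:

```java
import java.util.concurrent.*;

// Sketch: when every worker thread is occupied by slow work, new requests
// queue instead of being served -- the same effect as an exhausted
// application-server thread pool.
public class StarvationSketch {
    // Returns true if a fast task was starved while slow tasks held the pool.
    public static boolean demonstrateStarvation() {
        ExecutorService pool = Executors.newFixedThreadPool(2); // 2 "server threads"
        CountDownLatch slowWork = new CountDownLatch(1);
        try {
            // Two slow requests occupy both threads.
            for (int i = 0; i < 2; i++) {
                pool.submit(() -> { slowWork.await(); return null; });
            }
            // A fast request now waits in the queue: thread starvation.
            Future<String> fast = pool.submit(() -> "served");
            boolean starved = false;
            try {
                fast.get(200, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                starved = true; // both threads busy; the fast request queued
            }
            slowWork.countDown(); // release the slow work
            fast.get();           // the fast request now completes
            return starved;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fast request starved: " + demonstrateStarvation()); // prints true
    }
}
```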
  9. Resource Utilization
  10. Resource Capacity Settings
     • Database CPUs
     • Database connection pool
     • Application server CPUs
     • Application server thread pool
     • JVM heap settings
     • Web server thread pool
  11. Resource Capacity Settings
     • Walk through the application architecture and identify the points where a request could potentially wait.
     • Open all wait points.
     • Generate balanced and representative load against the environment.
     • Identify the limiting wait point's saturation point.
     • Tighten all wait points to admit only the maximum load of the limiting wait point.
     • Force all pending requests to wait at the web server.
     • Add more resources.
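The end state of this procedure might look like the fragment below. Every value is an assumption for illustration; real numbers come from the measured saturation point.

```properties
# Illustrative wait-point settings after tuning (all values are assumptions).
# Each tier admits no more work than the limiting wait point (here, the
# database) can absorb, so excess requests queue at the web server.
web.server.max.threads   = 400    # pending requests wait here
app.server.max.threads   = 80     # only what can usefully reach this tier
db.connection.pool.max   = 40     # the measured limiting wait point
jvm.heap.options         = -Xms1024m -Xmx1024m   # fixed-size heap
```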
  12. Profiling
     • Long-running HTTP requests
     • Long-running methods
     • Memory leaks
     • Deadlocks
     • Long-running queries
  13. Database Tier
     • System of record
     • Difficult to scale horizontally and expensive to scale vertically
     • Keep connections limited to what the server will support
       o Block at the app tier
     • Perform data processing on a staging server
     • Each database has its own optimization techniques
       o Explain plan to locate and eliminate full table scans
       o Query and table caches
       o Buffer sizes
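One way to "block at the app tier" is to gate database access behind a bounded set of permits. A real connection pool (DBCP, c3p0, and similar) does this internally; the class below is only a sketch of the idea, with illustrative names:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Sketch: requests wait in the app tier instead of piling connections
// onto the database.
public class BoundedDbAccess {
    private final Semaphore permits;

    public BoundedDbAccess(int maxDbConnections) {
        this.permits = new Semaphore(maxDbConnections, true); // fair: FIFO waiting
    }

    public <T> T withConnection(Callable<T> query) {
        try {
            permits.acquire(); // the request blocks here, not at the database
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        try {
            return query.call(); // would borrow a real JDBC connection here
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            permits.release();
        }
    }

    public int available() { return permits.availablePermits(); }

    public static void main(String[] args) {
        BoundedDbAccess db = new BoundedDbAccess(10); // what the DB server supports
        String row = db.withConnection(() -> "result-row");
        System.out.println(row + ", free permits: " + db.available()); // back to 10
    }
}
```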
  14. Application Tier
     • Generally cannot be stateless due to security and business requirements
     • Sticky vs. clustered sessions
       o Use the HttpSession sparingly
       o If the app does not need to be HA, it may not require clean failover and can drop sessions
     • Scaling horizontally could potentially put more load on the DB
       o Caching can be used to offset the load
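A minimal sketch of "use the HttpSession sparingly": keep only a small key in the session and the bulky state in a shared cache, which keeps session replication cheap under clustered sessions. The cache map here stands in for something like memcached; all names are illustrative assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LeanSessionSketch {
    // Stand-in for a shared/distributed cache; an assumption for the sketch.
    public static final Map<String, Object> sharedCache = new ConcurrentHashMap<>();

    // Only this small key would be stored in the HttpSession.
    public static String sessionKeyFor(String userId) {
        return "cart:" + userId;
    }

    public static void main(String[] args) {
        String key = sessionKeyFor("user42");                    // lives in the session
        sharedCache.put(key, new String[] { "sku-1", "sku-2" }); // bulky state stays out
        String[] cart = (String[]) sharedCache.get(key);
        System.out.println("items in cart: " + cart.length);     // prints 2
    }
}
```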
  15. Application Caching
     • Read-only data can be cached like static data in the web tier
     • Writable data can be cached but will impose limitations on clustering
       o Will probably either need to be kept in sync across the cluster or turned off completely
     • Filters can provide caching of dynamic, secure data at the earliest point in this tier
       o Caches the entire response
       o Not recommended for user-specific data
     • Service-layer or data-access caches provide a simple way to stop requests from continuing to the database
       o Transparent to the calling code
       o Cache interceptors
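A cache interceptor can be sketched with a JDK dynamic proxy, so caching stays transparent to the calling code. The service names below are hypothetical; frameworks such as Spring provide the same pattern via AOP interceptors.

```java
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheInterceptorSketch {
    public interface PriceService { double priceFor(String sku); }

    // Stand-in for a DAO that hits the database on every call.
    public static class DbPriceService implements PriceService {
        public int dbHits = 0;
        public double priceFor(String sku) { dbHits++; return 9.99; }
    }

    // Wraps any service interface in a caching proxy; callers see only the
    // interface, so the cache is invisible to them.
    @SuppressWarnings("unchecked")
    public static <T> T cached(Class<T> iface, T target) {
        Map<String, Object> cache = new ConcurrentHashMap<>();
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[] { iface },
            (proxy, method, args) -> cache.computeIfAbsent(
                method.getName() + Arrays.toString(args),
                key -> {
                    try {
                        return method.invoke(target, args); // only on a cache miss
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                }));
    }

    public static void main(String[] args) {
        DbPriceService real = new DbPriceService();
        PriceService service = cached(PriceService.class, real);
        service.priceFor("sku-1");
        service.priceFor("sku-1"); // second call never reaches the "database"
        System.out.println("db hits: " + real.dbHits); // prints 1
    }
}
```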
  16. Tomcat Tuning
     • maxThreads controls the number of actively served connections
     • backlog controls the number of connections that can be queued
     • maxThreads + backlog = total accepted connections
     • connectionTimeout can be used to drop stalled connections
     • bufferSize is by default set to -1: no buffering of output
     • Keep the heap size manageable, < 2 GB
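These settings might come together in a server.xml fragment like the one below. All values are assumptions to be sized by load testing; on the HTTP connector, the accept backlog described above is configured through the acceptCount attribute.

```xml
<!-- Illustrative Tomcat server.xml connector; values are assumptions. -->
<Connector port="8080" protocol="HTTP/1.1"
           maxThreads="200"
           acceptCount="100"
           connectionTimeout="20000" />
<!-- Total accepted connections: maxThreads + acceptCount = 300.
     Heap kept under 2 GB, e.g. CATALINA_OPTS="-Xms1536m -Xmx1536m". -->
```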
  17. Web Tier
     • Clustering is easiest in the web tier because web servers are generally stateless
     • Web tier clustering can provide super-linear scaling (reduced IO contention and context-switching costs per host)
     • The web tier should serve all static data (images, static HTML)
     • The web tier can serve dynamic requests by caching non-secure data that depends only upon URL parameters
       o Squid reverse proxy
       o Apache mod_cache
       o memcached
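The Apache mod_cache option might be enabled with a fragment like this. Module names match the Apache 2.2-era modules of the deck's vintage; the path and expiry are assumptions.

```apacheconf
# Illustrative mod_cache fragment: cache non-secure responses, keyed by
# URL, in the web tier so requests never reach the app server.
<IfModule mod_cache.c>
    CacheEnable disk /catalog        # cache only non-secure content
    CacheDefaultExpire 3600          # seconds to serve without revalidating
    <IfModule mod_disk_cache.c>
        CacheRoot /var/cache/apache2
    </IfModule>
</IfModule>
```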
  18. Apache Tuning
     • Limit connections in the web tier to prevent overloading later tiers (MaxClients)
     • ServerLimit x memory per process < available RAM, to limit swapping
     • Average connections = ThreadsPerChild x Apache hosts / app server hosts
     • ProxyPass max = ThreadsPerChild
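The formulas above might translate into a worker-MPM configuration like the following; every number and hostname is an assumption to be replaced by measurements.

```apacheconf
# Illustrative worker-MPM sizing (MaxClients = ServerLimit x ThreadsPerChild;
# keep MaxClients x memory-per-process below available RAM).
<IfModule mpm_worker_module>
    ServerLimit       16
    ThreadsPerChild   25
    MaxClients       400
</IfModule>
# Cap connections per proxied app server at the thread count feeding it
# (app1.example.com is a placeholder backend).
ProxyPass /app http://app1.example.com:8080/app max=25
```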
  19. Other
     • Grid caches
       o GigaSpaces
       o Coherence
     • CDN
       o Akamai
     • Compute appliances
       o Azul Systems