  1. 1. Scalability & Availability Paul Greenfield
  2. 2. Building Real Systems <ul><li>Scalable </li></ul><ul><ul><li>Handle expected load with acceptable levels of performance </li></ul></ul><ul><ul><li>Grow easily when load grows </li></ul></ul><ul><li>Available </li></ul><ul><ul><li>Available enough of the time </li></ul></ul><ul><li>Performance and availability cost </li></ul><ul><ul><li>Aim for ‘enough’ of each but not more </li></ul></ul><ul><ul><li>Have to be ‘architected’ in… not added </li></ul></ul>
  3. 3. Scalable <ul><li>Scale-up or… </li></ul><ul><ul><li>Use bigger and faster systems </li></ul></ul><ul><li>… Scale-out </li></ul><ul><ul><li>Systems working together to handle load </li></ul></ul><ul><ul><ul><li>Server farms </li></ul></ul></ul><ul><ul><ul><li>Clusters </li></ul></ul></ul><ul><li>Implications for application design </li></ul><ul><ul><li>Especially state management </li></ul></ul><ul><ul><li>And availability as well </li></ul></ul>
  4. 4. Available <ul><li>Goal is 100% availability </li></ul><ul><ul><li>24x7 operations </li></ul></ul><ul><ul><li>Including time for maintenance </li></ul></ul><ul><li>Redundancy is the key to availability </li></ul><ul><ul><li>No single points of failure </li></ul></ul><ul><ul><li>Spare everything </li></ul></ul><ul><ul><ul><li>Disks, disk channels, processors, power supplies, fans, memory, .. </li></ul></ul></ul><ul><ul><ul><li>Applications, databases, … </li></ul></ul></ul><ul><ul><ul><ul><li>Hot standby, quick changeover on failure </li></ul></ul></ul></ul>
  5. 5. Performance <ul><li>How fast is this system? </li></ul><ul><ul><li>Not the same as scalability but related </li></ul></ul><ul><ul><li>Measured by response time and throughput </li></ul></ul><ul><li>How scalable is this system? </li></ul><ul><ul><li>Scalability is concerned with the upper limits to performance </li></ul></ul><ul><ul><li>How big can it grow? </li></ul></ul><ul><ul><li>How does it grow? (evenly, lumpily?) </li></ul></ul>
  6. 6. Performance Measures <ul><li>Response time </li></ul><ul><ul><li>What delay does the user see? </li></ul></ul><ul><ul><li>Instantaneous is good </li></ul></ul><ul><ul><ul><li>95% under 2 seconds is acceptable? </li></ul></ul></ul><ul><ul><ul><li>Consistency is important psychologically </li></ul></ul></ul><ul><ul><li>Response time varies with ‘heaviness’ of transactions </li></ul></ul><ul><ul><ul><li>Fast read-only transactions </li></ul></ul></ul><ul><ul><ul><li>Slower update transactions </li></ul></ul></ul><ul><ul><ul><li>Effects of resource/database contention </li></ul></ul></ul>
  7. 7. Response Time <ul><li>Each transaction takes… </li></ul><ul><ul><li>Processor time </li></ul></ul><ul><ul><ul><li>Application, system services, database, … </li></ul></ul></ul><ul><ul><ul><li>Shared amongst competing processes </li></ul></ul></ul><ul><ul><li>I/O time </li></ul></ul><ul><ul><ul><li>Largely disk reads/writes </li></ul></ul></ul><ul><ul><ul><li>Large DB caches reduce # of I/Os </li></ul></ul></ul><ul><ul><ul><ul><li>2TB in IBM’s top TPC-C entry </li></ul></ul></ul></ul><ul><ul><li>Wait time for shared resources </li></ul></ul><ul><ul><ul><li>Locks, shared structures, … </li></ul></ul></ul>
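The additive breakdown on this slide can be sketched as a rough model. All figures and parameter names below are illustrative assumptions, not measurements from any real system:

```python
# Rough response-time model: a transaction's latency is the sum of
# processor time, I/O time, and waiting time for shared resources.

def response_time(cpu_ms, io_ops, ms_per_io, cache_hit_rate, lock_wait_ms):
    """Estimate transaction response time in milliseconds."""
    # A larger DB cache satisfies more reads in memory,
    # reducing the number of physical I/Os.
    physical_ios = io_ops * (1.0 - cache_hit_rate)
    return cpu_ms + physical_ios * ms_per_io + lock_wait_ms

# 20 ms of CPU, 50 logical reads at 8 ms each, 90% cache hits, 5 ms lock wait
print(response_time(cpu_ms=20, io_ops=50, ms_per_io=8,
                    cache_hit_rate=0.9, lock_wait_ms=5))  # ~65 ms
```

Note how the cache hit rate dominates: with these numbers, dropping it from 90% to 50% quadruples the I/O component.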
  8. 8. Response Times
  9. 9. Response Times
  10. 10. Response Times
  11. 11. Throughput <ul><li>How many transactions can be handled in some period of time </li></ul><ul><ul><li>Transactions/second or tpm, tph or tpd </li></ul></ul><ul><ul><li>A measure of overall capacity </li></ul></ul><ul><ul><li>Related to response time: at a fixed number of concurrent clients, throughput = clients / response time (Little’s law) </li></ul></ul><ul><li>Transaction Processing Performance Council </li></ul><ul><ul><li>Standard benchmarks for TP systems </li></ul></ul><ul><ul><li> </li></ul></ul><ul><ul><li>TPC-C models typical transaction system </li></ul></ul><ul><ul><ul><li>Current record is 4,092,799 tpmC (HP) </li></ul></ul></ul><ul><ul><li>TPC-E approved as TPC-C replacement (2/07) </li></ul></ul>
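The link between throughput and response time can be made precise with Little's law. A sketch with illustrative numbers:

```python
# Little's law: concurrency = throughput x response time, so at a
# fixed number of concurrent clients, throughput = clients / response time.

def throughput_tps(concurrent_users, response_time_s):
    """Transactions per second sustained at a given load."""
    return concurrent_users / response_time_s

tps = throughput_tps(concurrent_users=200, response_time_s=2.0)
print(tps, tps * 60)  # as tps and tpm
```

The same identity works in reverse: a benchmark quoting tpmC and a user count implies an average response time.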
  12. 12. Throughput <ul><li>Increases until resource saturation </li></ul><ul><ul><li>Start waiting for resources </li></ul></ul><ul><ul><ul><li>Processor, disk & network bandwidth </li></ul></ul></ul><ul><ul><ul><li>Increasing response time with load </li></ul></ul></ul><ul><ul><li>Slowly decreases with contention </li></ul></ul><ul><ul><ul><li>Overheads of sharing, interference </li></ul></ul></ul><ul><ul><li>Some resources share/overload badly </li></ul></ul><ul><ul><ul><li>Contention for shared locks </li></ul></ul></ul><ul><ul><ul><li>Ethernet network performance degrades </li></ul></ul></ul><ul><ul><ul><li>Disk degrades with sharing </li></ul></ul></ul>
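Why response time climbs as load approaches saturation can be seen in the simplest queueing model, M/M/1, where R = S / (1 − U) for service time S and utilisation U. This is a textbook approximation, not a claim about any particular system:

```python
# M/M/1 response time: waiting explodes as utilisation approaches 1.

def mm1_response_time(service_time_ms, utilisation):
    """Mean response time for an M/M/1 queue."""
    if utilisation >= 1.0:
        raise ValueError("saturated: utilisation must be < 1")
    return service_time_ms / (1.0 - utilisation)

# A 10 ms transaction takes 20 ms at 50% load but 200 ms at 95% load.
for u in (0.5, 0.8, 0.9, 0.95):
    print(u, mm1_response_time(10.0, u))
```

This is the knee in the throughput curve on the next slide: past roughly 80% utilisation, small load increases produce large response-time increases.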
  13. 13. Throughput
  14. 14. System Capacity? <ul><li>How many clients can you support? </li></ul><ul><ul><li>Name an acceptable response time </li></ul></ul><ul><ul><li>Average 95% under 2 secs is common </li></ul></ul><ul><ul><ul><li>And what is ‘average’? </li></ul></ul></ul><ul><ul><li>Plot response time vs # of clients </li></ul></ul><ul><li>Great if you can run benchmarks </li></ul><ul><ul><li>Reason for prototyping and proving proposed architectures before leaping into full-scale implementation </li></ul></ul>
  15. 15. System Capacity
  16. 16. Scaling Out <ul><li>More boxes at every level </li></ul><ul><ul><li>Web servers (handling user interface) </li></ul></ul><ul><ul><li>App servers (running business logic) </li></ul></ul><ul><ul><li>Database servers (perhaps… a bit tricky?) </li></ul></ul><ul><ul><li>Just add more boxes to handle more load </li></ul></ul><ul><li>Spread load out across boxes </li></ul><ul><ul><li>Load balancing at every level </li></ul></ul><ul><ul><li>Partitioning or replication for database? </li></ul></ul><ul><ul><li>Impact on application design? </li></ul></ul><ul><ul><li>Impact on system management </li></ul></ul><ul><ul><li>All have impacts on architecture & operations </li></ul></ul>
  17. 17. Scaling Out
  18. 18. ‘Load Balancing’ <ul><li>A few different but related meanings </li></ul><ul><ul><li>Distributing client bindings across servers or processes </li></ul></ul><ul><ul><ul><li>Needed for stateful systems </li></ul></ul></ul><ul><ul><ul><li>Static allocation of client to server </li></ul></ul></ul><ul><ul><li>Balancing requests across server systems or processes </li></ul></ul><ul><ul><ul><li>Dynamically allocating requests to servers </li></ul></ul></ul><ul><ul><ul><li>Normally only done for stateless systems </li></ul></ul></ul>
  19. 19. Static Load Balancing [Diagram: a server process advertises its service with a name server; a client requests a server reference, the name server returns one, and the client gets a server object reference and calls the server object’s methods directly. Load balancing is across application process instances within a server.]
  20. 20. Load Balancing in CORBA <ul><li>Client calls on name server to find the location of a suitable server </li></ul><ul><ul><li>CORBA terminology for object directory </li></ul></ul><ul><li>Name server can spread client objects across multiple servers </li></ul><ul><ul><li>Often ‘round robin’ </li></ul></ul><ul><li>Client is bound to server and stays bound forever </li></ul><ul><ul><li>Can lead to performance problems if server loads are unbalanced </li></ul></ul>
  21. 21. Name Servers <ul><li>Server processes call name server as part of their initialisation </li></ul><ul><ul><li>Advertising their services/objects </li></ul></ul><ul><li>Clients call name server to find the location of a server process/object </li></ul><ul><ul><li>Up to the name server to match clients to servers </li></ul></ul><ul><li>Client then directly calls server process to create or link to objects </li></ul><ul><ul><li>Client-object binding usually static </li></ul></ul>
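The advertise/resolve protocol on this slide can be sketched as a minimal in-memory name server that hands out references round-robin, CORBA-style. The class, service name, and host strings are all assumptions for illustration:

```python
from itertools import cycle

class NameServer:
    """Minimal name server: servers advertise, clients resolve."""

    def __init__(self):
        self._servers = {}   # service name -> list of server references
        self._cursors = {}   # service name -> round-robin iterator

    def advertise(self, service, server_ref):
        # Called by server processes during initialisation.
        self._servers.setdefault(service, []).append(server_ref)
        self._cursors[service] = cycle(self._servers[service])

    def resolve(self, service):
        # Called by clients; binding is static — the client keeps
        # whatever reference it is given, which can unbalance load.
        return next(self._cursors[service])

ns = NameServer()
ns.advertise("orders", "server-a:9001")
ns.advertise("orders", "server-b:9001")
print([ns.resolve("orders") for _ in range(4)])
```

Because the binding is static, a long-lived client that happens to land on a busy server stays there, which is exactly the imbalance risk the CORBA slide mentions.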
  22. 22. Dynamic Stateful? <ul><li>Dynamic load balancing with stateful servers/objects? </li></ul><ul><ul><li>Clients can throw away server objects and get new ones every now and again </li></ul></ul><ul><ul><ul><li>In application code or middleware </li></ul></ul></ul><ul><ul><ul><li>Have to save & restore state </li></ul></ul></ul><ul><ul><li>Or object replication in middleware </li></ul></ul><ul><ul><ul><li>Identical copies of objects on all servers </li></ul></ul></ul><ul><ul><ul><li>Replication of changes between servers </li></ul></ul></ul><ul><ul><ul><li>Clients have references to all copies </li></ul></ul></ul>
  23. 23. BEA WLS Load Balancing [Diagram: clients call into an EJB cluster of server instances running on Machines A and B, coordinated by a heartbeat over a multicast backbone, with a shared DBMS behind the cluster.]
  24. 24. Threaded Servers <ul><li>No need for load-balancing within a single system </li></ul><ul><ul><li>Multithreaded server process </li></ul></ul><ul><ul><ul><li>Thread pool servicing requests </li></ul></ul></ul><ul><ul><li>All objects live in a single process space </li></ul></ul><ul><ul><li>Any request can be picked up by any thread </li></ul></ul><ul><li>Used by modern app servers </li></ul>
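The thread-pool model on this slide can be sketched with Python's standard executor: all objects live in one process space, any worker thread can pick up any request, and no load balancing is needed inside the box. The counter and handler are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor
import threading

shared_counter = 0
lock = threading.Lock()  # a shared object space still needs locking

def handle_request(n):
    """Any pool thread can service any request against shared state."""
    global shared_counter
    with lock:
        shared_counter += n
    return n * 2

# A pool of 4 worker threads services 10 requests.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle_request, range(10)))

print(shared_counter, results[:3])
```

The lock is the catch: the shared object space that makes intra-process balancing unnecessary is also where contention appears as load grows.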
  25. 25. Threaded Servers [Diagram: multiple clients call into a single COM+ process containing a thread pool, a shared object space and the application code (App DLL); COM+ uses thread pools rather than load balancing within a single system.]
  26. 26. Dynamic Load Balancing <ul><li>Dynamically balance load across servers </li></ul><ul><ul><li>Requests from a client can go to any server </li></ul></ul><ul><li>Requests dynamically routed </li></ul><ul><ul><li>Often used for Web Server farms </li></ul></ul><ul><ul><li>IP sprayer (Cisco etc) </li></ul></ul><ul><ul><li>Network Load Balancer etc </li></ul></ul><ul><li>Routing decision has to be fast & reliable </li></ul><ul><ul><li>Routing in main processing path </li></ul></ul><ul><li>Applications normally stateless </li></ul>
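One common dynamic routing policy is "least connections": each stateless request goes to whichever server currently has the fewest in-flight requests. A sketch with hypothetical server names; real IP sprayers implement this in hardware or the kernel because, as the slide notes, the decision sits on the hot path:

```python
class Balancer:
    """Least-connections router for stateless requests."""

    def __init__(self, servers):
        self.inflight = {s: 0 for s in servers}

    def route(self):
        # Pick the least-loaded server; must be fast and reliable,
        # since every request passes through here.
        server = min(self.inflight, key=self.inflight.get)
        self.inflight[server] += 1
        return server

    def done(self, server):
        self.inflight[server] -= 1

lb = Balancer(["web1", "web2"])
a = lb.route()
b = lb.route()        # one request now in flight on each server
lb.done(a)            # first request completes
print(a, b, lb.route())  # next request goes back to the freed server
```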
  27. 27. Web Server Farms <ul><li>Web servers are highly scalable </li></ul><ul><ul><li>Web applications are normally stateless </li></ul></ul><ul><ul><ul><li>Next request can go to any Web server </li></ul></ul></ul><ul><ul><ul><li>State comes from client or database </li></ul></ul></ul><ul><ul><li>Just need to spread incoming requests </li></ul></ul><ul><ul><ul><li>IP sprayers (hardware, software) </li></ul></ul></ul><ul><ul><ul><li>Or >1 Web server looking at same IP address with some coordination </li></ul></ul></ul>
  28. 28. Clusters <ul><li>A group of independent computers acting like a single system </li></ul><ul><ul><li>Shared disks </li></ul></ul><ul><ul><li>Single IP address </li></ul></ul><ul><ul><li>Single set of services </li></ul></ul><ul><ul><li>Fail-over to other members of cluster </li></ul></ul><ul><ul><li>Load sharing within the cluster </li></ul></ul><ul><ul><li>DEC, IBM, MS, … </li></ul></ul>
  29. 29. Clusters [Diagram: client PCs connect to Servers A and B, each attached to shared disk cabinets A and B, with a heartbeat link between the servers for cluster management.]
  30. 30. Clusters <ul><li>Address scalability </li></ul><ul><ul><li>Add more boxes to the cluster </li></ul></ul><ul><ul><li>Replication or shared storage </li></ul></ul><ul><li>Address availability </li></ul><ul><ul><li>Fail-over </li></ul></ul><ul><ul><li>Add & remove boxes from the cluster for upgrades and maintenance </li></ul></ul><ul><li>Can be used as one element of a highly-available system </li></ul>
  31. 31. Scaling State Stores? <ul><li>Scaling stateless logic is easy </li></ul><ul><li>… but how are state stores scaled? </li></ul><ul><li>Bigger, faster box (if this helps at all) </li></ul><ul><ul><li>Could hit lock contention or I/O limits </li></ul></ul><ul><li>Replication </li></ul><ul><ul><li>Multiple copies of shared data </li></ul></ul><ul><ul><li>Apps access their own state stores </li></ul></ul><ul><ul><li>Change anywhere & send to everyone </li></ul></ul>
  32. 32. Scaling State Stores <ul><li>Partitioning </li></ul><ul><ul><li>Multiple servers, each looking after a part of the state store </li></ul></ul><ul><ul><ul><li>Separate customers A-M & N-Z </li></ul></ul></ul><ul><ul><ul><li>Split customers according to state </li></ul></ul></ul><ul><ul><li>Preferably transparent to apps </li></ul></ul><ul><ul><ul><li>e.g. SQL/Server partitioned views </li></ul></ul></ul><ul><li>Or combination of these approaches </li></ul>
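The A-M / N-Z split on this slide amounts to a range-partition routing table. A sketch; the partition bounds and server names are illustrative, and a real system (e.g. partitioned views) would hide this routing from the application:

```python
# Range partitioning: each server owns a slice of the customer key space.
PARTITIONS = [
    (("A", "M"), "db-server-1"),
    (("N", "Z"), "db-server-2"),
]

def server_for(customer_name):
    """Route a customer to the server owning its key range."""
    first = customer_name[0].upper()
    for (lo, hi), server in PARTITIONS:
        if lo <= first <= hi:
            return server
    raise KeyError(customer_name)

print(server_for("Greenfield"), server_for("Smith"))
```

The weakness of static ranges is skew: if most customers fall in A-M, one server carries most of the load, which is why splitting "according to state" or rebalancing matters.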
  33. 33. Scaling Out Summary [Diagram: UI tier = Web server farm (Network Load Balancing); business tier = application farm (Component Load Balancing); data tier = database servers (Cluster Services, partitioned into districts 1-10 and 11-20).]
  34. 34. Scale-up <ul><li>No need for load-balancing </li></ul><ul><ul><li>Just use a bigger box </li></ul></ul><ul><ul><li>Add processors, memory, …. </li></ul></ul><ul><ul><li>SMP (symmetric multiprocessing) </li></ul></ul><ul><ul><li>May not fix problem! </li></ul></ul><ul><li>Runs into limits eventually </li></ul><ul><li>Could be less available </li></ul><ul><ul><li>What happens on failures? Redundancy? </li></ul></ul><ul><li>Could be easier to manage </li></ul>
  35. 35. Scale-up <ul><li>eBay example </li></ul><ul><ul><li>Server farm of Windows boxes (scale-out) </li></ul></ul><ul><ul><li>Single database server (scale-up) </li></ul></ul><ul><ul><ul><li>64-processor SUN box (max at time) </li></ul></ul></ul><ul><ul><li>More capacity needed? </li></ul></ul><ul><ul><ul><li>Easily add more boxes to Web farm </li></ul></ul></ul><ul><ul><ul><li>Faster DB box? (not available) </li></ul></ul></ul><ul><ul><ul><ul><li>More processors? (not possible) </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Split DB load across multiple DB servers? </li></ul></ul></ul></ul><ul><ul><li>See eBay presentation… </li></ul></ul>
  36. 36. Available System [Diagram: Web clients, then a Web server farm load-balanced using WLB, then an app server farm using COM+ load balancing, with the database installed on a cluster for high availability.]
  37. 37. Availability <ul><li>How much? </li></ul><ul><ul><li>99% 87.6 hours a year </li></ul></ul><ul><ul><li>99.9% 8.76 hours a year </li></ul></ul><ul><ul><li>99.99% 0.876 hours a year </li></ul></ul><ul><li>Need to consider operations as well </li></ul><ul><ul><li>Not just faults and recovery time </li></ul></ul><ul><ul><li>Maintenance, software upgrades, backups, application changes </li></ul></ul>
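The downtime figures on this slide follow directly from the 8,760 hours in a year:

```python
# Downtime budget implied by an availability percentage.
HOURS_PER_YEAR = 24 * 365  # 8,760

def downtime_hours_per_year(availability_pct):
    return (100.0 - availability_pct) / 100.0 * HOURS_PER_YEAR

for a in (99.0, 99.9, 99.99):
    print(a, downtime_hours_per_year(a))
```

Each extra nine cuts the budget tenfold, and the slide's point is that the budget must cover planned maintenance and upgrades, not just faults.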
  38. 38. Availability <ul><li>Often a question of application design </li></ul><ul><ul><li>Stateful vs stateless </li></ul></ul><ul><ul><ul><li>What happens if a server fails? </li></ul></ul></ul><ul><ul><ul><li>Can requests go to any server? </li></ul></ul></ul><ul><ul><li>Synchronous method calls or asynchronous messaging? </li></ul></ul><ul><ul><ul><li>Reduce dependency between components </li></ul></ul></ul><ul><ul><ul><li>Failure tolerant designs </li></ul></ul></ul><ul><ul><li>And manageability decisions to consider </li></ul></ul>
  39. 39. Redundancy=Availability <ul><li>Passive or active standby systems </li></ul><ul><ul><li>Re-route requests on failure </li></ul></ul><ul><ul><li>Continuous service (almost) </li></ul></ul><ul><ul><ul><li>Recover failed system while alternative handles workload </li></ul></ul></ul><ul><ul><ul><li>May be some hand-over time (db recovery?) </li></ul></ul></ul><ul><ul><ul><li>Active standby & log shipping reduce this </li></ul></ul></ul><ul><ul><ul><ul><li>At the expense of 2x system cost… </li></ul></ul></ul></ul><ul><li>What happens to in-flight work? </li></ul><ul><ul><li>State recovers by aborting in-flight ops & doing db recovery but … </li></ul></ul>
  40. 40. Transaction Recovery <ul><li>Could be handled by middleware </li></ul><ul><ul><li>Persistent queues of accepted requests </li></ul></ul><ul><ul><li>Still a failure window though </li></ul></ul><ul><li>Large role for client apps/users </li></ul><ul><ul><li>Did the request get lost on failure? </li></ul></ul><ul><ul><li>Retry on error? </li></ul></ul><ul><li>Large role for server apps </li></ul><ul><ul><li>What to do with duplicate requests? </li></ul></ul><ul><ul><li>Try for idempotency (repeated txns OK) </li></ul></ul><ul><ul><li>Or track and reject duplicates </li></ul></ul>
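The track-and-reject-duplicates option can be sketched as a server that remembers request IDs it has already processed, so a client retry after a failure is safe. The in-memory store is an assumption for illustration; a real system would persist it atomically with the transaction:

```python
class Server:
    """Rejects duplicate requests by replaying the original result."""

    def __init__(self):
        self.seen = {}   # request id -> result of first execution

    def handle(self, request_id, amount, balance):
        if request_id in self.seen:
            # Duplicate (e.g. a client retry): replay, don't re-apply.
            return self.seen[request_id]
        balance += amount                # the non-idempotent work
        self.seen[request_id] = balance
        return balance

s = Server()
print(s.handle("req-1", 50, 100))  # first attempt
print(s.handle("req-1", 50, 100))  # retry after failure: same result
```

This turns a non-idempotent update into an effectively idempotent one, the alternative the slide offers when the operation itself cannot be made repeatable.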
  41. 41. Fragility <ul><li>Large, distributed, synchronous systems are not robust </li></ul><ul><ul><li>Many independent systems & links… </li></ul></ul><ul><ul><ul><li>Everything always has to be working </li></ul></ul></ul><ul><ul><li>Rationale for Asynchronous Messaging </li></ul></ul><ul><ul><ul><li>Loosen ‘coupling’ between components </li></ul></ul></ul><ul><ul><ul><li>Rely on guaranteed delivery instead </li></ul></ul></ul><ul><ul><ul><li>May just defer error handling though </li></ul></ul></ul><ul><ul><ul><ul><li>Could be much harder to handle later </li></ul></ul></ul></ul><ul><ul><ul><li>To be discussed next time… </li></ul></ul></ul>
  42. 42. Example