High Availability and Scalability: Too Expensive! Architectures for Future Enterprise Systems

High availability and scalability used to be solved in hardware, but that is quite expensive. This presentation shows how modern technologies like virtualization, cloud, NoSQL and new software architectures provide new and cheaper solutions that are probably also better than the traditional approaches.

Published in: Technology

  1. High Availability and Scalability: Too Expensive! Architectures for Future Enterprise Systems. Eberhard Wolff, Freelance Consultant / Trainer, Head, Technology Advisory Board, adesso AG. Eberhard Wolff - @ewolff
  2. The Dream (Photo: http://www.vaxman.de/)
  6. Where Are We?
  7. Non-functional Requirements
  8. Availability, Performance
  9. Availability, Performance
  10. Availability: Traditional Approach
  11. • Buy highly reliable hardware • Build a small cluster • 2 machines • Maybe add a stand-by data center
  12. • Eventually the system will fail • …and you are in real trouble
  13. True Story • “Machine rebooted overnight.” • “Several times.” • “No idea how often.” • “No idea why…”
  14. Let’s look at an example
  16. • Server fails • Application fails • No service to the customer • Can we do better?
  18. What You Have Just Seen
  19. • Failing systems do not impact the user • Failing systems are just restarted • Restarts happen automatically • Systems run in different data centers • e.g. eu-west-1a / b / c
  20. Diagram: an Elastic Load Balancer in front of three systems, one each in EU West 1a, EU West 1b and EU West 1c
  21. What It Takes… • Virtualization • + API to start new servers • Watchdog to detect failed servers • Redundant data centers if needed
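
The ingredients on slide 21 can be sketched in a few lines. This is only an illustration: `FakeCloud` is a hypothetical in-memory stand-in for "an API to start new servers", not a real cloud SDK, and the health flag is set by hand instead of by a real health check.

```python
import itertools


class FakeCloud:
    """Hypothetical stand-in for a virtualization API; not a real cloud SDK."""

    def __init__(self):
        self.healthy = {}   # server id -> did it pass the last health check?
        self.zone_of = {}   # server id -> data center / zone it runs in
        self._ids = itertools.count(1)

    def start_server(self, zone):
        server_id = f"{zone}-{next(self._ids)}"
        self.healthy[server_id] = True
        self.zone_of[server_id] = zone
        return server_id

    def terminate(self, server_id):
        self.healthy.pop(server_id)
        return self.zone_of.pop(server_id)


def watchdog_pass(cloud):
    """Replace each failed server with a fresh one in the same zone."""
    started = []
    for server_id, ok in list(cloud.healthy.items()):
        if not ok:
            zone = cloud.terminate(server_id)
            started.append(cloud.start_server(zone))
    return started
```

A real watchdog would run such a pass periodically; the point here is only that a failed server is thrown away and replaced rather than repaired.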
  22. Can be implemented in your data center! I have none, so I used the Amazon Cloud.
  23. Alternatives
  24. Hardware • As cheap as it gets • Not highly available • Availability in software
  25. Traditional Servers
  26. Traditional Servers
  27. Highly customized • Hard to reproduce
  28. • Depends on details • True story: • Order of patch installations matters
  29. Stateful
  30. Redundancy in Hardware
  31. Traditional Servers
  32. Phoenix Servers
  33. Easy to create a new server
  34. Reliably reproducible
  35. Stateless
  36. Stateless • No data is lost • New server can take load immediately
  37. Redundancy in Software
  38. Implementations • Might use a VM image • …or a PaaS • …or provisioning tools
  39. Provisioning Tools
  40. • Easy to create test environments • …with other software versions
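
The Phoenix server idea (slides 32 to 40) can be illustrated with a small sketch, assuming a server is described entirely by a declarative spec. `ServerSpec`, `provision` and the package names are hypothetical and do not correspond to any particular provisioning tool.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical declarative description of a server."""
    base_image: str
    packages: tuple
    app_version: str


def provision(spec):
    """Derive the full server state from the spec alone, never from manual changes."""
    return {
        "image": spec.base_image,
        "installed": sorted(spec.packages),
        "app": spec.app_version,
    }


# Illustrative values only.
production = ServerSpec("linux-base", ("jdk", "tomcat"), "1.4.0")
# A test environment is the same spec with another software version.
test_env = replace(production, app_version="1.5.0-rc1")
```

Because the result depends only on the spec, building the same server twice gives the same result, which is what makes it cheap to discard a server and create a new one.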
  41. Chaos Monkey • Tool by Netflix • Video streaming • #1 in Internet usage in the US
  42. Chaos Monkey • Kills random machines • To ensure the system survives hardware failures
  43. Would you rather rely on… • …highly available hardware • …or a Chaos Monkey tested system?
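
The Chaos Monkey approach can be mimicked in a toy test. This sketch is not Netflix's tool; `serve` and `chaos_kill` are hypothetical helpers that model a pool of redundant servers as a dictionary.

```python
import random


def serve(pool, request):
    """Route a request to any server that is still alive."""
    alive = [name for name, up in pool.items() if up]
    if not alive:
        raise RuntimeError("no server available")
    return f"{alive[0]} handled {request}"


def chaos_kill(pool, rng):
    """Kill one randomly chosen live server, in the spirit of Chaos Monkey."""
    victim = rng.choice([name for name, up in pool.items() if up])
    pool[victim] = False
    return victim
```

A test would call `chaos_kill` and then check that `serve` still answers, which is the property the slides ask for: a failing machine must not impact the user.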
  44. Resilience
  45. Availability, Performance
  46. Availability, Performance
  47. Performance: Traditional Approach
  48. • Estimate • #Users • Use cases • Data volume • Etc. • Add a little bit • Order servers
  49. Performance: Problems
  50. Problem: Estimate & Scaling • Performance hard to estimate • Coarse-grained scaling • Backfires
  51. True Story • Initial estimate wrong • Just need a little more • Cluster: two servers • Add one • About 50% higher costs • Ordering / installing a server takes time • Bad performance until the server is delivered
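
The "about 50% higher costs" figure on slide 51 follows from simple arithmetic, sketched here under the assumption that all servers cost the same. The function name is made up for this example.

```python
def relative_cost_increase(current_servers, added=1):
    """Fraction by which hardware cost grows, assuming identical servers."""
    return added / current_servers


two_server_cluster = relative_cost_increase(2)    # one more server on top of two
twenty_small_servers = relative_cost_increase(20)  # same step on many small servers
```

With two large machines the smallest possible step is half the existing budget; with many small machines the same step is a few percent, which is the coarse-grained versus fine-grained scaling contrast the slides draw.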
  52. Problem: Load Peak • Business has load peaks • e.g. events that people register for • Need to have enough hardware for load peaks • Costly
  53. Problem: Testing • Testing • Need production-like infrastructure • Prohibitive costs • Only needed during tests
  55. Diagram: an Elastic Load Balancer in front of four systems, one in EU West 1b and three in EU West 1c
  56. What You Have Just Seen • System tunes itself depending on load • Same approach as for availability • + Watchdog for load
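
The "watchdog for load" on slide 56 boils down to a scaling rule. A minimal sketch, assuming load is measured in requests per second and that each server has a known capacity; the function and parameter names are hypothetical.

```python
import math


def desired_servers(requests_per_s, capacity_per_server, minimum=2):
    """Number of servers needed for the current load, never below a safety minimum."""
    needed = math.ceil(requests_per_s / capacity_per_server)
    return max(minimum, needed)
```

The minimum keeps some redundancy for availability even when load is low, so the same mechanism serves both non-functional requirements.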
  57. Easy to create a new server ✔ • Reliably reproducible ✔ • Redundancy in Software ✔ • Stateless ?
  58. Stateless • Stateless web servers: best practice • Some Java frameworks don’t follow the approach • Can store HTTP session externally • e.g. RDBMS, NoSQL, cache
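
Storing the HTTP session externally can be sketched as follows. `ExternalSessionStore` is a hypothetical in-memory stand-in for the RDBMS, NoSQL database or cache mentioned on slide 58, and `WebServer` is a toy class, not a real framework.

```python
import copy


class ExternalSessionStore:
    """Stand-in for an RDBMS, NoSQL database or cache shared by all web servers."""

    def __init__(self):
        self._data = {}

    def load(self, session_id):
        return copy.deepcopy(self._data.get(session_id, {}))

    def save(self, session_id, session):
        self._data[session_id] = copy.deepcopy(session)


class WebServer:
    """Keeps no session state of its own, so any instance can serve any request."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.load(session_id)
        session.setdefault("cart", []).append(item)
        self.store.save(session_id, session)
        return session["cart"]
```

If one web server is killed between two requests, another instance picks up the same session from the store, so no data is lost and a new server can take load immediately.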
  59. What about Databases?
  60. Databases • Often assumed to be just “fast and scalable” • Large scale doable, e.g. Data Warehouse • Often use traditional approach • Cluster with two nodes • Highly available hardware
  61. Database: Problems • Availability • Highly available hardware • Performance • Limited scaling • Costly
  62. Databases • New approaches • Used by NoSQL databases • But also e.g. MySQL • …or in system architecture
  63. Databases • Replication • Read performance • Availability • Sharding • Spread data across servers • Write performance
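
Replication and sharding from slide 63 can be combined in a toy model. This is a sketch under simplifying assumptions: writes are copied to every replica synchronously, keys are placed by a hash modulo the shard count, and all names are hypothetical rather than any database's API.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick a shard from a stable hash of the key."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ReplicatedShard:
    """One shard whose data is copied to several replicas."""

    def __init__(self, replicas=3):
        self.replicas = [dict() for _ in range(replicas)]

    def write(self, key, value):
        for replica in self.replicas:   # simplification: synchronous copy
            replica[key] = value

    def read(self, key, replica):
        return self.replicas[replica % len(self.replicas)][key]


def put(cluster, key, value):
    cluster[shard_for(key, len(cluster))].write(key, value)


def get(cluster, key, replica=0):
    return cluster[shard_for(key, len(cluster))].read(key, replica)
```

Sharding spreads writes because each key touches only one shard; replication spreads reads and keeps data available because any replica of that shard can answer.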
  64. Scaling MongoDB (diagram): Shard 1 and Shard 2, each with Replica 1, Replica 2 and Replica 3
  65. Availability (diagram): Shard 1 and Shard 2, each with Replica 1, Replica 2 and Replica 3
  66. Scaling MongoDB (diagram): Shard 1, Shard 2 and Shard 3, each with Replica 1, Replica 2 and Replica 3
  67. Scaling MongoDB (diagram): Shard 1 and Shard 2, each with Replica 1, Replica 2 and Replica 3, plus a “?”
  68. Replicas & Shards • Easy to understand • But: coarse-grained scaling • Adding another shard means • Moving lots of data • Adding quite a few servers
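
How much data moves when a shard is added depends on the placement scheme. The sketch below assumes naive hash-modulo placement purely for illustration; it is not a description of how MongoDB or any other product places data, and the helper names are hypothetical.

```python
import hashlib


def shard_for(key, shard_count):
    """Naive placement: stable hash of the key modulo the number of shards."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def moved_fraction(keys, old_count, new_count):
    """Share of keys whose shard changes when the shard count changes."""
    moved = sum(shard_for(k, old_count) != shard_for(k, new_count) for k in keys)
    return moved / len(keys)


keys = [f"user-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_count=2, new_count=3)
```

Under this assumption roughly two thirds of the keys change shards when going from two shards to three, and the new shard also needs its own set of replicas, which matches the "moving lots of data" and "quite a few servers" points on slide 68.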
  69. Amazon Dynamo Model (diagram): Servers A, B, C and D; each of Shard1 to Shard4 appears three times across the servers
  70. Amazon Dynamo Model (diagram): Servers A, B, C and D; each of Shard1 to Shard4 appears three times across the servers
  71. Amazon Dynamo Model (diagram): Servers A, B, C and D plus a New Server; each of Shard1 to Shard4 appears three times across the servers
  72. Amazon Dynamo Model • Published in the Dynamo paper • Implementations: Riak, Cassandra etc. • Fine-grained scaling • Can immediately write to new node
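
The fine-grained scaling of the Dynamo model rests on consistent hashing. A minimal sketch, assuming a hash ring with virtual nodes and leaving out replication; `HashRing` and the server name "E" for the new server are hypothetical, and real implementations such as Riak or Cassandra differ in detail.

```python
import bisect
import hashlib


def _hash(value):
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hashing: each server owns many small slices of the key space."""

    def __init__(self, servers, vnodes=64):
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def owner(self, key):
        # The first ring point clockwise from the key's hash owns the key.
        index = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[index][1]


old_ring = HashRing(["A", "B", "C", "D"])
new_ring = HashRing(["A", "B", "C", "D", "E"])
```

Adding server "E" only takes over slices from the existing servers: keys either stay where they were or move to the new server, so a single node can be added without reshuffling the whole data set.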
  73. Hardware • Not highly reliable • Scales by distributing load across servers • No NAS, SAN, RAID… • As cheap as it gets
  74. Sum Up • Virtualization + • Phoenix server • = Better availability • = Better performance • = Lower costs • Stateless servers • NoSQL
  75. Thank You!
