High Availability and Scalability: Too Expensive! Architectures for Future Enterprise Systems

High availability and scalability used to be solved in hardware, but that is quite expensive. This presentation shows how modern technologies like virtualization, cloud, NoSQL and new software architectures provide new and cheaper solutions that are probably even better than the traditional approaches.

Transcript

  • 1. High Availability and Scalability: Too Expensive! Architectures for Future Enterprise Systems. Eberhard Wolff, Freelance Consultant / Trainer, Head Technology Advisory Board adesso AG. Eberhard Wolff - @ewolff
  • 2. The Dream (Photo: http://www.vaxman.de/)
  • 6. Where Are We?
  • 7. Non-functional Requirements
  • 8. Availability, Performance
  • 9. Availability, Performance
  • 10. Availability: Traditional Approach
  • 11.
      • Buy highly reliable hardware
      • Build a small cluster
      • 2 machines
      • Maybe add a stand-by data center
  • 12.
      • Eventually the system will fail
      • …and you are in real trouble
  • 13. True Story
      • “Machine rebooted over night.”
      • “Several times.”
      • “No idea how often.”
      • “No idea why…”
  • 14. Let’s look at an example
  • 16.
      • Server fails
      • Application fails
      • No service to the customer
      • Can we do better?
  • 18. What You Have Just Seen
  • 19.
      • Failing systems do not impact the user
      • Failing systems are just restarted
      • Restarts happen automatically
      • Systems run in different data centers
      • e.g. eu-west-1a / b / c
  • 20. [Diagram: Elastic Load Balancer; System EU West 1a, System EU West 1b, System EU West 1c]
  • 21. What It Takes…
      • Virtualization
      • + API to start new servers
      • Watchdog to detect failed servers
      • Redundant data centers if needed
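
The ingredients on this slide can be sketched in a few lines. The following is only an illustration: `FakeCloud`, `start_server`, `terminate` and `watchdog_pass` are hypothetical names for an in-memory stand-in, not the API of Amazon or of any virtualization product. It assumes one server per zone is the desired state.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeCloud:
    """Hypothetical in-memory stand-in for an API that starts and stops servers."""
    zones: tuple = ("eu-west-1a", "eu-west-1b", "eu-west-1c")
    servers: dict = field(default_factory=dict)  # id -> {"zone": ..., "healthy": ...}
    _ids: count = field(default_factory=count)

    def start_server(self, zone):
        server_id = f"srv-{next(self._ids)}"
        self.servers[server_id] = {"zone": zone, "healthy": True}
        return server_id

    def terminate(self, server_id):
        del self.servers[server_id]


def watchdog_pass(cloud, desired_per_zone=1):
    """One watchdog cycle: drop failed servers, then top up every zone."""
    replaced = []
    for server_id, info in list(cloud.servers.items()):
        if not info["healthy"]:
            cloud.terminate(server_id)
            replaced.append(server_id)
    for zone in cloud.zones:
        running = sum(1 for s in cloud.servers.values() if s["zone"] == zone)
        # range() of a non-positive number is empty, so full zones are skipped.
        for _ in range(desired_per_zone - running):
            cloud.start_server(zone)
    return replaced


cloud = FakeCloud()
watchdog_pass(cloud)                       # initial fill: one server per zone
cloud.servers["srv-1"]["healthy"] = False  # simulate a failed server
print(watchdog_pass(cloud))                # the failed server is replaced
```

A real watchdog would run this cycle periodically and use an actual health check instead of a flag.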
  • 22. Can be implemented in your data center! I have none, so I used the Amazon Cloud.
  • 23. Alternatives
  • 24. Hardware
      • As cheap as it gets
      • Not highly available
      • Availability in software
  • 25. Traditional Servers
  • 26. Traditional Servers
  • 27. Highly customized, hard to reproduce
  • 28.
      • Depends on details
      • True story:
      • Order of patch installations matters
  • 29. Stateful
  • 30. Redundancy in Hardware
  • 31. Traditional Servers
  • 32. Phoenix Servers
  • 33. Easy to create a new server
  • 34. Reliably reproducible
  • 35. Stateless
  • 36. Stateless
      • No data is lost
      • New server can take load immediately
  • 37. Redundancy in Software
  • 38. Implementations
      • Might use a VM image
      • …or a PaaS
      • …or provisioning tools
  • 39. Provisioning Tools
  • 40.
      • Easy to create test environments
      • …with other software versions
  • 41. Chaos Monkey
      • Tool by Netflix
      • Video streaming
      • #1 in Internet usage in the US
  • 42. Chaos Monkey
      • Kill random machines
      • To ensure the system survives hardware failures
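
The idea of killing random machines can be tried out on a toy model. This sketch does not use the Netflix tool; `Pool` and `chaos_round` are hypothetical names, and the pool only simulates stateless servers behind a load balancer together with a watchdog that refills it.

```python
import random


class Pool:
    """Hypothetical pool of stateless servers behind a load balancer."""

    def __init__(self, size):
        self.desired = size
        self.alive = {f"srv-{i}" for i in range(size)}
        self._next_id = size

    def handle_request(self):
        # Any live server can answer because the servers hold no state.
        if not self.alive:
            raise RuntimeError("no service to the customer")
        return "ok"

    def heal(self):
        # Watchdog step: start replacements until the pool is full again.
        while len(self.alive) < self.desired:
            self.alive.add(f"srv-{self._next_id}")
            self._next_id += 1


def chaos_round(pool, rng):
    """Kill one random server, check that service continues, then heal."""
    victim = rng.choice(sorted(pool.alive))
    pool.alive.remove(victim)
    response = pool.handle_request()
    pool.heal()
    return victim, response


rng = random.Random(42)  # fixed seed so runs are repeatable
pool = Pool(3)
for _ in range(10):
    victim, response = chaos_round(pool, rng)
    assert response == "ok"
```

With a pool of one server the same round raises an error, which is the situation from the earlier example where a single server fails.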
  • 43. Would you rather rely on… …highly available hardware …or a Chaos Monkey tested system?
  • 44. Resilience
  • 45. Availability, Performance
  • 46. Availability, Performance
  • 47. Performance: Traditional Approach
  • 48.
      • Estimate
      • #Users
      • Use Cases
      • Data volume
      • Etc.
      • Add a little bit
      • Order servers
  • 49. Performance: Problems
  • 50. Problem: Estimate & Scaling
      • Performance hard to estimate
      • Coarse grained scaling
      • Backfires
  • 51. True Story
      • Initial estimate wrong
      • Just need a little more
      • Cluster: two servers
      • Add one
      • About 50% higher costs
      • Order / install server takes time
      • Bad performance until server delivered
  • 52. Problem: Load Peak
      • Business has load peaks
      • e.g. events that people register for
      • Need to have enough hardware for load peaks
      • Costly
  • 53. Problem: Testing
      • Testing
      • Need production-like infrastructure
      • Prohibitive costs
      • Only needed during tests
  • 55. [Diagram: Elastic Load Balancer; System EU West 1b; System EU West 1c (three instances)]
  • 56. What You Have Just Seen
      • System tunes itself depending on load
      • Same approach as for availability
      • + Watchdog for load
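
A watchdog for load mostly needs a rule that turns a measurement into a server count. The sketch below is one possible rule, not the policy used in the demo: the function names, the per-server capacity and the minimum and maximum limits are all assumptions made for illustration.

```python
import math


def desired_servers(requests_per_second, capacity_per_server, minimum=2, maximum=10):
    """Return how many servers should run for the measured load."""
    needed = math.ceil(requests_per_second / capacity_per_server)
    # Never go below the minimum (availability) or above the maximum (cost).
    return max(minimum, min(maximum, needed))


def scaling_action(running, requests_per_second, capacity_per_server):
    """Positive result: start that many servers. Negative: stop that many."""
    return desired_servers(requests_per_second, capacity_per_server) - running


print(scaling_action(2, 450, 100))  # load peak: start 3 more servers
print(scaling_action(5, 50, 100))   # peak is over: stop 3 servers
```

Because servers are added and removed one at a time, capacity follows the load peak instead of being bought for it in advance.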
  • 57.
      • Easy to create a new server ✔
      • Redundancy in software ✔
      • Reliably reproducible ✔
      • Stateless ?
  • 58. Stateless
      • Stateless web servers: best practice
      • Some Java frameworks don’t follow the approach
      • Can store HTTP session externally
      • e.g. RDBMS, NoSQL, cache
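
Storing the HTTP session externally can be sketched as follows. `ExternalSessionStore` and `StatelessWebServer` are hypothetical names; the store is an in-memory dictionary standing in for an RDBMS, a NoSQL database or a cache, and the shopping cart is just an example of session data.

```python
import copy


class ExternalSessionStore:
    """Stand-in for an external store such as an RDBMS, NoSQL database or cache."""

    def __init__(self):
        self._sessions = {}

    def load(self, session_id):
        # Deep copies mimic the serialization boundary of a real store.
        return copy.deepcopy(self._sessions.get(session_id, {}))

    def save(self, session_id, session):
        self._sessions[session_id] = copy.deepcopy(session)


class StatelessWebServer:
    """Keeps no session data between requests; everything lives in the store."""

    def __init__(self, store):
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.load(session_id)
        session.setdefault("cart", []).append(item)
        self.store.save(session_id, session)
        return session["cart"]


store = ExternalSessionStore()
server_a = StatelessWebServer(store)
server_b = StatelessWebServer(store)

server_a.add_to_cart("session-1", "book")
del server_a                                      # the first server fails
print(server_b.add_to_cart("session-1", "pen"))   # the cart survives on another server
```

Any server can handle the next request of a user, so a replacement server can take load immediately.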
  • 59. What about Databases?
  • 60. Databases
      • Often assumed to be just “fast and scalable”
      • Large scale doable, e.g. Data Warehouse
      • Often use traditional approach
      • Cluster with two nodes
      • Highly available hardware
  • 61. Database: Problems
      • Availability
      • Highly available hardware
      • Performance
      • Limited scaling
      • Costly
  • 62. Databases
      • New approaches
      • Used by NoSQL databases
      • But also e.g. MySQL
      • …or in system architecture
  • 63. Databases
      • Replication
      • Read performance
      • Availability
      • Sharding
      • Spread data across servers
      • Write performance
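
How replication and sharding divide the work can be shown with a toy cluster. This is a simplified sketch and not how MongoDB or MySQL implement it: `ShardedCluster` is a hypothetical class, each replica is a plain dictionary, writes are copied synchronously to all replicas of a shard, and keys are placed by a hash modulo the number of shards.

```python
import hashlib
import itertools


class ShardedCluster:
    """Toy model: every shard is a list of replicas, every replica a dict."""

    def __init__(self, shards, replicas_per_shard):
        self.replicas = [
            [{} for _ in range(replicas_per_shard)] for _ in range(shards)
        ]
        self._read_counter = itertools.count()

    def _shard_for(self, key):
        # Sharding: a stable hash decides which shard stores the key.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.replicas)

    def write(self, key, value):
        # Each write touches one shard only, so shards share the write load.
        for replica in self.replicas[self._shard_for(key)]:
            replica[key] = value

    def read(self, key):
        # Replication: reads rotate across the replicas of the shard.
        shard = self.replicas[self._shard_for(key)]
        replica = shard[next(self._read_counter) % len(shard)]
        return replica.get(key)


cluster = ShardedCluster(shards=2, replicas_per_shard=3)
for i in range(100):
    cluster.write(f"user-{i}", i)
print(cluster.read("user-42"))
```

If one replica of a shard is lost, the remaining replicas still hold the data, which is where the availability comes from.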
  • 64. Scaling MongoDB [Diagram: Shard 1 and Shard 2, each with Replica 1, Replica 2, Replica 3]
  • 65. Availability [Diagram: Shard 1 and Shard 2, each with Replica 1, Replica 2, Replica 3]
  • 66. Scaling MongoDB [Diagram: Shard 1, Shard 2 and Shard 3, each with Replica 1, Replica 2, Replica 3]
  • 67. Scaling MongoDB [Diagram: Shard 1 and Shard 2, each with Replica 1, Replica 2, Replica 3, plus a “?”]
  • 68. Replicas & Shards
      • Easy to understand
      • But: Coarse grained scaling
      • Adding another shard means
      • Moving lots of data
      • Add quite some servers
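
The cost of one more shard is simple arithmetic. The sketch below assumes that data is spread evenly before and after the change and that every shard has the same number of replicas; `cost_of_adding_shard` is a hypothetical helper, and the share of data it returns is a lower bound.

```python
from fractions import Fraction


def cost_of_adding_shard(shards, replicas_per_shard):
    """Return (minimum share of data to move, extra servers, growth in servers)."""
    # The new shard must end up with 1 / (shards + 1) of all data.
    data_to_move = Fraction(1, shards + 1)
    # A new shard needs a full set of replicas.
    extra_servers = replicas_per_shard
    growth = Fraction(extra_servers, shards * replicas_per_shard)
    return data_to_move, extra_servers, growth


# Two shards with three replicas each, growing to three shards.
print(cost_of_adding_shard(2, 3))  # (Fraction(1, 3), 3, Fraction(1, 2))
```

For this example at least a third of the data moves and the server count grows by half in a single step, which is what coarse grained scaling means here.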
  • 69. Amazon Dynamo Model [Diagram: Servers A, B, C and D; each of Shard1 to Shard4 appears three times across the servers]
  • 70. Amazon Dynamo Model [Diagram: Servers A, B, C and D; each of Shard1 to Shard4 appears three times across the servers]
  • 71. Amazon Dynamo Model [Diagram: Servers A, B, C and D plus a New Server; each of Shard1 to Shard4 appears three times across the servers]
  • 72. Amazon Dynamo Model
      • Published in the Dynamo paper
      • Implementations: Riak, Cassandra etc.
      • Fine grained scaling
      • Can immediately write to new node
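
The Dynamo paper builds on consistent hashing, and a small ring shows why scaling becomes fine grained. This is a minimal sketch, not the implementation in Riak or Cassandra: `HashRing`, the MD5-based tokens and the 64 virtual nodes per server are assumptions, and replication to several servers is left out.

```python
import bisect
import hashlib


def _token(value):
    """Map a string to a position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hashing with virtual nodes (single owner per key)."""

    def __init__(self, virtual_nodes=64):
        self.virtual_nodes = virtual_nodes
        self._ring = []  # sorted list of (token, server)

    def add_server(self, server):
        # Each server takes many small slices of the ring.
        for i in range(self.virtual_nodes):
            bisect.insort(self._ring, (_token(f"{server}#{i}"), server))

    def owner(self, key):
        # The first virtual node clockwise from the key owns it.
        index = bisect.bisect_right(self._ring, (_token(key), ""))
        return self._ring[index % len(self._ring)][1]


ring = HashRing()
for name in ("A", "B", "C", "D"):
    ring.add_server(name)

keys = [f"key-{i}" for i in range(1000)]
before = {key: ring.owner(key) for key in keys}
ring.add_server("New")
moved = [key for key in keys if ring.owner(key) != before[key]]

# Only keys taken over by the new server change their owner.
assert all(ring.owner(key) == "New" for key in moved)
print(f"{len(moved)} of {len(keys)} keys moved")
```

A single server can be added at a time, and data only moves from the existing servers to the new one, never between the existing servers.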
  • 73. Hardware
      • Not highly reliable
      • Scales by distributing load across servers
      • No NAS, SAN, RAID…
      • As cheap as it gets
  • 74. Sum Up
      • Virtualization
      • + Phoenix server
      • = Better availability
      • = Better performance
      • = Lower costs
      • Stateless servers
      • NoSQL
  • 75. Thank You!