Distributed Systems: scalability and high availability


Published on

QconSP 2010

Published in: Technology

Distributed Systems: scalability and high availability

  1. 1. Distributed Systems scalability and high availability Renato Lucindo - lucindo.github.com - @rlucindo
  2. 2. Renato Lucindo Call me Lucindo (or Linus) 2002 - Bachelor Computer Science 2007 - M.Sc. Computer Science (Combinatorial Optimization) 7+ year developing Distributed Systems My default answer: "I don't know."
  3. 3. Agenda Scalability High Availability Problems Tips and Tricks Learning More
  4. 4. Distributed Systems Multiple computers that interact with each other over a network to achieve a common goal Purpose Scalability High availability source: http://www.cnds.jhu.edu/
  5. 5. Scalability System ability to handle gracefully a growing amount of work Scale up (vertical) Add resources to a single node Improve existing code to handle more work Scale out (horizontal) Add more nodes to a system Linear (or better) scalability
  6. 6. Scalability - Vertical Add: CPU, Memory, Disks (bigger box) Handling more simultaneous: Connections Operations Users Choose a good I/O and concurrency model Non-blocking I/O Asynchronous I/O Threads (single, pool, per-connection) Event handling patterns (Reactor, Proactor, ...) Memory model? STM
  7. 7. Scalability - Vertical Careful with numbers Requests per second # of Connections Simultaneous operations Event handling Think front-end Slow connections/clients It's slower than other options In doubt, go async Back-end Thread pool (thread per-connection) No events Process per-core
  8. 8. Scalability - Horizontal Add nodes to handle more work Front-end Straightforward Stateless Back-end Master/Slave(s) Partitioning DHT Volatile Index
  9. 9. Scalability - Horizontal Master/Slave Write on single Master Read on Slaves (one or more) Scales reads
  10. 10. Scalability - Horizontal Partitioning (Sharding) Distribute dada across nodes Generally involves data de-normalization Where is some specific data? Master Index Hash (DTH, Consistent Hashing) Volatile Index Joins done in application level NoSQL friendly
  11. 11. Scalability - Horizontal Volatile Index: build and maintain data index as cached information (all clients)
  12. 12. High Availability "Processes, as well as people, die" Handle hardware and software failures Eliminate single point of failure Redundancy Failover Replicas
  13. 13. High Availability - Failover/Redundancy
  14. 14. High Availability - Replicas Two or more copies of same data Replica granularity From node replica to "row" replica Load balancing Write concurrency Replica updates Key for high availability and root of several problems
  15. 15. Problems
  16. 16. Problems - CAP Theorem
  17. 17. Problems - CAP Theorem Consistency: all operations (reads/writes) yield a global consistent state Availability: all requests (on non-failed servers) must have a response Partition Tolerance: nodes may not be able to communicate with each other. Pick Two
  18. 18. Problems - CAP Theorem C + A: network problems might stop the system Examples: Oracle RAC, IBM DB2 Parallel RDBMS (Master/Slave) Google File System HDFS (Hadoop)
  19. 19. Problems - CAP Theorem C + P: clients can't always perform operations Examples: Distributed lock-systems: Chubby, ZooKeeper Paxos protocol (consensus) BigTable, Hbase Hypertable MongoDB
  20. 20. Problems - CAP Theorem A + P: clients may read inconsistent (old or undone) data Examples: Amazon Dynamo Cassandra Voldemort CouchDB Riak Caches
  21. 21. Problem with CAP Theorem In practice, C + A and C + P systems are the same. C + A: not tolerant of network partitions C + P: not available when a network partition occurs Big problem: network partition Not so big (how often does it happens?) Pick two Availability Consistency The forgotten: Latency Or, how long the system waits before considering a partitioned network?
  22. 22. Problems - Real World Every component may fail: Network failure Hardware failure Electricity Natural disasters Code failure
  23. 23. Tips & Tricks
  24. 24. Tips & Tricks - Pyramid Capacity (connections, operations, ...) Pyramid
  25. 25. Tips & Tricks - Reply Fast FAIL Fast Break complex requests into smaller ones Use timeouts No transactions Be aware that a single slow operation or component can generate contention Self-denial attack
  26. 26. Tips & Tricks - Cache Cache: component location, data, dns lookups, previous requests, etc Use negative cache for failed requests (low expiration) Don't rely on cache Your system must work with no cache
  27. 27. Tips & Tricks - Queues Easy way to add asynchronous processing an decouple your system.
  28. 28. Tips & Tricks - DNS
  29. 29. Tips & Tricks - Logs Log everything Use several log levels On every log message User Request host Component involved Version Filename and line If log level not enabled do not process log message Avoid lookup calls (gettimeofday)
  30. 30. Tips & Tricks - Domino Effect Make sure your load balancer won't overload components User smart algorithms Load Balance Resource Allocation
  31. 31. Tips & Tricks - (Zero) Configuration No configuration files Use good defaults Auto-discovery (multicast, gossip, ...) Make everything configurable Administrative command No need to stop for changes Automatic self adjusts when possible
  32. 32. Tips & Tricks - STOP Test With your system under load: kill -STOP <component>
  33. 33. Tips & Tricks - Know your tools load average (uptime) stats tools vmstat iostat mpstat tcpstat, tcprstat, etc tcpdump, nc, netstat tunning /proc/net/* ulimit sysctl oprofile debuging tools (gdb, valgrind) ...
  34. 34. Tips & Tricks - Count Count everything Connections Operations Failures Successes Request times (granularity) Total, average, standard deviation Monitor counters
  35. 35. Tips & Tricks - Stability Patterns Use Timeouts Circuit Breaker Bulkheads Steady State Fail Fast Handshaking Test Harness Decoupling Middleware
  36. 36. Tips & Tricks - Don't Panic!
  37. 37. Learning More - Books TCP/IP Illustrated, Vol. 1: The Protocols
  38. 38. Learning More - Books Unix Network Programming, Vol. 1: The Sockets Networking
  39. 39. Learning More - Books Pattern Oriented Software Architecture, Vol. 2
  40. 40. Learning More - Books Release It!
  41. 41. Learning More - Papers The Google File System Bigtable: A Distributed Storage System for Structured Data Dynamo: Amazon's Highly Available Key-Value Store PNUTS: Yahoo!’s Hosted Data Serving Platform MapReduce: Simplified Data Processing on Large Clusters Towards robust distributed systems Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services BASE: An Acid Alternative Looking up data in P2P systems
  42. 42. Thanks!!! Questions? lucindo.github.com - @rlucindo