Writing Scalable Software in Java

Writing Scalable Software in Java - From multi-core to grid-computing


1. Writing Scalable Software in Java - From multi-core to grid-computing
2. Me
   • Ruben Badaró
   • Dev Expert at Changingworlds/Amdocs
   • PT.JUG Leader
   • http://www.zonaj.org
3. What this talk is not about
   • A sales pitch
   • Cloud Computing
   • Service-Oriented Architectures
   • Java EE
   • How to write multi-threaded code
4. Summary
   • Define Performance and Scalability
   • Vertical Scalability - scaling up
   • Horizontal Scalability - scaling out
   • Q&A
5. Performance != Scalability
6. Performance: the amount of useful work accomplished by a computer system compared to the time and resources used.
7. Scalability: the capability of a system to increase the amount of useful work as resources and load are added to the system.
8. Scalability
   • A system that performs fast with 10 users might not do so with 1000 - it doesn't scale
   • Designing for scalability always decreases performance
9. Linear Scalability [chart: throughput grows in direct proportion to resources]
10. Reality is sub-linear [chart: actual throughput falls away from the linear ideal as resources are added]
11. Amdahl's Law
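
The formula itself did not survive the transcript, so for reference: Amdahl's Law bounds the speedup S achievable when a fraction P of a program can be parallelized across N processors.

\[ S(N) = \frac{1}{(1 - P) + \frac{P}{N}} \]

As N grows, S(N) approaches 1/(1 - P): a program that is 95% parallel can never run more than 20x faster, no matter how many cores are added. This is the serial-part limit the next slide refers to.
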
12. Scalability is about parallelizing
   • Parallel decomposition allows division of work
   • Parallelizing might mean more total work
   • There's almost always a serial part of the computation
13. Vertical Scalability
14. Vertical Scalability: somewhat hard
15. Vertical Scalability - Scale Up
   • Bigger, meaner machines
     - More cores (and more powerful ones)
     - More memory
     - Faster local storage
   • Limited
     - Technical constraints
     - Cost: big machines get exponentially more expensive
16. Shared State
   • Need to use those cores
   • Java uses shared-state concurrency
     - Mutable state protected with locks
     - Hard to get right
     - Most developers don't have experience writing multithreaded code
17. This is what it looks like:

   public static synchronized SomeObject getInstance() {
       return instance;
   }

   public SomeObject doConcurrentThingy() {
       synchronized (this) {
           // ...
       }
       return ...;
   }
18. Single vs Multi-threaded
   • Single-threaded
     - No scheduling cost
     - No synchronization cost
   • Multi-threaded
     - Context switching (high cost)
     - Memory synchronization (memory barriers)
     - Blocking
19. Lock Contention - Little's Law
   The average number of customers in a stable system is equal to their average arrival rate multiplied by their average time in the system.
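
In symbols (added here for reference), with L the average number of items in the system, λ their arrival rate and W the average time each spends in the system:

\[ L = \lambda W \]

Applied to lock contention: the average number of threads queued on a lock equals the rate at which threads request it times the average time each one holds or waits for it, so halving lock hold time halves the queue.
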
20. Reducing Contention
   • Reduce lock duration
   • Reduce the frequency with which locks are requested (lock striping)
   • Replace exclusive locks with other mechanisms
     - Concurrent Collections
     - ReadWriteLocks
     - Atomic Variables
     - Immutable Objects
21. Concurrent Collections
   • Use lock striping
   • Include atomic putIfAbsent() and replace() methods (see the sketch below)
   • ConcurrentHashMap has 16 separate locks by default
   • Don't reinvent the wheel
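
A minimal sketch of those atomic check-then-act methods (mine, not from the deck; the class and names are illustrative):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class ProfileCache {
        private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

        // Atomically insert only if no value is present - no external lock needed.
        public String cacheIfAbsent(String userId, String profile) {
            String previous = cache.putIfAbsent(userId, profile);
            return previous != null ? previous : profile;
        }

        // Atomically replace only if the current value is the expected one:
        // the map-level analogue of compare-and-swap.
        public boolean update(String userId, String oldProfile, String newProfile) {
            return cache.replace(userId, oldProfile, newProfile);
        }
    }
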
22. ReadWriteLocks
   • A pair of locks
   • The read lock can be held by multiple threads if there are no writers
   • The write lock is exclusive
   • Good improvement if the object has few writers (sketch below)
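
A sketch of guarding a read-mostly structure with ReentrantReadWriteLock (illustrative names, not from the deck):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    public class Settings {
        private final ReadWriteLock lock = new ReentrantReadWriteLock();
        private final Map<String, String> values = new HashMap<>();

        public String get(String key) {
            lock.readLock().lock();   // shared: many readers may hold it at once
            try {
                return values.get(key);
            } finally {
                lock.readLock().unlock();
            }
        }

        public void set(String key, String value) {
            lock.writeLock().lock();  // exclusive: blocks both readers and writers
            try {
                values.put(key, value);
            } finally {
                lock.writeLock().unlock();
            }
        }
    }
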
23. Atomic Variables
   • Allow check-then-act operations to be performed atomically (sketch below)
   • Without locks - they use low-level CPU instructions (compare-and-swap)
   • volatile on steroids (visibility + atomicity)
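
A sketch of lock-free check-then-act with an atomic variable (illustrative; the retry loop is the standard compare-and-swap idiom):

    import java.util.concurrent.atomic.AtomicLong;

    public class HitCounter {
        private final AtomicLong hits = new AtomicLong();

        // Atomic read-modify-write, no lock involved.
        public long recordHit() {
            return hits.incrementAndGet();
        }

        // Explicit check-then-act: retry until the compare-and-swap wins.
        public void resetIfAbove(long threshold) {
            long current;
            do {
                current = hits.get();
                if (current <= threshold) {
                    return;            // check failed: nothing to do
                }
            } while (!hits.compareAndSet(current, 0));
        }
    }
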
24. Immutable Objects
   • Immutability makes concurrency simple - thread-safety is guaranteed
   • An immutable object (example below):
     - is declared final
     - has private final fields
     - is constructed completely by its constructor
     - has no state-changing methods
     - copies internal mutable objects when receiving or returning them
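
A sketch of a class following those rules (illustrative, not from the deck):

    import java.util.Date;

    public final class Trade {                      // final: cannot be subclassed
        private final String symbol;                // fields private and final
        private final Date timestamp;               // Date is mutable...

        public Trade(String symbol, Date timestamp) {
            this.symbol = symbol;
            this.timestamp = new Date(timestamp.getTime());  // ...so copy it in
        }

        public String getSymbol() {
            return symbol;
        }

        public Date getTimestamp() {
            return new Date(timestamp.getTime());   // ...and copy it out
        }

        // No setters: state can never change after construction.
    }
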
25. JVM issues
   • Caching is useful - storing stuff in memory
   • A larger JVM heap means longer garbage collection times
   • Long pauses are not acceptable
   • Solutions
     - Cap the heap size at 2 GB/4 GB
     - Run multiple JVMs per machine
     - Use better garbage collectors: G1 might help
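
As a command-line illustration (mine, not from the deck; exact flags vary by JVM version, and MyServer is a placeholder class), capping the heap and opting in to the then-experimental G1 collector on a Java 6 HotSpot VM:

    # cap the heap at 2 GB and enable the experimental G1 collector
    java -Xmx2g -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC MyServer
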
26. Scaling Up: Other Approaches
   • Change the paradigm
     - Actors (Erlang and Scala)
     - Dataflow programming (GParallelizer)
     - Software Transactional Memory (Pastrami)
     - Functional languages, such as Clojure
27. Scaling Up: Other Approaches
   • Dedicated JVM-friendly hardware
     - Azul Systems is amazing
     - Hundreds of cores
     - Enormous heap sizes with negligible GC pauses
     - Hardware transactional memory (HTM) included
     - Built-in lock elision mechanism
28. Horizontal Scalability
29. Horizontal Scalability: the hard part
30. Horizontal Scalability - Scale Out
   • Big machines are expensive - 1 x 32-core machine normally costs much more than 4 x 8-core machines
   • Increase throughput by adding more machines
   • Distributed Systems research revisited - this is not new
31. Requirements
   • Scalability
   • Availability
   • Reliability
   • Performance
32. Typical Server Architecture [diagram]
33. ... # of users increases
34. ... and increases
35. ... too much load
36. ... and we lose availability
37. ... so we add servers
38. ... and a load balancer
39. ... and another one rides the bus
40. ... we create a DB cluster
41. ... and we cache wherever we can [diagram: caches added]
42. Challenges
   • How do we route requests to servers?
   • How do we distribute data between servers?
   • How do we handle failures?
   • How do we keep our cache consistent?
   • How do we handle load peaks?
43. Technique #1: Partitioning [diagram: users split across servers by key range, e.g. A-E, F-J, K-O, P-T, U-Z]
44. Technique #1: Partitioning
   • Each server handles a subset of the data
   • Improves scalability by parallelizing
   • Requires predictable routing (see the sketch below)
   • Introduces problems with locality
   • Move the work to where the data is!
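
A minimal routing sketch (mine, not from the deck; names are illustrative). Hashing the key makes routing predictable: the same key always goes to the same server. Note that naive modulo routing reshuffles most keys whenever a server is added or removed, which is why real systems prefer consistent hashing:

    import java.util.List;

    public class PartitionRouter {
        private final List<String> servers;

        public PartitionRouter(List<String> servers) {
            this.servers = servers;
        }

        // Deterministic routing: the same key always lands on the same server.
        public String serverFor(String key) {
            int bucket = Math.abs(key.hashCode() % servers.size());
            return servers.get(bucket);
        }
    }
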
45. Technique #2: Replication [diagram: an active server with a backup replica]
46. Technique #2: Replication
   • Keep copies of data/state on multiple servers
   • Used for fail-over - increases availability
   • Requires more cold hardware
   • The overhead of replicating might reduce performance
47. Technique #3: Messaging [diagram]
48. Technique #3: Messaging
   • Use message passing, queues and pub/sub models - JMS
   • Improves reliability easily
   • Helps deal with peaks (see the sketch below)
     - The queue keeps filling
     - If it gets too big, extra requests are rejected
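
An in-process sketch of that bounded-queue behavior using java.util.concurrent (illustrative; a real deployment would put a JMS broker between producers and consumers):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class RequestBuffer {
        // Bounded queue: absorbs load peaks up to its capacity.
        private final BlockingQueue<Runnable> queue =
                new ArrayBlockingQueue<>(10000);

        // Producer side: returns false (request rejected) when the buffer is full.
        public boolean submit(Runnable request) {
            return queue.offer(request);
        }

        // Consumer side: workers drain the queue at their own pace.
        public void workerLoop() throws InterruptedException {
            while (!Thread.currentThread().isInterrupted()) {
                queue.take().run();
            }
        }
    }
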
49. Solution #1: De-normalize the DB
   • Faster queries
   • Additional work to generate the tables
   • Less space-efficient
   • Harder to maintain consistency
50. Solution #2: Non-SQL Database
   • Why not remove the relational part altogether?
   • Bad for complex queries
   • Berkeley DB is a prime example
51. Solution #3: Distributed Key/Value Stores
   • Highly scalable - used in the largest websites in the world, based on Amazon's Dynamo and Google's BigTable
   • Mostly open source
   • Partitioned
   • Replicated
   • Versioned
   • No single point of failure (SPOF)
   • Voldemort (LinkedIn), Cassandra (Facebook) and HBase are written in Java
52-61. Solution #4: MapReduce [animated sequence of diagrams: the work is divided ("Divide Work"), map tasks compute in parallel ("Map...", "Compute"), then results are returned and aggregated ("Return and aggregate", "Reduce...")]
62. Solution #4: MapReduce
   • Google's algorithm to split up work, process the pieces and reduce them to an answer
   • Used for offline processing of large amounts of data
   • Hadoop is used everywhere! Other options such as GridGain exist (word-count sketch below)
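
A single-machine word-count sketch of the two phases (mine, not from the deck); a framework like Hadoop runs the same map and reduce steps distributed across many machines:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class WordCount {
        // Map phase: emit (word, count) pairs for one input line.
        static Map<String, Integer> map(String line) {
            Map<String, Integer> pairs = new HashMap<>();
            for (String word : line.toLowerCase().split("\\s+")) {
                pairs.merge(word, 1, Integer::sum);
            }
            return pairs;
        }

        // Reduce phase: aggregate the counts emitted for each word.
        static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
            Map<String, Integer> totals = new HashMap<>();
            for (Map<String, Integer> partial : partials) {
                partial.forEach((word, count) -> totals.merge(word, count, Integer::sum));
            }
            return totals;
        }

        public static void main(String[] args) {
            List<String> lines = List.of("scale up", "scale out", "scale up and out");
            List<Map<String, Integer>> mapped = new ArrayList<>();
            for (String line : lines) {
                mapped.add(map(line));
            }
            System.out.println(reduce(mapped)); // e.g. {and=1, scale=3, out=2, up=2}
        }
    }
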
63. Solution #5: Data Grid
   • Data (and computations)
   • In-memory - low response times
   • Database back-end (SQL or not)
   • Partitioned - operations on data execute in a specific partition
   • Replicated - handles failover automatically
   • Transactional
64. Solution #5: Data Grid
   • It's a distributed cache + computational engine
   • Can be used as a cache with JPA and the like
   • Oracle Coherence is very good
   • Others: Terracotta, GridGain, GemFire, GigaSpaces, Velocity (Microsoft) and WebSphere eXtreme Scale (IBM)
65. Retrospective
   • You need to scale up and out
   • Write code thinking of hundreds of cores
   • Relational might not be the way to go
   • Cache whenever you can
   • Be aware of data locality
66. Q&A
   Thanks for listening!
   Ruben Badaró - http://www.zonaj.org
