Writing Scalable Software in Java - Presentation Transcript
Writing Scalable
Software in Java
From multi-core to grid-computing
Me
• Ruben Badaró
• Dev Expert at Changingworlds/Amdocs
• PT.JUG Leader
• http://www.zonaj.org
What this talk is not
about
• Sales pitch
• Cloud Computing
• Service Oriented Architectures
• Java EE
• How to write multi-threaded code
Summary
• Define Performance and Scalability
• Vertical Scalability - scaling up
• Horizontal Scalability - scaling out
• Q&A
Performance != Scalability
Performance
Amount of useful work accomplished by a
computer system compared to the time and
resources used
Scalability
Capability of a system to increase the amount of
useful work as resources and load are added to
the system
Scalability
• A system that performs fast with 10 users
might not do so with 1000 - it doesn’t scale
• Designing for scalability often costs some
raw performance
Linear Scalability
[Chart: Throughput vs. Resources]
Reality is sub-linear
[Chart: Throughput vs. Resources]
Amdahl’s Law
Scalability is about
parallelizing
• Parallel decomposition allows division of
work
• Parallelizing might mean more work
• There’s almost always a part of serial
computation
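Amdahl's Law puts a number on that serial part. A minimal sketch follows; the class name `Amdahl` and the 90% parallel fraction are assumptions chosen for illustration, not figures from the talk.

```java
// Amdahl's Law: speedup = 1 / ((1 - p) + p / n)
// p = fraction of the work that can run in parallel, n = number of processors.
final class Amdahl {
    static double speedup(double parallelFraction, int processors) {
        if (parallelFraction < 0.0 || parallelFraction > 1.0 || processors < 1) {
            throw new IllegalArgumentException("need 0 <= p <= 1 and n >= 1");
        }
        double serialFraction = 1.0 - parallelFraction;
        return 1.0 / (serialFraction + parallelFraction / processors);
    }

    public static void main(String[] args) {
        // Hypothetical workload: 90% of the work parallelizes.
        for (int n : new int[] {1, 2, 8, 32, 1024}) {
            System.out.printf("%4d cores -> speedup %.2f%n", n, speedup(0.90, n));
        }
    }
}
```

With 10% serial work the speedup can never exceed 10x, however many cores are added.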
Vertical Scalability
Somewhat hard
Vertical Scalability
Scale Up
• Bigger, meaner machines
- More cores (and more powerful)
- More memory
- Faster local storage
• Limited
- Technical constraints
- Cost - big machines get disproportionately
more expensive
Shared State
• Need to use those cores
• Java - shared-state concurrency
- Mutable state protected with locks
- Hard to get right
- Most developers don’t have experience
writing multithreaded code
This is what their code
looks like
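// Every caller serializes on the class-wide lock, even for a plain read
public static synchronized SomeObject getInstance() {
    return instance;
}

// The whole body runs under the instance lock (coarse-grained locking)
public SomeObject doConcurrentThingy() {
    synchronized (this) {
        // ...
    }
    return ...; // result elided on the original slide
}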
Single vs Multi-threaded
• Single-threaded
- No scheduling cost
- No synchronization cost
• Multi-threaded
- Context Switching (high cost)
- Memory Synchronization (memory barriers)
- Blocking
Lock Contention
Little’s Law
The average number of customers in a stable
system is equal to their average arrival rate
multiplied by their average time in the system
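Read for lock contention, Little's Law says the average number of threads holding or waiting for a lock is the rate of lock requests times the average time each request spends waiting plus holding. A small sketch of the arithmetic; the class name `LittlesLaw` and the sample numbers are assumptions for illustration.

```java
// Little's Law: L = lambda * W
// L = average number in the system, lambda = arrival rate, W = average time in the system.
final class LittlesLaw {
    static double averageInSystem(double arrivalsPerSecond, double secondsInSystem) {
        return arrivalsPerSecond * secondsInSystem;
    }

    public static void main(String[] args) {
        // Hypothetical lock: 200 acquisitions per second, 5 ms spent waiting + holding.
        double threadsAtLock = averageInSystem(200.0, 0.005);
        System.out.println("Average threads at the lock: " + threadsAtLock);
    }
}
```

Halving either the request rate or the time spent at the lock halves the average number of threads stuck there, which is why the next slide targets both.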
Reducing Contention
• Reduce lock duration
• Reduce frequency with which locks are
requested (striping)
• Replace exclusive locks with other mechanisms
- Concurrent Collections
- ReadWriteLocks
- Atomic Variables
- Immutable Objects
Concurrent Collections
• Use lock striping
• Includes putIfAbsent() and replace()
methods
• ConcurrentHashMap has 16 separate locks by
default (Java 5 to 7; Java 8 and later lock
individual bins instead)
• Don’t reinvent the wheel
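As a rough illustration of `putIfAbsent()` and `replace()`, the sketch below counts hits per key without an explicit lock. `HitCounter` and its method names are hypothetical; only the `ConcurrentMap` calls come from the JDK.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class HitCounter {
    private final ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();

    // Increments the counter for a key and returns the new value.
    int record(String key) {
        while (true) {
            Integer current = hits.putIfAbsent(key, 1);
            if (current == null) {
                return 1; // no previous value: our 1 was stored
            }
            // replace() only succeeds if the value is still 'current'
            if (hits.replace(key, current, current + 1)) {
                return current + 1;
            }
            // another thread updated the value first: retry
        }
    }

    int get(String key) {
        return hits.getOrDefault(key, 0);
    }
}
```

The retry loop is the price of avoiding a lock: a thread that loses the race simply reads the fresh value and tries again.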
ReadWriteLocks
• Pair of locks
• Read lock can be held by multiple
threads if there are no writers
• Write lock is exclusive
• Good improvement when an object has far
more readers than writers
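A minimal sketch of the read/write pair guarding a plain `HashMap`, assuming a read-mostly workload. The class `ReadMostlyCache` is made up for illustration; `ReentrantReadWriteLock` is the standard JDK implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class ReadMostlyCache {
    private final Map<String, String> data = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    String get(String key) {
        lock.readLock().lock();   // shared: many readers may hold it at once
        try {
            return data.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    void put(String key, String value) {
        lock.writeLock().lock();  // exclusive: blocks readers and other writers
        try {
            data.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```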
Atomic Variables
• Allow check-then-update operations to be
performed atomically
• Without locks - use low-level CPU
instructions (compare-and-swap)
• It’s volatile on steroids (visibility +
atomicity)
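The check-then-update pattern might look like the sketch below: a counter that never goes above a limit, built on `AtomicInteger.compareAndSet`. `BoundedCounter` is a hypothetical name used only for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedCounter {
    private final AtomicInteger value = new AtomicInteger(0);
    private final int max;

    BoundedCounter(int max) {
        this.max = max;
    }

    // Check (below max?) and update (+1) as one atomic step, without a lock.
    boolean tryIncrement() {
        while (true) {
            int current = value.get();
            if (current >= max) {
                return false; // limit reached, nothing changed
            }
            if (value.compareAndSet(current, current + 1)) {
                return true;  // nobody interfered between read and write
            }
            // value changed under us: re-read and retry
        }
    }

    int get() {
        return value.get();
    }
}
```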
Immutable Objects
• Immutability makes concurrency simple - thread-
safety guaranteed
• An immutable object:
- Its class is final
- Its fields are final and private
- Its constructor constructs the object completely
- It has no state-changing methods
- It copies internal mutable objects when receiving
or returning them
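The checklist above can be sketched as follows. `Appointment` and its fields are invented for the example; the point is the defensive copying of the mutable `Date` and `List`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;

final class Appointment {                       // final class
    private final String title;                 // private final fields
    private final Date start;                   // Date is mutable, so it is copied
    private final List<String> guests;

    Appointment(String title, Date start, List<String> guests) {
        this.title = title;
        this.start = new Date(start.getTime());                 // copy on receive
        this.guests = Collections.unmodifiableList(new ArrayList<>(guests));
    }

    String title() { return title; }
    Date start() { return new Date(start.getTime()); }         // copy on return
    List<String> guests() { return guests; }                   // read-only view of a private copy

    // "Changing" a value creates a new object instead of mutating this one.
    Appointment withTitle(String newTitle) {
        return new Appointment(newTitle, start, guests);
    }
}
```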
JVM issues
• Caching is useful - storing stuff in memory
• Larger JVM heap size means longer garbage
collection times
• Not acceptable to have long pauses
• Solutions
- Maximum size for heap 2GB/4GB
- Multiple JVMs per machine
- Better garbage collectors: G1 might help
Scaling Up: Other
Approaches
• Change the paradigm
- Actors (Erlang and Scala)
- Dataflow programming (GParallelizer)
- Software Transactional Memory
(Pastrami)
- Functional languages, such as Clojure
Scaling Up: Other
Approaches
• Dedicated JVM-friendly hardware
- Azul Systems is amazing
- Hundreds of cores
- Enormous heap sizes with negligible gc
pauses
- Hardware Transactional Memory (HTM) included
- Built-in lock elision mechanism
Horizontal Scalability
The hard part
Horizontal Scalability
Scale Out
• Big machines are expensive - one 32-core
machine normally costs much more than
four 8-core machines
• Increase throughput by adding more
machines
• Distributed Systems research revisited -
not new
Challenges
• How do we route requests to servers?
• How do we distribute data between servers?
• How do we handle failures?
• How do we keep our cache consistent?
• How do we handle load peaks?
Technique #1: Partitioning
[Diagram: Users split across five servers by first letter: A...E, F...J, K...O, P...T, U...Z]
• Each server handles a subset of data
• Improves scalability by parallelizing
• Requires predictable routing
• Introduces problems with locality
• Move work to where the data is!
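Predictable routing can be as simple as a pure function from key to partition. The sketch below assumes five partitions split by the first letter of the user name (A-E, F-J, K-O, P-T, U-Z); `UserPartitioner` is a hypothetical helper, not part of any library.

```java
final class UserPartitioner {
    private static final int LETTERS_PER_PARTITION = 5;
    private static final int LAST_PARTITION = 4;

    // Returns 0..4 for names starting with A-E, F-J, K-O, P-T, U-Z.
    static int partitionFor(String userName) {
        if (userName == null || userName.isEmpty()) {
            throw new IllegalArgumentException("user name must not be empty");
        }
        char first = Character.toUpperCase(userName.charAt(0));
        if (first < 'A' || first > 'Z') {
            throw new IllegalArgumentException("expected a name starting with A-Z: " + userName);
        }
        // 'Z' would land in a sixth bucket, so it is folded into the last partition.
        return Math.min((first - 'A') / LETTERS_PER_PARTITION, LAST_PARTITION);
    }
}
```

Every node that runs the same function sends a given user to the same server, so no lookup table is needed. Range splits like this one tend to be uneven in practice; hashing the key is a common alternative.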
Technique #2: Replication
[Diagram: Active and Backup servers]
• Keep copies of data/state in multiple
servers
• Used for fail-over - increases availability
• Requires extra hardware that sits idle (cold) until fail-over
• Overhead of replicating might reduce
performance
Technique #3: Messaging
• Use message passing, queues and pub/sub
models - JMS
• Improves reliability easily
• Helps deal with peaks
- The queue keeps filling
- If it gets too big, extra requests are
rejected
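The peak-handling behaviour can be sketched in-process with a bounded queue. This is a stand-in that uses `java.util.concurrent`, not the JMS API, and the `RequestBuffer` class is an assumption made for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class RequestBuffer {
    private final BlockingQueue<String> queue;

    RequestBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Non-blocking: returns false (request rejected) when the queue is full.
    boolean submit(String request) {
        return queue.offer(request);
    }

    // Called by a consumer; returns null when nothing is waiting.
    String next() {
        return queue.poll();
    }

    int backlog() {
        return queue.size();
    }
}
```

Producers keep filling the queue during a peak while consumers drain it at their own pace; once the capacity is hit, `submit` returns false instead of letting the backlog grow without bound.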
Solution #1: De-normalize DB
• Faster queries
• Additional work to generate tables
• Less space efficiency
• Harder to maintain consistency
Solution #2: Non-SQL
Database
• Why not remove the relational part
altogether?
• Bad for complex queries
• Berkeley DB is a prime example
Solution #3: Distributed
Key/Value Stores
• Highly scalable - used in the largest websites in the
world, based on Amazon’s Dynamo and Google’s
BigTable
• Mostly open source
• Partitioned
• Replicated
• Versioned
• No single point of failure (SPOF)
• Voldemort (LinkedIn), Cassandra (Facebook) and HBase
are written in Java
Solution #4: MapReduce
[Diagram sequence: Map... (divide work, compute), then Reduce... (return and aggregate)]
• Google’s programming model to split work, process
it in parallel and reduce the results to an answer
• Used for offline processing of large
amounts of data
• Hadoop is used everywhere! Other options
such as GridGain exist
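The split / compute / aggregate shape can be shown on one machine with plain Java. The word-count sketch below uses a parallel stream where a framework such as Hadoop would use separate machines; `WordCount` and its methods are illustrative names, not the Hadoop API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WordCount {
    // Map step: one chunk of input -> partial counts for that chunk.
    static Map<String, Integer> map(String chunk) {
        Map<String, Integer> partial = new HashMap<>();
        for (String word : chunk.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                partial.merge(word, 1, Integer::sum);
            }
        }
        return partial;
    }

    // Reduce step: merge two partial results into a new map.
    static Map<String, Integer> reduce(Map<String, Integer> left, Map<String, Integer> right) {
        Map<String, Integer> merged = new HashMap<>(left);
        right.forEach((word, count) -> merged.merge(word, count, Integer::sum));
        return merged;
    }

    // Divide work (one task per chunk), compute, then aggregate.
    static Map<String, Integer> count(List<String> chunks) {
        return chunks.parallelStream()
                .map(WordCount::map)
                .reduce(new HashMap<>(), WordCount::reduce);
    }
}
```

Because `map` looks at only one chunk and `reduce` is associative, the chunks can be processed in any order and on any worker.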
Solution #5: Data Grid
• Data (and computations)
• In-memory - low response times
• Database back-end (SQL or not)
• Partitioned - operations on data executed in
specific partition
• Replicated - handles failover automatically
• Transactional
Solution #5: Data Grid
• It’s a distributed cache + computational
engine
• Can be used as a cache with JPA and the like
• Oracle Coherence is very good
• Alternatives: Terracotta, GridGain, GemFire,
GigaSpaces, Velocity (Microsoft) and WebSphere
eXtreme Scale (IBM)
Retrospective
• You need to scale up and out
• Write code thinking of hundreds of cores
• Relational might not be the way to go
• Cache whenever you can
• Be aware of data locality
Q&A
Thanks for listening!
Ruben Badaró
http://www.zonaj.org