Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Robust and Scalable
Concurrent Programming:
Lessons from the Trenches
Sangjin Lee, Mahesh Somani, &
Debashis Saha
eBay Inc.
Agenda
>    Why Does It Matter?
>    Patterns: the Good, the Bad, and the $#&%@!
>    Lessons Learned




                ...
Why Does It Matter?
>    Concurrency is becoming only more important
         Multi-core systems are becoming the norm of...
Why Does It Matter?
>    Writing safe and concurrent code is hard
         Thread safety is hard for an average Java
    ...
Why Does It Matter?
>    Writing safe and concurrent code is hard
         Writing safe and scalable code is even harder
...
Why Does It Matter?
>    Yet, the penalty for incorrect or non-scalable code
     is severe
         They are often the m...
Patterns
>    Lazy Initialization




                           7
Patterns
[1] Lazy Initialization
>    What’s wrong with this picture?

     public class Singleton { 
         private sta...
Patterns
[1] Lazy Initialization
>    Not thread safe of course! Let’s fix it.

     public class Singleton { 
         pr...
Patterns
[1] Lazy Initialization
>    Fix #1: synchronize!

     public class Singleton { 
         private static Singlet...
Patterns
[1] Lazy Initialization
>    It is now correct (thread safe) but ...
>    We have just introduced a potential loc...
Patterns
[1] Lazy Initialization
>    Impact of a lock contention
                                         Lock Contention...
Patterns
[1] Lazy Initialization
>     Fix #2 (bad): synchronize only when we allocate
     public class Singleton { 
    ...
Patterns
[1] Lazy Initialization
>    There is a name for this pattern 
>    Optimization gone awry: it appears correct a...
Patterns
[1] Lazy Initialization
>      Fix #3: $#&%@!
     public class Singleton { 
         private static Singleton in...
Patterns
[1] Lazy Initialization
>    None of the fixes quite works: they are either
     unsafe, or don’t scale if in the...
Patterns
[1] Lazy Initialization
>    Eager Initialization!
         No reason to initialize lazily: allocations are chea...
Patterns
[1] Lazy Initialization
>    Good fix #1: eager initialization

     public class Singleton { 
         private s...
Patterns
[1] Lazy Initialization
>    Then when does lazy initialization make sense?
         (For a non-singleton) If in...
Patterns
[1] Lazy Initialization
>    What are the right lazy initialization patterns?




                               ...
Patterns
[1] Lazy Initialization
>    Good fix #2: holder pattern (for non-singletons)

     public class Foo { 
         ...
Patterns
[1] Lazy Initialization
>     Good fix #3: corrected double-checked locking
      (only on Java 5 or later)
     ...
Patterns
>    Lazy Initialization
>    Map & Compound Operation




                                23
Patterns
[2] Map & Compound Operation
>    What’s wrong with this picture?

     public class KeyedCounter { 
         pri...
Patterns
[2] Map & Compound Operation
>    It may be blazingly fast but ... it’s not thread safe!

     public class Keyed...
Patterns
[2] Map & Compound Operation
>    “How bad can it get?” Really bad.
         ConcurrentModificationException may...
Patterns
[2] Map & Compound Operation
>    Fix #1 (bad): synchronize the map
     public class KeyedCounter { 
         pr...
Patterns
[2] Map & Compound Operation
>    Fix #1 (bad): synchronize the map
         It is still not thread safe!
     ...
Patterns
[2] Map & Compound Operation
>    Fix #2: synchronize the operations

     public class KeyedCounter { 
         ...
Patterns
[2] Map & Compound Operation
>    Fix #2: synchronize the operations
         It is now correct
         But do...
Patterns
[2] Map & Compound Operation
>    ConcurrentHashMap comes to the rescue!
         In general it scales significa...
Patterns
[2] Map & Compound Operation
>    Good fix: ConcurrentHashMap with atomic
     variables
     public class KeyedC...
Patterns
[2] Map & Compound Operation
>    Even better...
     public class KeyedCounter { 
         private final Concurr...
Patterns
>         Lazy Initialization
>         Map & Compound Operation
>         Other People’s Money Classes
        ...
Patterns
[3] Other People’s Classes #1
>    What’s wrong with this picture?

     public class DateFormatter { 
         p...
Patterns
[3] Other People’s Classes #1
>    DateFormat is not thread safe!
>    Some JDK classes that are not thread safe
...
Patterns
[3] Other People’s Classes #1
>    Fix #1: synchronize!
         It is now safe, but has a potential of turning ...
Patterns
[3] Other People’s Classes #1
>    Good fix #2: allocate it every time
         It is completely safe (stack con...
Patterns
[3] Other People’s Classes #1
>    Good fix #3: use ThreadLocal
         Use it if truly concerned about allocat...
Patterns
>         Lazy Initialization
>         Map & Compound Operation
>         Other People’s Classes
          Unsa...
Patterns
[3] Other People’s Classes #2
>    What’s wrong with this picture?

     public class ForceInit { 

         publ...
Patterns
[3] Other People’s Classes #2
>    Thread safe … but potential lock contention when
     missing class
     publi...
Patterns
[3] Other People’s Classes #2
>    Class.forName()
         Positive results cached but not negative ones
     ...
Patterns
[3] Other People’s Classes #2
>    Fix: trading memory for speed. Works when
     limited range of class names
  ...
Patterns
>         Lazy Initialization
>         Map & Compound Operation
>         Other People’s Classes
          Unsa...
Patterns
[3] Other People’s Classes #3
>    What’s wrong with this picture?


     public class XMLParser { 

         pub...
Patterns
[3] Other People’s Classes #3
>    Depends … newSAXParser() could have a
     potentially significant lock conten...
Patterns
[3] Other People’s Classes #3
>    But why?
         Implementation class determined in following order
        ...
Patterns
[3] Other People’s Classes #3
>    Good fix #1: Set up environment (system
     properties)

     $ java ‐Djavax....
Patterns
[3] Other People’s Classes #3
>     Good fix #2: ThreadLocal
          Beware reuse issues however
     public c...
Patterns
>         Lazy Initialization
>         Map & Compound Operation
>         Other People’s Classes
          Unsa...
Lessons Learned: Best Practices
>    Patterns & anti-patterns
         Common problems and solutions
         “Anti-patt...
Lessons Learned: Best Practices
>    Correctness first: thread safety is non-negotiable
         Premature optimization f...
Lessons: Not All Code is Created Equal
>    Scalability next
         Scalability and lock contention                    ...
Lessons: Best Practices
>    Multi-pronged attack: proactive
           Regular training and documentation
           St...
Lessons: Best Practices
>    eBay’s Systems Lab
          Lab environment to put applications under microscope with
     ...
Q&A
>    Questions?




                  57
Sangjin Lee, Mahesh Somani, &
Debashis Saha
eBay Inc.

Email: sangjin.lee@ebay.com




                                58
Upcoming SlideShare
Loading in …5
×

Robust and Scalable Concurrent Programming: Lesson from the Trenches

8,172 views

Published on

Published in: Technology, News & Politics

Robust and Scalable Concurrent Programming: Lesson from the Trenches

  1. 1. Robust and Scalable Concurrent Programming: Lessons from the Trenches Sangjin Lee, Mahesh Somani, & Debashis Saha eBay Inc.
  2. 2. Agenda >  Why Does It Matter? >  Patterns: the Good, the Bad, and the $#&%@! >  Lessons Learned 2
  3. 3. Why Does It Matter? >  Concurrency is becoming only more important   Multi-core systems are becoming the norm of enterprise servers   Your enterprise software needs to run under high concurrency to take advantage of hardware   There are other ways to scale (virtualization or running multiple processes), but it is not always doable 3
  4. 4. Why Does It Matter? >  Writing safe and concurrent code is hard   Thread safety is hard for an average Java developer   Many still don’t fully understand the Java Memory Model   Thread safety is often an afterthought   Shared data is still the primary mechanism of handling concurrency in Java   Other approaches: Actor-based share-nothing concurrency (Scala), STM, etc.   Thread safety concerns are not easily isolated 4
  5. 5. Why Does It Matter? >  Writing safe and concurrent code is hard   Writing safe and scalable code is even harder   It requires understanding of your software under concurrency   Will this lock become hot under our production traffic usage?   You should tune only if it is shown to be a real hot spot   It requires knowledge of what scales and what doesn’t   Synchronized HashMap vs. ConcurrentHashMap?   When does ReadWriteLock make sense?   Atomic variables? 5
  6. 6. Why Does It Matter? >  Yet, the penalty for incorrect or non-scalable code is severe   They are often the most difficult bugs: hard to reproduce, and happen only under load (a.k.a. your production peak time)   The code works just fine under light load, but it blows up in your face as the traffic peaks 6
  7. 7. Patterns >  Lazy Initialization 7
  8. 8. Patterns [1] Lazy Initialization >  What’s wrong with this picture? public class Singleton {      private static Singleton instance;      public static Singleton getInstance() {          if (instance == null) {              instance = new Singleton();          }          return instance;      }      private Singleton() {}  }  8
  9. 9. Patterns [1] Lazy Initialization >  Not thread safe of course! Let’s fix it. public class Singleton {      private static Singleton instance;      public static Singleton getInstance() {          if (instance == null) {              instance = new Singleton();          }          return instance;      }      private Singleton() {}  }  9
  10. 10. Patterns [1] Lazy Initialization >  Fix #1: synchronize! public class Singleton {      private static Singleton instance;      public static synchronized Singleton getInstance() {          if (instance == null) {              instance = new Singleton();          }          return instance;      }      private Singleton() {}  }  10
  11. 11. Patterns [1] Lazy Initialization >  It is now correct (thread safe) but ... >  We have just introduced a potential lock contention >  It can pose a serious scalability issue if it is in the critical path   “The principal threat to scalability in concurrent applications is the exclusive resource lock.” – Brian Goetz, Java Concurrency in Practice 11
  12. 12. Patterns [1] Lazy Initialization >  Impact of a lock contention Lock Contention Fixed 100 80 Relative TPS (%) 60 40 20 0 0 20 40 60 80 100 CPU (%) 12
  13. 13. Patterns [1] Lazy Initialization >  Fix #2 (bad): synchronize only when we allocate public class Singleton {      private static Singleton instance;      public static Singleton getInstance() {          if (instance == null) {              synchronized (Singleton.class) {                  if (instance == null) {                      instance = new Singleton();                  }              }          }          return instance;      }      private Singleton() {}  }  13
  14. 14. Patterns [1] Lazy Initialization >  There is a name for this pattern  >  Optimization gone awry: it appears correct and efficient, but it is not thread safe 14
  15. 15. Patterns [1] Lazy Initialization >  Fix #3: $#&%@! public class Singleton {      private static Singleton instance;      private static ReentrantReadWriteLock lock = new ReentrantReadWriteLock();      public static Singleton getInstance() {          lock.readLock().lock();          try {              if (instance == null) {                  lock.readLock().unlock();                  lock.writeLock().lock();                  try {                      if (instance == null) {                          instance = new Singleton();                      }                  } finally {                      lock.readLock().lock();                      lock.writeLock().unlock();                  }              }              return instance;          } finally {              lock.readLock().unlock();          }      }      private Singleton() {}  }  15
  16. 16. Patterns [1] Lazy Initialization >  None of the fixes quite works: they are either unsafe, or don’t scale if in the critical path >  Then what is a better fix? 16
  17. 17. Patterns [1] Lazy Initialization >  Eager Initialization!   No reason to initialize lazily: allocations are cheap   Lazy initialization was popular when allocations used to be expensive   Today we do it simply by habit for no good reason   Static initialization takes care of thread safety, and is also lazy   For singletons, you gain nothing by using an explicit lazy initialization 17
  18. 18. Patterns [1] Lazy Initialization >  Good fix #1: eager initialization public class Singleton {      private static final Singleton instance = new Singleton();      public static Singleton getInstance() {          return instance;      }      private Singleton() {}  }  18
  19. 19. Patterns [1] Lazy Initialization >  Then when does lazy initialization make sense?   (For a non-singleton) If instantiation is truly expensive, and it has a chance of not being needed   If you want to retry if it fails the first time   If you want to unload and reload the object   To reduce confusion if there was an exception during eager initialization   ExceptionInInitializerError for the first call   NoClassDefFoundError for subsequent calls 19
  20. 20. Patterns [1] Lazy Initialization >  What are the right lazy initialization patterns? 20
  21. 21. Patterns [1] Lazy Initialization >  Good fix #2: holder pattern (for non-singletons) public class Foo {      public static Bar getBar() {          return BarHolder.getInstance(); // lazily initialized      }       private static class BarHolder {          private static final Bar instance = new Bar();          static Bar getInstance() {              return instance;          }      }  }  21
  22. 22. Patterns [1] Lazy Initialization >  Good fix #3: corrected double-checked locking (only on Java 5 or later) public class Singleton {      private static volatile Singleton instance;      public static Singleton getInstance() {          if (instance == null) {              synchronized (Singleton.class) {                  if (instance == null) {                      instance = new Singleton();                  }              }          }          return instance;      }      private Singleton() {}  }  22
  23. 23. Patterns >  Lazy Initialization >  Map & Compound Operation 23
  24. 24. Patterns [2] Map & Compound Operation >  What’s wrong with this picture? public class KeyedCounter {      private Map<String,Integer> map = new HashMap<String,Integer>();      public void increment(String key) {          Integer old = map.get(key);          int value = (old == null) ? 1 : old.intValue()+1;          map.put(key, value);      }      public Integer getCount(String key) {          return map.get(key);      }  }  24
  25. 25. Patterns [2] Map & Compound Operation >  It may be blazingly fast but ... it’s not thread safe! public class KeyedCounter {      private Map<String,Integer> map = new HashMap<String,Integer>();      public void increment(String key) {          Integer old = map.get(key);          int value = (old == null) ? 1 : old.intValue()+1;          map.put(key, value);      }      public Integer getCount(String key) {          return map.get(key);      }  }  25
  26. 26. Patterns [2] Map & Compound Operation >  “How bad can it get?” Really bad.   ConcurrentModificationException may be your best outcome   Worse, you might end up in an infinite loop   Worse yet, it may simply give you a wrong result without any errors; data corruption, etc. >  Don’t play with fire: never modify an unsynchronized collection concurrently 26
  27. 27. Patterns [2] Map & Compound Operation >  Fix #1 (bad): synchronize the map public class KeyedCounter {      private Map<String,Integer> map =           Collections.synchronizedMap(new HashMap<String,Integer>());      public void increment(String key) {          Integer old = map.get(key);          int value = (old == null) ? 1 : old.intValue()+1;          map.put(key, value);      }      public Integer getCount(String key) {          return map.get(key);      }  }  27
  28. 28. Patterns [2] Map & Compound Operation >  Fix #1 (bad): synchronize the map   It is still not thread safe!   Even though individual operations (get() and put()) are thread safe, the compound operation (KeyedCounter.increment()) is not 28
  29. 29. Patterns [2] Map & Compound Operation >  Fix #2: synchronize the operations public class KeyedCounter {      private Map<String,Integer> map = new HashMap<String,Integer>();      public synchronized void increment(String key) {          Integer old = map.get(key);          int value = (old == null) ? 1 : old.intValue()+1;          map.put(key, value);      }      public synchronized Integer getCount(String key) {          return map.get(key);      }  }  29
  30. 30. Patterns [2] Map & Compound Operation >  Fix #2: synchronize the operations   It is now correct   But does it scale?   As increment() and getCount() get called from multiple threads, they may become a source of lock contention 30
  31. 31. Patterns [2] Map & Compound Operation >  ConcurrentHashMap comes to the rescue!   In general it scales significantly better than a synchronized HashMap under concurrency   Full reader concurrency   Some writer concurrency with finer-grained locking   The ConcurrentMap interface supports additional thread safe methods; e.g. putIfAbsent() 31
  32. 32. Patterns [2] Map & Compound Operation >  Good fix: ConcurrentHashMap with atomic variables public class KeyedCounter {      private final ConcurrentMap<String,AtomicInteger> map =               new ConcurrentHashMap<String,AtomicInteger>();      public void increment(String key) {          AtomicInteger value = new AtomicInteger(0);          AtomicInteger old = map.putIfAbsent(key, value);          if (old != null) { value = old; }          value.incrementAndGet(); // increment the value atomically      }      public Integer getCount(String key) {          AtomicInteger value = map.get(key);          return (value == null) ? null : value.get();      }  }  32
  33. 33. Patterns [2] Map & Compound Operation >  Even better... public class KeyedCounter {      private final ConcurrentMap<String,AtomicInteger> map =               new ConcurrentHashMap<String,AtomicInteger>();      public void increment(String key) {          AtomicInteger value = map.get(key);          if (value == null) {              value = new AtomicInteger(0);              AtomicInteger old = map.putIfAbsent(key, value);              if (old != null) { value = old; }          }          value.incrementAndGet(); // increment the value atomically      }      public Integer getCount(String key) {          AtomicInteger value = map.get(key);          return (value == null) ? null : value.get();      }  }  33
  34. 34. Patterns >  Lazy Initialization >  Map & Compound Operation >  Other People’s Money Classes   Unsafe classes 34
  35. 35. Patterns [3] Other People’s Classes #1 >  What’s wrong with this picture? public class DateFormatter {      private static final DateFormat df = DateFormat.getInstance();      public static String formatDate(Date d) {          return df.format(d);      }  }  35
  36. 36. Patterns [3] Other People’s Classes #1 >  DateFormat is not thread safe! >  Some JDK classes that are not thread safe   DateFormat: declared in javadoc as not thread safe   MessageDigest: no mention of thread safety   CharsetEncoder/CharsetDecoder: declared in javadoc as not thread safe >  Lessons   Do not assume thread safety of others’ classes   Even javadoc may not have a full disclosure 36
  37. 37. Patterns [3] Other People’s Classes #1 >  Fix #1: synchronize!   It is now safe, but has a potential of turning into a lock contention public class DateFormatter {      private static final DateFormat df = DateFormat.getInstance();      public static String formatDate(Date d) {          synchronized (df) {              return df.format(d);          }      }  }  37
  38. 38. Patterns [3] Other People’s Classes #1 >  Good fix #2: allocate it every time   It is completely safe (stack confinement via local variable)   Allocations are not as expensive as you might think public class DateFormatter {      public static String formatDate(Date d) {          DateFormat df = DateFormat.getInstance();          return df.format(d);      }  }  38
  39. 39. Patterns [3] Other People’s Classes #1 >  Good fix #3: use ThreadLocal   Use it if truly concerned about allocation cost public class DateFormatter {      private static final ThreadLocal<DateFormat> tl =          new ThreadLocal<DateFormat>() {              @Override protected DateFormat initialValue() {                  return DateFormat.getInstance();              }          };      public static String formatDate(Date d) {          return tl.get().format(d);      }  }  39
  40. 40. Patterns >  Lazy Initialization >  Map & Compound Operation >  Other People’s Classes   Unsafe classes   Hidden costs 40
  41. 41. Patterns [3] Other People’s Classes #2 >  What’s wrong with this picture? public class ForceInit {      public static void init(String className) {          try {       // Force initialization              Class.forName(className);          } catch (ClassNotFoundException e) {       // May happen.  Do nothing.     }      }  }  41
  42. 42. Patterns [3] Other People’s Classes #2 >  Thread safe … but potential lock contention when missing class public class ForceInit {      public static void init(String className) {          try {       // Force initialization              Class.forName(className);          } catch (ClassNotFoundException e) {       // May happen.  Do nothing.     }      }  }  42
  43. 43. Patterns [3] Other People’s Classes #2 >  Class.forName()   Positive results cached but not negative ones   When class not found, (expensive) search in classpath   Class loading deals with different locks: repeated invocation => lock contention and CPU hot spot! >  Lessons   Measure contention: positive and negative tests   Beware as javadoc may not have disclosure on locks or inner workings 43
  44. 44. Patterns [3] Other People’s Classes #2 >  Fix: trading memory for speed. Works when limited range of class names public class ForceInit {      private static final ConcurrentMap<String,String> cache               = new ConcurrentHashMap<String,String>();      public static void init(String className) {          try {              if (cache.putIfAbsent(className, className) == null) {                  // Force initialization                  Class.forName(className);              }          } catch (ClassNotFoundException e) {              // May happen.  Do nothing.            }      }  }  44
  45. 45. Patterns >  Lazy Initialization >  Map & Compound Operation >  Other People’s Classes   Unsafe classes   Hidden costs   Hidden costs #2 45
  46. 46. Patterns [3] Other People’s Classes #3 >  What’s wrong with this picture? public class XMLParser {      public void parseXML(InputStream is, DefaultHandler handler)               throws Exception {          SAXParser parser =                   SAXParserFactory.newInstance().newSAXParser();          parser.parse(is, handler);      }  }  46
  47. 47. Patterns [3] Other People’s Classes #3 >  Depends … newSAXParser() could have a potentially significant lock contention public class XMLParser {      public void parseXML(InputStream is, DefaultHandler handler)               throws Exception {          SAXParser parser =                   SAXParserFactory.newInstance().newSAXParser();          parser.parse(is, handler);      }  }  47
  48. 48. Patterns [3] Other People’s Classes #3 >  But why?   Implementation class determined in following order   System property  lib/jaxp.properties  META-INF/ services/jaxp… in class path  Platform default   META-INF not cached! Possible lock contention >  Also applies to DOM & XML Reader >  JAXP implementations (e.g., Xerces) may have additional configurations read from classpath >  Lessons   Measure contention   Read documentation! 48
  49. 49. Patterns [3] Other People’s Classes #3 >  Good fix #1: Set up environment (system properties) $ java ‐Djavax.xml.parsers...=IMPL_CLASS ... ... MAIN_CLASS [args]  49
  50. 50. Patterns [3] Other People’s Classes #3 >  Good fix #2: ThreadLocal   Beware reuse issues however public class XMLParser {      private static final ThreadLocal<SAXParser> tlSax = new ThreadLocal<SAXParser>();      private static SAXParser getParser() throws Exception {          SAXParser parser = tlSax.get();          if (parser == null) {              parser = SAXParserFactory.newInstance().newSAXParser();              tlSax.set(parser);          }          return parser;      }      public void parseXML(InputStream is, DefaultHandler handler) throws Exception {    getParser().parse(is, handler);      }  }  50
  51. 51. Patterns >  Lazy Initialization >  Map & Compound Operation >  Other People’s Classes   Unsafe classes   Hidden costs   Hidden costs #2 51
  52. 52. Lessons Learned: Best Practices >  Patterns & anti-patterns   Common problems and solutions   “Anti-patterns” are a good way of educating   Useful way to analyze solutions   Surprisingly repeated use and misuse 52
  53. 53. Lessons Learned: Best Practices >  Correctness first: thread safety is non-negotiable   Premature optimization for scalability could risk correctness   Concurrency issues are insidious, hard to reproduce and debug   Concurrency issues increase nonlinearly with load 53
  54. 54. Lessons: Not All Code is Created Equal >  Scalability next   Scalability and lock contention 100% 90%   A few locks cause most of 80% contention 70% % of Contention   TPS should increase linearly with 60% utilization 50% 40%   Lock contention possible cause if 30% nonlinear behavior 20%   Ways to determine contended 10% locks 0% 1 3 5 7 9 11 13 15 17 19   Lock analyzers, CPU profiling Contended lock (oprofile, tprof)   Thread dumps a few seconds apart 54
  55. 55. Lessons: Best Practices >  Multi-pronged attack: proactive   Regular training and documentation   Static code analysis   Runtime code analysis   Ongoing performance monitoring and measurements >  Multi-pronged attack: reactive   Periodic tuning   Complimentary to programming discipline   Be disciplined: tune only the top hot spots 55
  56. 56. Lessons: Best Practices >  eBay’s Systems Lab   Lab environment to put applications under microscope with different profiling tools   On-going tests to systemically identify and fix bottlenecks   Yielded substantial scalability improvements and server savings 56
  57. 57. Q&A >  Questions? 57
  58. 58. Sangjin Lee, Mahesh Somani, & Debashis Saha eBay Inc. Email: sangjin.lee@ebay.com 58

×