Shopzilla On Concurrency

Slides from my presentation about Shopzilla's concurrency strategies to the Pasadena Java User's Group on April 26, 2010. This is essentially the same material as covered by my colleague Rodney Barlow in an earlier presentation http://www.slideshare.net/rodneypbarlow/shopzilla-on-concurrency, with a few minor tweaks.


  1. Shopzilla on Concurrency
     Concurrency as Shopzilla's performance building block
     Will Gage, Lead Software Engineer
     3/2/2010
  2. Agenda
       • Introduction
       • History of Java Concurrency
       • Java 5 Concurrency Features
       • Concurrency in Frameworks
       • Concurrency @ Shopzilla
       • Future
  3. Shopzilla, Inc. - Online Shopping Network
       • 100M impressions per day
       • 20-29M unique visitors (UVs) per month
       • 8,000+ searches per second
       • 100M+ products
  4. Why do we care about concurrency?
       • Correctness
           • Avoiding race conditions
           • Avoiding visibility problems
           • Avoiding “liveness” issues
       • Performance
           • The limits of Moore’s Law mean the rise of Amdahl’s Law
  5. Amdahl’s Law
       • P = portion of your code which can be parallelized
       • N = number of processors
     (Thanks, Wikipedia!)
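
The formula on this slide was an image and is not in the transcript. The standard statement of Amdahl's Law, written with the slide's P and N (S, the overall speedup, is a symbol introduced here), is:

```latex
S(N) = \frac{1}{(1 - P) + \frac{P}{N}}
\qquad\text{and}\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - P}
```

As a worked example with assumed values P = 0.9 and N = 8: S = 1 / (0.1 + 0.9/8) = 1 / 0.2125 ≈ 4.7, and no number of processors can push the speedup past 1 / (1 - 0.9) = 10.
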
  6. History of Java Concurrency
       • Concurrent code before Java 1.5 was difficult and error-prone
       • Misinformation abounded
       • Java users need to write reliable multi-threaded software!
       • Doug Lea's concurrent package (circa 1998)
       • JSR-133 Java Memory Model (threads, locks, volatiles, ...)
       • JSR-166 Concurrency Utilities
       • Expert groups consisting of Bloch, Goetz, and Lea
  7. Where is Concurrent Code?
       • Concurrent code is in our application containers
       • Concurrent code is in the frameworks we all use
       • Concurrent code is increasingly found in our applications
  8. Building Blocks: Thread Safety
       • Atomicity
       • Visibility
       • Immutable / stateless objects
       • Synchronized keyword
       • Volatile keyword
       • Atomic*
  9. Immutability
       • Immutability = a class whose instances can't be modified
           • E.g. String, boxed primitives, BigDecimal, BigInteger
       • Joshua Bloch's Effective Java sets forth guidelines
           • Eliminate mutators
           • Eliminate extensibility
           • All fields final
           • Exclusivity for mutable components
       • Further, Bloch's Effective Java reminds us
           • Immutable objects are inherently thread safe
           • Immutable objects require no synchronization
           • Immutable objects can be shared freely
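
The four guidelines above can be seen together in one small class. This is a minimal sketch; `PriceHistory` and its fields are hypothetical names chosen for illustration and do not come from the slides.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class; not from the slides.
final class PriceHistory {                        // final class: eliminates extensibility
    private final String productId;               // all fields final
    private final List<Long> pricesInCents;

    PriceHistory(String productId, List<Long> pricesInCents) {
        this.productId = productId;
        // Defensive copy: this instance has exclusive access to its mutable component.
        this.pricesInCents =
                Collections.unmodifiableList(new ArrayList<Long>(pricesInCents));
    }

    String productId() { return productId; }

    // Returns an unmodifiable view of the private copy.
    List<Long> pricesInCents() { return pricesInCents; }

    // No mutators: a "change" produces a new instance instead.
    PriceHistory withPrice(long cents) {
        List<Long> copy = new ArrayList<Long>(pricesInCents);
        copy.add(cents);
        return new PriceHistory(productId, copy);
    }
}
```

Because no method can alter an existing instance, any thread may read a `PriceHistory` without synchronization.
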
  10. Atomic References
       • Immune to deadlock and other liveness issues
       • Offer nonblocking synchronization of single variables
       • Offer lower scheduling overhead than traditional synchronization techniques
       • Effectively volatile variables with extra features
       • Modern hardware support through compare-and-swap processor instructions
  11. Atomic References – Unique ID
       • We needed a unique ID value
       • Unique across multiple data centers and silos
       • Configuration elements prime the singleton IdGenerator for distributed uniqueness
       • A portion is based on a time value, e.g. the seconds since the start of the month
       • An additional portion provides uniqueness within a single JVM, using an Atomic Reference
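
The slide's code is not in the transcript, so the following is only a sketch of how such an ID could be composed. The bit layout, the `nodeId` parameter, and the use of `AtomicInteger` for the per-JVM portion are all assumptions made for illustration, not Shopzilla's actual format.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; the bit layout below is an assumption.
final class IdGenerator {
    private final int nodeId;                                  // primed from configuration
    private final AtomicInteger sequence = new AtomicInteger(); // per-JVM uniqueness

    IdGenerator(int nodeId) {
        if (nodeId < 0 || nodeId > 127) {
            throw new IllegalArgumentException("nodeId must fit in 7 bits");
        }
        this.nodeId = nodeId;
    }

    // Assumed layout: [7 bits node][24 bits seconds][32 bits sequence].
    // A month has at most 2,678,400 seconds, which fits in 24 bits.
    long nextId(long secondsSinceStartOfMonth) {
        long seq = sequence.getAndIncrement() & 0xFFFFFFFFL;   // atomic and lock-free
        return ((long) nodeId << 56)
                | ((secondsSinceStartOfMonth & 0xFFFFFFL) << 32)
                | seq;
    }
}
```

Two threads calling `nextId` in the same second still receive different values, because `getAndIncrement()` never hands the same sequence number to two callers.
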
  12. Atomic References – Parent Node
       • AtomicReference's compareAndSet()
       • Used for visibility
       • Used to enforce data integrity constraints
       • Note ImmutableList
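
The code shown on this slide is not in the transcript. As a rough sketch of the idea, the hypothetical `Node` class below uses `compareAndSet()` to enforce an assumed constraint (a node may be given a parent only once) and keeps its children in an immutable list that is swapped atomically. `Collections.unmodifiableList` stands in for the `ImmutableList` the slide mentions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical node type; the "one parent only" rule is an assumed constraint.
final class Node {
    private final String name;
    private final AtomicReference<Node> parent = new AtomicReference<Node>();
    private final AtomicReference<List<Node>> children =
            new AtomicReference<List<Node>>(Collections.<Node>emptyList());

    Node(String name) { this.name = name; }

    String name() { return name; }
    Node parent() { return parent.get(); }            // read has volatile semantics
    List<Node> children() { return children.get(); }  // always an immutable snapshot

    // Succeeds only for the first caller; later attempts leave the parent unchanged.
    boolean attachTo(Node newParent) {
        if (!parent.compareAndSet(null, newParent)) {
            return false;
        }
        newParent.addChild(this);
        return true;
    }

    private void addChild(Node child) {
        while (true) {
            List<Node> current = children.get();
            List<Node> next = new ArrayList<Node>(current);
            next.add(child);
            // Swap in the new list only if no other thread changed it meanwhile.
            if (children.compareAndSet(current, Collections.unmodifiableList(next))) {
                return;
            }
        }
    }
}
```
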
  13. Atomic References – Takeaways
       • Volatiles suffice where atomic check-then-act is overkill
       • Some atomic nonblocking algorithms retry in a loop when compareAndSet() fails
       • Under high thread contention, these retries can make the algorithm inefficient
       • In practice, though, most threads have far more to do than contend for a lock
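
To make the first two takeaways concrete, here is a small sketch with a hypothetical `LatencyStats` class: a plain on/off flag only needs `volatile`, while "update the maximum if mine is larger" is a check-then-act that needs a `compareAndSet()` retry loop.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class for illustration only.
final class LatencyStats {
    // Written by one thread, read by many, no check-then-act: volatile suffices.
    private volatile boolean enabled = true;

    // Check-then-act on a shared value: needs an atomic variable.
    private final AtomicLong maxMillis = new AtomicLong();

    void setEnabled(boolean enabled) { this.enabled = enabled; }

    void record(long millis) {
        if (!enabled) {
            return;
        }
        while (true) {
            long current = maxMillis.get();
            if (millis <= current) {
                return;                                   // nothing to update
            }
            if (maxMillis.compareAndSet(current, millis)) {
                return;                                   // we won the race
            }
            // Another thread changed the value first; loop and re-check.
        }
    }

    long maxMillis() { return maxMillis.get(); }
}
```
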
  14. Locks – ReadWriteLock
       • Allows for multiple concurrent read locks
       • No new read locks once a write lock is placed
       • Write lock blocks until read locks complete
       • Lock modes: non-fair (default), fair
       • Reentrancy
       • Downgrading
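
A minimal sketch of these properties using `ReentrantReadWriteLock` from `java.util.concurrent.locks`. The `RwCache` class and its methods are hypothetical; `putAndRead` shows downgrading, which means taking the read lock while still holding the write lock and then releasing the write lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical cache used only to illustrate the lock.
final class RwCache {
    private final Map<String, String> map = new HashMap<String, String>();
    // Non-fair by default; new ReentrantReadWriteLock(true) selects fair mode.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    String get(String key) {
        lock.readLock().lock();            // many readers may hold this at once
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    String putAndRead(String key, String value) {
        lock.writeLock().lock();           // exclusive: waits for readers to finish
        try {
            map.put(key, value);
            lock.readLock().lock();        // downgrade: acquire read before releasing write
        } finally {
            lock.writeLock().unlock();     // the read lock is still held here
        }
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```
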
  15. Blocking Queues
       • Acts as a thread-safe implementation of the producer / consumer pattern
       • Similar to a JMS queue, though not distributed
       • Insertion blocks until there is space available (for bounded queues)
  16. Blocking Queues – Data Publish
       • A very large (50GB) flat file
       • Consumers send data to a remote grid cache
       • Multiple queue consumers increased throughput
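
The publish pipeline might look roughly like the sketch below: one producer feeds a bounded queue and several consumers drain it. Everything here is assumed for illustration: the `PublishDemo` name, the queue capacity of 100, the generated record strings standing in for file lines, and a counter standing in for the remote cache write.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; file reading and cache writes are replaced by stand-ins.
final class PublishDemo {
    // Unique sentinel object (compared by identity) telling a consumer to stop.
    private static final String POISON = new String("EOF");

    // Returns the number of records the consumers processed.
    static int publish(int records, int consumers) {
        final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(100);
        final AtomicInteger shipped = new AtomicInteger();
        Thread[] workers = new Thread[consumers];
        for (int i = 0; i < consumers; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (true) {
                        String line = queue.take();      // blocks while the queue is empty
                        if (line == POISON) {
                            return;
                        }
                        shipped.incrementAndGet();       // stand-in for the cache write
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "publisher-" + i);
            workers[i].start();
        }
        try {
            for (int r = 0; r < records; r++) {
                queue.put("record-" + r);                // blocks while the queue is full
            }
            for (int i = 0; i < consumers; i++) {
                queue.put(POISON);                       // one sentinel per consumer
            }
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shipped.get();
    }
}
```

The bounded queue keeps a fast producer from reading the whole file into memory: `put()` simply waits until a consumer has made room.
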
  17. Distributed Cached Data Snapshot
       • n clients
       • n HTTP requests across 6 load-balanced Tomcat JVMs
       • Threads failing to acquire the lock immediately start shipping data
       • What about the thread obtaining the lock?
  18. Distributed Cached Data Snapshot
  19. Distributed Cached Data Snapshot
       • Distributed competition for the publishing privilege
       • Computation of completeness
       • Communicate completeness
       • Other threads in other JVMs happily polling for State.DONE
  20. Distributed Cached Data Snapshot
       • Coherence replicated cache supports a cluster-wide key-based lock
       • Locked objects can still be read by other cluster threads without a lock
       • Locks are unaffected by server failure (they fail over to a backup server)
       • Locks are immediately released when the lock owner (client) fails
       • Lock timeouts: -1 (wait until acquired), 0 (return immediately), 1+ (wait up to that many milliseconds)
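
The competition described on these slides spans several JVMs and relies on a Coherence cluster lock, which cannot be reproduced in a short standalone example. The sketch below is therefore only a single-JVM analogue under stated assumptions: a `ReentrantLock` stands in for the cluster-wide lock, `tryLock()` plays the role of a zero timeout, and `SnapshotRace`, `State.PENDING`, and the counter are hypothetical names. It shows the shape of the protocol: one winner does the completeness work and announces `State.DONE`, while everyone else polls.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// Single-JVM analogue only; the real design used a distributed cache lock.
final class SnapshotRace {
    enum State { PENDING, DONE }

    private final ReentrantLock publishLock = new ReentrantLock();
    private final AtomicReference<State> state =
            new AtomicReference<State>(State.PENDING);
    private final AtomicInteger publishCount = new AtomicInteger();

    void participate() throws InterruptedException {
        if (publishLock.tryLock()) {                  // do not wait for the lock
            try {
                if (state.get() == State.PENDING) {
                    publishCount.incrementAndGet();   // stand-in for the completeness work
                    state.set(State.DONE);            // communicate completeness
                }
            } finally {
                publishLock.unlock();
            }
        } else {
            while (state.get() != State.DONE) {       // losers poll for State.DONE
                Thread.sleep(1);
            }
        }
    }

    // Runs the race with the given number of threads; returns how many published.
    static int race(int threads) {
        final SnapshotRace r = new SnapshotRace();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                try {
                    r.participate();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            ts[i].start();
        }
        try {
            for (Thread t : ts) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return r.publishCount.get();
    }
}
```
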
  21. Distributed Cached Data Snapshot
       • Hazelcast
           • http://www.hazelcast.com/
           • Open source clustering and highly scalable data distribution platform for Java
           • Distributed data structures
               • Queue / Topic
               • Map, MultiMap, Set, List
               • Lock
           • Effectively distributed java.util.concurrent
           • Uses TCP/IP
           • Cluster-wide ID generators
           • Distributed executor services
  22. Distributed Cached Data Snapshot
       • Apache ZooKeeper
           • http://hadoop.apache.org/zookeeper/
       • Terracotta
           • http://www.terracotta.org/
  23. Concurrency in Frameworks
       • Hibernate Core 3.5.0
           • CountDownLatch
               • Use when a thread needs to wait until some number of events have occurred
               • Constructed with the number of events which must occur before release
           • Callable, ExecutorService, ReentrantLock, AtomicReference
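
A minimal `CountDownLatch` sketch, independent of Hibernate: the latch is constructed with the number of events, each worker calls `countDown()` once, and the waiting thread is released when the count reaches zero. `LatchDemo` and the five-second timeout are assumptions for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class.
final class LatchDemo {
    // Starts one thread per event and waits for all of them; returns events seen.
    static int awaitEvents(int events) {
        final CountDownLatch latch = new CountDownLatch(events);
        final AtomicInteger seen = new AtomicInteger();
        for (int i = 0; i < events; i++) {
            new Thread(() -> {
                seen.incrementAndGet();   // the "event"
                latch.countDown();        // one fewer event to wait for
            }).start();
        }
        try {
            // Blocks until the count reaches zero or the timeout expires.
            if (!latch.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for events");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }
}
```
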
  24. Concurrency in Frameworks
       • Spring Framework 3.0.1
           • TaskExecutor
               • Spring 2.0 supported Java 1.4
               • TaskExecutor did not implement Executor
               • In Spring 3.0 TaskExecutor extends Executor
               • TaskExecutor sees wide use within the Spring framework
                   • Quartz
                   • Message Driven POJO
           • Spring Enterprise Recipes – Josh Long, Gary Mak
  25. Shopzilla's Website Concurrency
       • Needed sub-650ms server-side response time
       • Simplify the layers
       • Functionally separate, individually testable, loosely coupled web services
       • Define SLAs for individual services
  26. Shopzilla's Website Concurrency
       • How to invoke 30+ web services and ship a page in <650ms?
     Concurrency! In fact, our pages today ship within 250ms.
  27. Shopzilla's Website Concurrency
     Pods & Service Calls
  28. Shopzilla's Website Concurrency
       • Started simple
       • Implement only the concurrency features required
       • Concurrency isolated to pods
       • Pods responsible for fetching data
       • We're using simple building blocks
       • Incremental implementation based solely on requirements
       • Haven't seen deadlocks
  29. Shopzilla's Website Concurrency
       • Thread longevity configured at the HTTP connection level
       • HTTPClient connectionTimeout
       • Spring-wired HTTPClient implementation
       • Ability to add a pod to a controller
  30. Shopzilla's Website Concurrency
       • FuturePodResult implements the PodResult interface
       • Abstracts the details of the future
       • PodCallable types the pod and command
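
The class names `PodResult`, `FuturePodResult`, and `PodCallable` come from the slide, but their code is not in the transcript. The sketch below guesses at one possible shape: the method signatures, the generic parameters, the `Pod` interface, and the timeout-with-fallback behaviour are all assumptions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Assumed shape: a pod turns a command into a result.
interface Pod<C, T> { T fetch(C command); }

// Assumed shape: callers see only get(), never the Future behind it.
interface PodResult<T> { T get(); }

final class FuturePodResult<T> implements PodResult<T> {
    private final Future<T> future;
    private final long timeoutMillis;
    private final T fallback;

    FuturePodResult(Future<T> future, long timeoutMillis, T fallback) {
        this.future = future;
        this.timeoutMillis = timeoutMillis;
        this.fallback = fallback;
    }

    public T get() {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            return fallback;                 // the pod failed
        } catch (TimeoutException e) {
            future.cancel(true);             // the pod was too slow
            return fallback;
        }
    }
}

// Binds a typed pod to a typed command so an ExecutorService can run it.
final class PodCallable<C, T> implements Callable<T> {
    private final Pod<C, T> pod;
    private final C command;

    PodCallable(Pod<C, T> pod, C command) {
        this.pod = pod;
        this.command = command;
    }

    public T call() {
        return pod.fetch(command);
    }
}
```

With this shape, page-rendering code depends only on `PodResult.get()` and never handles `Future` exceptions itself.
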
  31. Shopzilla's Website Concurrency
       • Need to execute pods
       • Configurable ExecutorService
       • Backed with a queue
       • Naming of threads proved useful in initial testing (JMX)
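
One way to build such an executor with the standard library is sketched below. The `PodExecutors` name, the parameters, the daemon threads, and the caller-runs rejection policy are assumptions; the slides do not say how Shopzilla configured theirs.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory for a pool with named threads and a bounded queue.
final class PodExecutors {
    static ExecutorService newPodExecutor(int threads, int queueCapacity, final String prefix) {
        final AtomicInteger counter = new AtomicInteger();
        // Named threads are easy to pick out in thread dumps and JMX views.
        ThreadFactory named = runnable -> {
            Thread t = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
        return new ThreadPoolExecutor(
                threads, threads,                                // fixed-size pool
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity), // bounded work queue
                named,
                new ThreadPoolExecutor.CallerRunsPolicy());      // when full, submitter runs the task
    }
}
```
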
  32. Shopzilla's Website Concurrency
       • Once the concept was proven, interesting feature requests materialized
       • Product Review pod
       • Distilled: 2 pods needed to share a single result
       • Added ServiceInvocation concept
  33. Shopzilla's Website Concurrency
       • Pods now have access to a Service Invocation Map
       • get() blocks on the result of the service invocation
       • A single service invocation result can be shared between two pods
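
The Service Invocation Map itself is not shown in the transcript. A hedged sketch of the idea: the map holds one `Future` per named service call, so the first pod to ask triggers the call and every later pod blocks on the same result. `ServiceInvocationMap`, `invokeOnce`, and the string keys are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

// Hypothetical sketch; not Shopzilla's implementation.
final class ServiceInvocationMap {
    private final ConcurrentMap<String, Future<?>> invocations =
            new ConcurrentHashMap<String, Future<?>>();
    private final ExecutorService executor;

    ServiceInvocationMap(ExecutorService executor) {
        this.executor = executor;
    }

    // Returns the shared Future for this name, starting the call only once.
    // Callers must use a consistent result type per name.
    @SuppressWarnings("unchecked")
    <T> Future<T> invokeOnce(String name, Callable<T> call) {
        FutureTask<T> task = new FutureTask<T>(call);
        Future<?> existing = invocations.putIfAbsent(name, task);
        if (existing != null) {
            return (Future<T>) existing;     // another pod already requested it
        }
        executor.execute(task);              // we registered first, so we start it
        return task;
    }
}
```

Each pod then calls `get()` on the returned `Future`, which blocks until the single underlying service call completes.
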
  34. Shopzilla's Website Concurrency
       • Now that we were sharing results, we were done, right?
       • Product Review information was now required in-line in the product pod
       • Still needed the special Product Review pod too!
       • Dependent Service Invocations
  35. Shopzilla's Website Concurrency
       • Service Invocations can now depend on results of others
       • Dependent Callable is configured with two callbacks:
           • A callback whose result is blocked for
           • A callback which is invoked once the blocking result arrives
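
The two-callback arrangement could be sketched as follows. The names `DependentCallable` and `ResultHandler` and all signatures are assumptions; the slide only states that one callback is blocked on and the other runs once that result arrives.

```java
import java.util.concurrent.Callable;

// Assumed shape of the second callback: consumes the upstream result.
interface ResultHandler<A, B> {
    B handle(A upstream) throws Exception;
}

// Hypothetical sketch of a callable that depends on another result.
final class DependentCallable<A, B> implements Callable<B> {
    private final Callable<A> blockingSource;     // callback whose result is blocked for
    private final ResultHandler<A, B> onResult;   // callback invoked once it arrives

    DependentCallable(Callable<A> blockingSource, ResultHandler<A, B> onResult) {
        this.blockingSource = blockingSource;
        this.onResult = onResult;
    }

    public B call() throws Exception {
        A upstream = blockingSource.call();       // typically wraps Future.get()
        return onResult.handle(upstream);
    }
}
```

One design caution with this pattern: if dependent tasks and the tasks they wait on share one bounded thread pool, every thread can end up blocked waiting for work that is still queued, so the pool needs enough threads (or separate pools) to avoid that starvation.
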
  36. Future
       • More use of distributed data structures
       • Spring 3.0
           • @Async
           • @Scheduled
       • JSR-315 Servlets 3.0
           • AsyncContext
       • More parallelism at hardware level
       • Message passing
  37. Reference
       • Java Concurrency in Practice (Goetz)
       • Effective Java (Bloch)
       • Spring Enterprise Recipes (Long, Mak)
       • http://jcp.org/en/jsr/detail?id=133
       • http://jcp.org/en/jsr/detail?id=166
       • Spring 3.0
