Speculative Locking: Breaking the Scale Barrier (JAOO 2005)
 

This is a 2005 presentation on the use of transactional memory to support parallelism through synchronized block semantics. Measurements were done on Azul's Vega hardware, which was the first commercial hardware to ship with HTM support. Many lessons have been learned since then, but it remains a good reference point in time, and with Intel x86 now supporting similar HTM capabilities, we're sure to see this subject revived.

  • Reader/writer locks are an attempt at solving this, but they do not solve it completely because they usually do not distinguish between different buckets.
  • Talking points: the hardware is conservative and can report a false collision even when there is no data contention.
  • Hashtable size: 100 entries
  • Same Hashtable as before, now with some collisions.
  • Make size() cheap and reduce the window of potential data contention inside size().
  • For the Accounts example one could argue for using the java.util.concurrent atomics, since deposit() only updates a single variable, whereas a Point cannot be updated with one atomic operation.

Speculative Locking: Breaking the Scale Barrier (JAOO 2005) Presentation Transcript

  • 1. Speculative Locking: Breaking the Scale Barrier Gil Tene, VP Technology, CTO Ivan Posva, Senior Staff Engineer Azul Systems © 2005 Azul Systems, Inc. | Confidential
  • 2. Multi-threaded Java Apps can Scale www.azulsystems.com New JVM capabilities improve multi-threaded application scalability. How can this affect the way you code? Speculative locking reduces effects of Amdahl's law 2 ©2005 Azul Systems, Inc. Strictly Confidential. Do not distribute or share any of this information without specific approval from Azul Systems.
  • 3. Agenda • Why do we care? • Lock contention vs. Data contention • Transactional synchronized {…} • Measurements • Effects on how you code • Summary
  • 4. Why do we care? Multithreaded everywhere • Java™ Applications naturally multi-threaded ─ Thread pools, work queues, shared Collections • Multi-core CPUs from all major vendors ─ 2 or more cores per chip ─ 2 or more threads per core ─ A commodity 4 chip server will soon have 16 threads ─ Heavily multicore/multithreaded chips are here • Amdahl’s law affects everyone ─ Serialized portions of program limit scale
  • 5. Amdahl’s Law: Serialized portions of program limit scale • efficiency = 1/(N*q + (1-q)) ─ N = # of concurrent threads ─ q = fraction of serialized code [Chart: efficiency (0.0 to 1.0) vs. processors N (1 to 91), one curve per fraction of serialized code: 0.10%, 0.50%, 1.00%, 2.00%, 5.00%, 20.00%]
  • 6. Amdahl’s Law: Effect on Throughput [Chart: throughput scale factor (0 to 100) vs. processors (1 to 91), with an ideal-scaling line and one curve per fraction of serialized code: 0.10%, 0.50%, 1.00%, 2.00%, 5.00%, 20.00%]
  • 7. Amdahl’s Law Example • The theoretical limit is usually intuitive ─ Assume 10% serialization ─ At best you can do 10x the work of 1 CPU • Efficiency drops are dramatic and may be less intuitive ─ Assume 10% serialization ─ 10 CPUs will not scale past a speedup of 5.3x (Eff. 0.53) ─ 16 CPUs will not scale past a speedup of 6.4x (Eff. 0.40) ─ 64 CPUs will not scale past a speedup of 8.8x (Eff. 0.14) ─ 99 CPUs will not scale past a speedup of 9.2x (Eff. 0.09) ─ … It will take a whole lot of inefficient CPUs to [never] reach a 10x speedup
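The speedup and efficiency figures for a given serialized fraction follow directly from the efficiency formula on the earlier slide. A minimal sketch, assuming only that formula; the class and method names are illustrative and do not come from the presentation:

```java
// Illustrative helper; AmdahlSketch and its method names are assumptions, not from the slides.
final class AmdahlSketch {
    /** Efficiency per the slide formula: 1 / (N*q + (1 - q)). */
    static double efficiency(int n, double q) {
        return 1.0 / (n * q + (1.0 - q));
    }

    /** Speedup over a single CPU is the efficiency multiplied by the CPU count. */
    static double speedup(int n, double q) {
        return n * efficiency(n, q);
    }

    public static void main(String[] args) {
        double q = 0.10; // 10% serialized code, as in the slide example
        for (int n : new int[] {10, 16, 64, 99}) {
            System.out.printf("%d CPUs: speedup %.1fx, efficiency %.2f%n",
                    n, speedup(n, q), efficiency(n, q));
        }
    }
}
```

Running it for other values of q shows how quickly the curve flattens: the speedup can approach but never reach 1/q.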
  • 8. Agenda • Why do we care? • Lock contention vs. Data contention • Transactional synchronized {…} • Measurements • Effects on how you code • Summary
  • 9. Lock Contention vs. Data Contention • Lock contention: An attempt by one thread to acquire a lock when another thread is holding it • Data contention: An attempt by one thread to atomically access data when another thread expects to manipulate the same data atomically
  • 10. Data Contention in a Shared Data Structure • Readers do not contend • Readers and writers don’t always contend • Even writers may not contend with other writers
  • 11. Synchronization and Locking: Locks are typically very conservative • Need synchronization for correct execution ─ Critical sections, shared data structures • Intent is to protect against data contention • Can’t easily tell in advance ─ That’s why we lock… • Lock contention >= Data contention ─ In reality: lock contention >>= Data contention
  • 12. Database Transactions: The industry has already solved a similar problem • Semantics of potential failure exposed to the application • Transactions: atomic group of DB commands ─ All or nothing ─ From “BEGIN TRANSACTION” to “COMMIT” • Data contention results in a rollback ─ Leaves no trace • Application can re-execute until successful • Optimistic concurrency does scale
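The "execute, detect contention, roll back, re-execute" cycle described for database transactions can be sketched in plain Java with a compare-and-set retry loop. This is only an analogy under stated assumptions: the `Balance` type, the class name, and the `update` method are hypothetical, and nothing here is taken from the presentation or from any database API.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical illustration of optimistic concurrency; names are not from the slides.
final class OptimisticRetrySketch {
    /** Immutable snapshot of the shared state. */
    static final class Balance {
        final long amount;
        Balance(long amount) { this.amount = amount; }
    }

    private final AtomicReference<Balance> state = new AtomicReference<>(new Balance(0));

    /** Works on a private copy, commits atomically, and re-executes on conflict. */
    Balance update(UnaryOperator<Balance> change) {
        while (true) {
            Balance seen = state.get();            // start of the "transaction": take a snapshot
            Balance next = change.apply(seen);     // compute the result without touching shared state
            if (state.compareAndSet(seen, next)) { // "commit" succeeds only if nobody else committed
                return next;
            }
            // Another thread committed first: drop 'next' (the rollback leaves no trace) and retry.
        }
    }

    long current() {
        return state.get().amount;
    }

    public static void main(String[] args) throws InterruptedException {
        OptimisticRetrySketch account = new OptimisticRetrySketch();
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                account.update(b -> new Balance(b.amount + 1));
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(account.current()); // no increment is lost despite the absence of a lock
    }
}
```

Note that here the failure semantics are visible to the caller's code, which is exactly the property the database model exposes to applications.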
  • 13. Agenda • Why do we care? • Lock contention vs. Data contention • Transactional synchronized {…} • Measurements • Effects on how you code • Summary
  • 14. There is no spoon.
  • 15. What does synchronized mean? • It does not actually mean: grab lock, execute block, release lock • It does mean: execute block atomically in relation to other blocks synchronizing on the same object • It can be satisfied by the more conservative: execute block atomically in relation to all other threads • That looks a lot like a transaction (Sources: “The Java Language Specification”, “The Java Virtual Machine Specification”, JSR133)
  • 16. Transactional synchronized {…} • Two basic requirements ─ Detect data contention within the block ─ Roll back synchronized block on data contention • synchronized can run concurrently ─ Azul uses hardware assist to detect data contention ─ Azul VM rolls back synchronized blocks that encounter data contention
  • 17. Transactional synchronized {…} • The Azul VM maintains the semantic meaning of: execute block atomically in relation to all other threads • Uncontended synchronized blocks run just as fast as before • Data contended synchronized blocks still serialize execution • synchronized blocks without data contention can execute in parallel
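The "speculate, detect interference, fall back to serialized execution" behaviour can be approximated in software for read-only sections with `java.util.concurrent.locks.StampedLock` (available since Java 8). This is a loose analogy only: it is explicit rather than transparent, it covers readers but not speculative writers, and it says nothing about how the Azul VM or its hardware assist is implemented. The class name is hypothetical.

```java
import java.util.concurrent.locks.StampedLock;

// Software analogy for speculative execution of a read-only critical section.
final class OptimisticPointSketch {
    private final StampedLock lock = new StampedLock();
    private int x;
    private int y;

    /** Writers always take the exclusive lock in this sketch. */
    void translate(int dx, int dy) {
        long stamp = lock.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    /** Readers first try to run without holding any lock. */
    long sum() {
        long stamp = lock.tryOptimisticRead(); // speculate: nothing is acquired
        int cx = x;
        int cy = y;
        if (!lock.validate(stamp)) {           // a writer interfered: discard the speculative reads
            stamp = lock.readLock();           // and redo the section under a real lock
            try {
                cx = x;
                cy = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return (long) cx + cy;
    }

    public static void main(String[] args) {
        OptimisticPointSketch p = new OptimisticPointSketch();
        p.translate(3, 4);
        System.out.println(p.sum());
    }
}
```

The difference from the approach in the slides is that the programmer has to write the speculation and the fallback by hand, whereas a transactional synchronized block leaves the source code unchanged.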
  • 18. Transactional synchronized {…}: It’s all transparent • No changes to Java code ─ The VM handles everything • Nested synchronized blocks ─ Roll back to outermost transactional synchronized • Reduces serialization • Amdahl’s Law now only reflects data contention ─ Desire to reduce data contention
  • 19. Implementation in a VM: How does it fit in the current locking schemes? • Thin locks handle uncontended synchronized blocks ─ Most common case ─ Uses CAS, no OS interaction • Thick locks handle data contended synchronized blocks ─ Blocks in the OS • Transactional monitors handle contended synchronized blocks that have no data contention ─ Execute synchronized blocks in parallel ─ Uses HW support
  • 20. Agenda • Why do we care? • Lock contention vs. Data contention • Transactional synchronized {…} • Measurements • Effects on how you code • Summary
  • 21. Data Contention and Hashtables • Examples of no data contention in a Hashtable ─ 2 readers ─ 1 reader, 1 writer, different hash buckets ─ 2 writers, different hash buckets • Examples of data contention in a Hashtable ─ 1 reader, 1 writer in same hash bucket ─ 2 writers in same hash bucket
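The bucket-level cases listed above reduce to one rule: two operations contend on data only when they touch the same bucket and at least one of them writes. A minimal sketch of that rule, assuming a simple chained table indexed by hash code modulo the table length; the class and method names are hypothetical, and the sketch deliberately ignores shared fields such as an element count:

```java
// Hypothetical model of the bucket-level cases; not code from the presentation.
final class BucketConflictSketch {
    /** Bucket index for a key in a table with the given number of buckets. */
    static int bucketOf(Object key, int tableLength) {
        return Math.floorMod(key.hashCode(), tableLength);
    }

    /** True when the two operations hit the same bucket and at least one is a write. */
    static boolean dataContention(Object keyA, boolean aWrites,
                                  Object keyB, boolean bWrites, int tableLength) {
        boolean sameBucket = bucketOf(keyA, tableLength) == bucketOf(keyB, tableLength);
        return sameBucket && (aWrites || bWrites);
    }

    public static void main(String[] args) {
        int buckets = 100;
        // Integer keys hash to their own value, so 1 and 101 share bucket 1 in a 100-bucket table.
        System.out.println("2 readers, same bucket: " + dataContention(1, false, 101, false, buckets));
        System.out.println("2 writers, different buckets: " + dataContention(1, true, 2, true, buckets));
        System.out.println("reader + writer, same bucket: " + dataContention(1, false, 101, true, buckets));
        System.out.println("2 writers, same bucket: " + dataContention(1, true, 101, true, buckets));
    }
}
```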
  • 22. Measurements: Hashtable (0% writes) [Chart: Locking vs. Spec. Locking, logarithmic vertical axis from 10 to 100,000, for 1, 4, 8, 16, 32, 64, and 128 threads]
  • 23. Measurements: Hashtable (5% writes) [Chart: Locking vs. Spec. Locking, logarithmic vertical axis from 10 to 100,000, for 1, 4, 8, 16, 32, 64, and 128 threads]
  • 24. Agenda • Why do we care? • Lock contention vs. Data contention • Transactional synchronized {…} • Measurements • Effects on how you code • Summary
  • 25. Coding Techniques: How to make use of this new reality? • Use coarse grain synchronization ─ Simpler data structures, simpler code ─ Simplicity equals stability ─ Easier to optimize • Focus on data contention, not on lock contention • Reduce unavoidable data contention • wait() and notify() can become the dominant reason for serialized execution ─ Stripe queues and other uses of wait()/notify()
  • 26. Why Coarse Grain Synchronization? • You can spend effort to reduce lock contention ─ Reader/writer lock ─ Stripe locks per bucket ─ Stripe reader/writer locks per bucket ─ How do you grow the table? ─ Gets complex fast • But there is no lock, it’s a synchronized block • With transactional synchronized ─ Keep synchronization coarse ─ Focus on data contention
  • 27. Minimizing Data Contention 1
        private Object table[];
        private int size;
        public synchronized void put(Object key, Object val) {
            … // missed, insert into table
            table[idx] = new HashEntry(key, val, table[idx]);
            size++; // writer data contention
        }
        public synchronized int size() {
            return size;
        }
  • 28. Minimizing Data Contention 2
        private Object table[];
        private int sizes[];
        public synchronized void put(Object key, Object val) {
            … // missed, insert into table
            table[idx] = new HashEntry(key, val, table[idx]);
            sizes[idx]++; // reduced writer data contention
        }
        public synchronized int size() {
            int size = 0;
            for (int i = 0; i < sizes.length; i++)
                size += sizes[i];
            return size;
        }
  • 29. Minimizing Data Contention 3
        private Object table[];
        private int sizes[];
        private int cachedSize;
        public synchronized void put(Object key, Object val) {
            … // missed, insert into table
            table[idx] = new HashEntry(key, val, table[idx]);
            sizes[idx]++;
            cachedSize = -1; // clear the cache
        }
        public synchronized int size() {
            if (cachedSize < 0) { // reduce size recalculation
                cachedSize = 0;
                for (int i = 0; i < sizes.length; i++)
                    cachedSize += sizes[i];
            }
            return cachedSize;
        }
  • 30. Minimizing Data Contention 4
        private Object table[];
        private int sizes[];
        private int cachedSize;
        public synchronized void put(Object key, Object val) {
            … // missed, insert into table
            table[idx] = new HashEntry(key, val, table[idx]);
            sizes[idx]++;
            if (cachedSize >= 0) cachedSize = -1; // avoid contention
        }
        public synchronized int size() {
            if (cachedSize < 0) {
                cachedSize = 0;
                for (int i = 0; i < sizes.length; i++)
                    cachedSize += sizes[i];
            }
            return cachedSize;
        }
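The four "Minimizing Data Contention" slides elide the lookup part of put(). A self-contained sketch of the final variant, with the elided parts filled in under stated assumptions: a fixed number of buckets, no resizing, a hypothetical `HashEntry` layout, and an added `get()` for the usage example. None of these details are confirmed by the slides.

```java
// Sketch of a coarse-grained table with per-bucket counters and a cached total.
final class StripedSizeTableSketch {
    /** Singly linked bucket entry (assumed layout). */
    private static final class HashEntry {
        final Object key;
        Object val;
        final HashEntry next;
        HashEntry(Object key, Object val, HashEntry next) {
            this.key = key;
            this.val = val;
            this.next = next;
        }
    }

    private final HashEntry[] table;
    private final int[] sizes;   // one counter per bucket, so inserts into different buckets touch different words
    private int cachedSize = 0;  // a negative value means the cached total is stale

    StripedSizeTableSketch(int buckets) {
        table = new HashEntry[buckets];
        sizes = new int[buckets];
    }

    public synchronized void put(Object key, Object val) {
        int idx = Math.floorMod(key.hashCode(), table.length);
        for (HashEntry e = table[idx]; e != null; e = e.next) {
            if (e.key.equals(key)) { // hit: overwrite in place, the count does not change
                e.val = val;
                return;
            }
        }
        // missed, insert into table
        table[idx] = new HashEntry(key, val, table[idx]);
        sizes[idx]++;
        if (cachedSize >= 0) cachedSize = -1; // write the shared field only when it actually changes
    }

    public synchronized Object get(Object key) {
        int idx = Math.floorMod(key.hashCode(), table.length);
        for (HashEntry e = table[idx]; e != null; e = e.next) {
            if (e.key.equals(key)) return e.val;
        }
        return null;
    }

    public synchronized int size() {
        if (cachedSize < 0) { // recompute only after an insert invalidated the cache
            int total = 0;
            for (int s : sizes) total += s;
            cachedSize = total;
        }
        return cachedSize;
    }

    public static void main(String[] args) {
        StripedSizeTableSketch t = new StripedSizeTableSketch(100);
        t.put("a", 1);
        t.put("b", 2);
        t.put("a", 3); // overwrites the existing entry
        System.out.println(t.size() + " entries, a=" + t.get("a"));
    }
}
```

On a conventional VM every method here still serializes on the object's monitor; the per-bucket counters only pay off when the synchronized blocks can run speculatively in parallel.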
  • 31. Double Checked Locking Avoided: Singleton pattern • Needs to be synchronized at initialization ─ Further synchronization seems to be a waste ─ Web is full of examples of how wrong you can go • Transactional synchronized keeps it simple
        public class Simple {
            private Helper helper = null;
            public synchronized Helper getHelper() {
                if (helper == null) {
                    helper = new Helper();
                }
                return helper; // no data contention once initialized
            }
            // other functions and members …
        }
    http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
  • 32. Unavoidable Data Contention
      Accounts
        public synchronized void deposit(long amount) {
            balance += amount;
        }
      Points
        public synchronized void translate(int dx, int dy) {
            x += dx;
            y += dy;
        }
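For the Accounts case, where a single variable is updated, one alternative is an atomic variable from `java.util.concurrent.atomic` instead of a synchronized method. A minimal sketch, with a hypothetical class name; the same trick does not carry over to the Points case, because two separate int fields cannot be updated together by one atomic operation.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical lock-free variant of the Accounts example.
final class AtomicAccountSketch {
    private final AtomicLong balance = new AtomicLong();

    /** Adds the amount atomically; contending callers retry inside addAndGet rather than block. */
    void deposit(long amount) {
        balance.addAndGet(amount);
    }

    long balance() {
        return balance.get();
    }

    public static void main(String[] args) {
        AtomicAccountSketch account = new AtomicAccountSketch();
        account.deposit(5);
        account.deposit(7);
        System.out.println(account.balance());
    }
}
```

The data contention itself does not go away: every deposit still writes the same memory location, so heavily contended updates still proceed one at a time.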
  • 33. wait()/notify() Example: striping work queues
      • Stripe work across multiple queues
          Task task = new WorkTask(…);
          // floorMod keeps the index non-negative even for negative hash codes
          Queue queue = queues[Math.floorMod(task.hashCode(), queues.length)];
          synchronized (queue) {
              queue.enqueue(task);
              queue.notify();
          }
      • Workers can be statically assigned to a queue
          synchronized (queues) {
              queue = queues[num_workers++ % queues.length];
          }
          while (true) {
              Task task;
              synchronized (queue) {
                  // Re-check in a loop: a notify() may have fired before this worker waited.
                  // isEmpty() is assumed to exist on the slide's Queue type.
                  while (queue.isEmpty()) {
                      queue.wait();
                  }
                  task = queue.dequeue();
              }
              task.execute();
          }
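The striping idea can be put together into a small runnable program. This is a sketch under assumptions rather than the presenters' code: it uses `Runnable` and `java.util.ArrayDeque` in place of the slide's Task and Queue types, starts exactly one daemon worker per stripe, and the class name is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.concurrent.CountDownLatch;

// Sketch of work queues striped by task hash code, each with its own monitor.
final class StripedWorkQueuesSketch {
    private final ArrayDeque<Runnable>[] queues;

    @SuppressWarnings("unchecked")
    StripedWorkQueuesSketch(int stripes) {
        queues = new ArrayDeque[stripes];
        for (int i = 0; i < stripes; i++) {
            queues[i] = new ArrayDeque<>();
            startWorker(queues[i]); // each worker is statically bound to one queue
        }
    }

    /** Producers only synchronize on the stripe chosen by the task's hash code. */
    void submit(Runnable task) {
        ArrayDeque<Runnable> queue = queues[Math.floorMod(task.hashCode(), queues.length)];
        synchronized (queue) {
            queue.addLast(task);
            queue.notify(); // one worker per stripe, so a single notify is enough
        }
    }

    private static void startWorker(ArrayDeque<Runnable> queue) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    Runnable task;
                    synchronized (queue) {
                        while (queue.isEmpty()) { // guards against missed and spurious wakeups
                            queue.wait();
                        }
                        task = queue.removeFirst();
                    }
                    task.run(); // run outside the monitor so producers are not held up
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // treat interruption as a shutdown request
            }
        });
        worker.setDaemon(true); // lets the JVM exit without an explicit shutdown protocol
        worker.start();
    }

    public static void main(String[] args) throws InterruptedException {
        StripedWorkQueuesSketch pool = new StripedWorkQueuesSketch(4);
        CountDownLatch done = new CountDownLatch(100);
        for (int i = 0; i < 100; i++) {
            pool.submit(done::countDown);
        }
        done.await();
        System.out.println("all tasks ran");
    }
}
```

With a single shared queue, every enqueue and dequeue would touch the same data; striping spreads those wait()/notify() interactions over several independent monitors.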
  • 34. Agenda • Why do we care? • Lock contention vs. Data contention • Transactional synchronized {…} • Measurements • Effects on how you code • Summary
  • 35. Summary • Transparent transactional synchronized() is available • Simplify data structures, save development time ─ Use coarse grain locking ─ Let the VM deal with the scaling problem • Further optimization ─ Be aware of data contention ─ Stripe stats gathering ─ Stripe wait() and notify()
  • 36. Our Lawyers made us say this… "Azul Systems, Azul, and the Azul arch logo are trademarks of Azul Systems, Inc. in the United States and other countries. Sun, Sun Microsystems, Java and all Java based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. Other marks are the property of their respective owners and are used here only for identification purposes."
  • 37. Q&A
  • 38. Thank you. gil@azulsystems.com ivan@azulsystems.com