Lock? We don't need no stinkin' locks!

  • - Concurrency is taught all wrong.
    - What is non-blocking concurrency?
    - Mechanical Sympathy: locks/mutexes are a completely artificial construct.
    - MT's concurrency course: blocking vs. non-blocking.
    - The tools for non-blocking concurrency are functions of the CPU, so we need to look at CPU architecture first.
  • - Causality
    - Why CPUs/compilers reorder
  • - The Java Memory Model provides sequential consistency for race-free programs
    - As-if-serial
    - Disallows out-of-thin-air values
    - First mainstream programming language to include a memory model (C/C++: a combination of the CPU and whatever the compiler happens to do)
  • - volatile
    - java.util.concurrent.atomic.*
      - Atomic<Long|Integer|Reference>
      - Atomic<Long|Integer|Reference>Array (why use over an array of atomics)
      - Atomic<Long|Integer|Reference>FieldUpdater (can be a bit slow)
  • - Fight club
    - If you’re smart enough
  • - Thread wake-ups
    - Hard spin
    - Spin with yield
    - PAUSE instruction - please add to Java
    - MONITOR and MWAIT

    1. Locks? We don’t need no stinkin’ Locks! @mikeb2701 http://bad-concurrency.blogspot.com Image: http://subcirlce.co.uk
    2. Memory Models
    3. Happens-Before
    4. Causality. "Causality will keep the instructions in line." - Grand Moff Wilhuff Tarkin (a rewording of "Fear will keep the local systems in line.")
    5. • Loads are not reordered with other loads.
       • Stores are not reordered with other stores.
       • Stores are not reordered with older loads.
       • In a multiprocessor system, memory ordering obeys causality (memory ordering respects transitive visibility).
       • In a multiprocessor system, stores to the same location have a total order.
       • In a multiprocessor system, locked instructions to the same location have a total order.
       • Loads and stores are not reordered with locked instructions.
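At the Java level, these hardware ordering rules surface as the happens-before relationship from slide 3. The following is a minimal sketch, assuming a hypothetical `HappensBeforeDemo` class (none of its names come from the slides): a plain write that precedes a volatile write is guaranteed to be visible to any thread that has observed that volatile write.

```java
// Hypothetical illustration; class, field, and method names are not from the slides.
final class HappensBeforeDemo {
    private int payload;             // plain field with no synchronization of its own
    private volatile boolean ready;  // volatile write/read pair creates the happens-before edge

    void writer() {
        payload = 42;  // (1) plain store
        ready = true;  // (2) volatile store; (1) happens-before (2) by program order
    }

    int reader() {
        while (!ready) {    // (3) volatile load; loops until it observes (2)
            Thread.yield();
        }
        return payload;     // (4) must see 42, because (2) happens-before (3)
    }

    // Runs one writer and one reader thread and returns what the reader saw.
    static int run() {
        HappensBeforeDemo demo = new HappensBeforeDemo();
        int[] seen = new int[1];
        Thread readerThread = new Thread(() -> seen[0] = demo.reader());
        Thread writerThread = new Thread(demo::writer);
        readerThread.start();
        writerThread.start();
        try {
            readerThread.join();  // join() makes the reader's write to seen[0] visible here
            writerThread.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Without `volatile` on `ready`, the reader could legally spin forever or return a stale `payload`.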
    6. Non-Blocking Primitives
    7. Unsafe
    8. public class AtomicLong extends Number implements Serializable {
           // ...
           private volatile long value;
           // ...
           /**
            * Sets to the given value.
            *
            * @param newValue the new value
            */
           public final void set(long newValue) {
               value = newValue;
           }
           // ...
       }
    9. # {method} set (J)V in java/util/concurrent/atomic/AtomicLong
       # this: rsi:rsi = java/util/concurrent/atomic/AtomicLong
       # parm0: rdx:rdx = long
       # [sp+0x20] (sp of caller)
       mov 0x8(%rsi),%r10d
       shl $0x3,%r10
       cmp %r10,%rax
       jne 0x00007f1f410378a0 ; {runtime_call}
       xchg %ax,%ax
       nopl 0x0(%rax,%rax,1)
       xchg %ax,%ax
       push %rbp
       sub $0x10,%rsp
       nop
       mov %rdx,0x10(%rsi)
       lock addl $0x0,(%rsp) ;*putfield value
                             ; - j.u.c.a.AtomicLong::set@2 (line 112)
       add $0x10,%rsp
       pop %rbp
       test %eax,0xa40fd06(%rip) # 0x00007f1f4b471000
                                 ; {poll_return}
    10. public class AtomicLong extends Number implements Serializable {
            // setup to use Unsafe.compareAndSwapLong for updates
            private static final Unsafe unsafe = Unsafe.getUnsafe();
            private static final long valueOffset;
            // ...
            /**
             * Eventually sets to the given value.
             *
             * @param newValue the new value
             * @since 1.6
             */
            public final void lazySet(long newValue) {
                unsafe.putOrderedLong(this, valueOffset, newValue);
            }
            // ...
        }
    11. # {method} lazySet (J)V in java/util/concurrent/atomic/AtomicLong
        # this: rsi:rsi = java/util/concurrent/atomic/AtomicLong
        # parm0: rdx:rdx = long
        # [sp+0x20] (sp of caller)
        mov 0x8(%rsi),%r10d
        shl $0x3,%r10
        cmp %r10,%rax
        jne 0x00007f1f410378a0 ; {runtime_call}
        xchg %ax,%ax
        nopl 0x0(%rax,%rax,1)
        xchg %ax,%ax
        push %rbp
        sub $0x10,%rsp
        nop
        mov %rdx,0x10(%rsi) ;*invokevirtual putOrderedLong
                            ; - AtomicLong::lazySet@8 (line 122)
        add $0x10,%rsp
        pop %rbp
        test %eax,0xa41204b(%rip) # 0x00007f1f4b471000
                                  ; {poll_return}
    12. public class AtomicInteger extends Number implements Serializable {
            // setup to use Unsafe.compareAndSwapInt for updates
            private static final Unsafe unsafe = Unsafe.getUnsafe();
            private static final long valueOffset;
            private volatile int value;
            //...
            public final boolean compareAndSet(int expect, int update) {
                return unsafe.compareAndSwapInt(this, valueOffset, expect, update);
            }
        }
    13. # {method} compareAndSet (JJ)Z in java/util/concurrent/atomic/AtomicLong
        # this: rsi:rsi = java/util/concurrent/atomic/AtomicLong
        # parm0: rdx:rdx = long
        # parm1: rcx:rcx = long
        # [sp+0x20] (sp of caller)
        mov 0x8(%rsi),%r10d
        shl $0x3,%r10
        cmp %r10,%rax
        jne 0x00007f6699037a60 ; {runtime_call}
        xchg %ax,%ax
        nopl 0x0(%rax,%rax,1)
        xchg %ax,%ax
        sub $0x18,%rsp
        mov %rbp,0x10(%rsp)
        mov %rdx,%rax
        lock cmpxchg %rcx,0x10(%rsi)
        sete %r11b
        movzbl %r11b,%r11d ;*invokevirtual compareAndSwapLong
                           ; - j.u.c.a.AtomicLong::compareAndSet@9 (line 149)
        mov %r11d,%eax
        add $0x10,%rsp
        pop %rbp
        test %eax,0x91df935(%rip) # 0x00007f66a223e000
                                  ; {poll_return}
    14. [Bar chart: cost in nanoseconds/op of set(), compareAndSet(), and lazySet(); y-axis from 0 to 9]
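The `compareAndSet` primitive shown in slides 12 and 13 is normally used inside a retry loop: read the current value, compute the new one, and retry if another thread got there first. Below is a minimal sketch of that pattern, assuming a hypothetical `CasCounterDemo` class that is not part of the talk.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of a CAS retry loop; names are not from the slides.
final class CasCounterDemo {

    // Increments the counter without a lock and returns the new value.
    static long increment(AtomicLong counter) {
        long current;
        long next;
        do {
            current = counter.get();   // volatile read of the current value
            next = current + 1;        // compute the desired value
            // compareAndSet fails if another thread changed the value in between,
            // in which case we loop and try again with a fresh read.
        } while (!counter.compareAndSet(current, next));
        return next;
    }

    // Hammers one counter from several threads and returns the final value.
    static long runContended(int threads, int incrementsPerThread) {
        AtomicLong counter = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    increment(counter);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

No increment is lost under contention, because a failed `compareAndSet` never writes anything; the losing thread simply retries.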
    15. Example - Disruptor Multi-producer
        private void publish(Disruptor disruptor, long value) {
            long next = disruptor.next();
            disruptor.setValue(next, value);
            disruptor.publish(next);
        }
    16. Example - Disruptor Multi-producer
        public long next() {
            long next;
            long current;
            do {
                current = nextSequence.get();
                next = current + 1;
                while (next > (readSequence.get() + size)) {
                    LockSupport.parkNanos(1L);
                    continue;
                }
            } while (!nextSequence.compareAndSet(current, next));
            return next;
        }
    17. Algorithm: Spin - 1
        public void publish(long sequence) {
            long sequenceMinusOne = sequence - 1;
            while (cursor.get() != sequenceMinusOne) {
                // Spin
            }
            cursor.lazySet(sequence);
        }
    18. [Chart: Spin - 1 throughput in million ops/sec (y-axis 0 to 25) against 1 to 8 producer threads]
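To make the Spin - 1 idea concrete, here is a self-contained sketch, assuming a hypothetical `SpinPublishDemo` class. It simplifies the slides in two labeled ways: sequences are claimed with `incrementAndGet` and no ring-buffer capacity check, and the wait loop calls `Thread.yield()` instead of hard-spinning so the demo stays responsive on machines with few cores.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified illustration of in-order publication; not the Disruptor's code.
final class SpinPublishDemo {
    private final AtomicLong nextSequence = new AtomicLong(-1); // last claimed sequence
    private final AtomicLong cursor = new AtomicLong(-1);       // last published sequence

    // Claims the next sequence (assumption: no capacity check, unlike slide 16).
    long next() {
        return nextSequence.incrementAndGet();
    }

    // Publishes strictly in order: wait until the previous sequence is published.
    void publish(long sequence) {
        long sequenceMinusOne = sequence - 1;
        while (cursor.get() != sequenceMinusOne) {
            Thread.yield(); // the slide hard-spins here; yielding is a demo-friendly assumption
        }
        cursor.lazySet(sequence); // ordered store, no full fence needed
    }

    // Runs several producers and returns the final cursor value.
    static long run(int producers, int perProducer) {
        SpinPublishDemo demo = new SpinPublishDemo();
        Thread[] threads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            threads[p] = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) {
                    long sequence = demo.next();
                    demo.publish(sequence);
                }
            });
            threads[p].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return demo.cursor.get();
    }
}
```

Every producer must wait for its predecessor, so a single slow or descheduled thread stalls all the others.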
    19. Algorithm: Co-Op
        public void publish(long sequence) {
            int counter = RETRIES;
            while (sequence - cursor.get() > pendingPublication.length()) {
                if (--counter == 0) {
                    Thread.yield();
                    counter = RETRIES;
                }
            }
            long expectedSequence = sequence - 1;
            pendingPublication.set((int) sequence & pendingMask, sequence);
            if (cursor.get() >= sequence) {
                return;
            }
            long nextSequence = sequence;
            while (cursor.compareAndSet(expectedSequence, nextSequence)) {
                expectedSequence = nextSequence;
                nextSequence++;
                if (pendingPublication.get((int) nextSequence & pendingMask) != nextSequence) {
                    break;
                }
            }
        }
    20. [Chart: Spin - 1 vs. Co-Op throughput in million ops/sec (y-axis 0 to 30) against 1 to 8 producer threads]
    21. Algorithm: Buffer
        public long next() {
            long next;
            long current;
            do {
                current = cursor.get();
                next = current + 1;
                while (next > (readSequence.get() + size)) {
                    LockSupport.parkNanos(1L);
                    continue;
                }
            } while (!cursor.compareAndSet(current, next));
            return next;
        }
    22. Algorithm: Buffer
        public void publish(long sequence) {
            int publishedValue = (int) (sequence >>> indexShift);
            published.set(indexOf(sequence), publishedValue);
        }

        // Get Value
        int availableValue = (int) (current >>> indexShift);
        int index = indexOf(current);
        while (published.get(index) != availableValue) {
            // Spin
        }
    23. [Chart: Spin - 1 vs. Co-Op vs. Buffer throughput in million ops/sec (y-axis 0 to 70) against 1 to 8 threads]
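The Buffer algorithm replaces the shared cursor with a per-slot availability flag, so producers never wait on each other to publish. Here is a minimal sketch of that bookkeeping, assuming a hypothetical `AvailabilityBufferDemo` class; the slides do not show how `indexOf`, `indexShift`, or the initial slot values are defined, so those parts are assumptions (power-of-two size, mask-based index, slots initialized to -1).

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical illustration of the availability-buffer idea; field setup is assumed.
final class AvailabilityBufferDemo {
    private final AtomicIntegerArray published; // one "lap number" per ring slot
    private final int indexMask;                // size - 1, valid for power-of-two sizes
    private final int indexShift;               // log2(size)

    AvailabilityBufferDemo(int sizePowerOfTwo) {
        published = new AtomicIntegerArray(sizePowerOfTwo);
        indexMask = sizePowerOfTwo - 1;
        indexShift = Integer.numberOfTrailingZeros(sizePowerOfTwo);
        for (int i = 0; i < sizePowerOfTwo; i++) {
            published.set(i, -1); // -1 means nothing has been published in this slot yet
        }
    }

    // Maps a sequence to its slot in the ring (assumed implementation).
    private int indexOf(long sequence) {
        return ((int) sequence) & indexMask;
    }

    // Marks a sequence as published by storing its lap number in its slot.
    void publish(long sequence) {
        int publishedValue = (int) (sequence >>> indexShift);
        published.set(indexOf(sequence), publishedValue);
    }

    // A consumer would spin on this until it returns true for the sequence it wants.
    boolean isAvailable(long sequence) {
        int availableValue = (int) (sequence >>> indexShift);
        return published.get(indexOf(sequence)) == availableValue;
    }
}
```

Storing the lap number rather than a boolean lets the consumer distinguish a slot published on the current lap from one left over from a previous trip around the ring.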
    24. Stuff that sucks...
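The speaker notes list the waiting options (hard spin, spin with yield, the PAUSE instruction) and ask for PAUSE to be added to Java. Java 9 and later expose a spin hint as `Thread.onSpinWait()`. The sketch below, assuming a hypothetical `BackoffWaitDemo` class and arbitrary retry thresholds, shows one way to combine that hint with yield and `LockSupport.parkNanos`; it is an illustration and not a strategy taken from the talk.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

// Hypothetical illustration of a spin / yield / park back-off; thresholds are arbitrary.
final class BackoffWaitDemo {
    static final int SPIN_TRIES = 100;  // assumed number of hinted spins before yielding
    static final int YIELD_TRIES = 100; // assumed number of yields before parking

    // Waits, without taking a lock, until cursor >= target; returns the value observed.
    static long waitFor(AtomicLong cursor, long target) {
        int attempts = 0;
        long seen;
        while ((seen = cursor.get()) < target) {
            if (attempts < SPIN_TRIES) {
                Thread.onSpinWait();        // spin hint (Java 9+)
            } else if (attempts < SPIN_TRIES + YIELD_TRIES) {
                Thread.yield();             // spin with yield
            } else {
                LockSupport.parkNanos(1L);  // back off and let the OS schedule others
            }
            attempts++;
        }
        return seen;
    }

    // Starts a producer that advances the cursor to 999 and waits for it.
    static long demo() {
        AtomicLong cursor = new AtomicLong(-1);
        Thread producer = new Thread(() -> {
            for (long i = 0; i < 1000; i++) {
                cursor.lazySet(i);
            }
        });
        producer.start();
        long seen = waitFor(cursor, 999);
        try {
            producer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

The trade-off is latency against CPU use: the hinted spin reacts fastest, while parking frees the core at the cost of a slower wake-up.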
    25. Q&A
        • https://github.com/mikeb01/jax2012
        • http://www.lmax.com/careers
        • http://www.infoq.com/presentations/Lock-free-Algorithms
        • http://www.youtube.com/watch?v=DCdGlxBbKU4
