Understanding the Disruptor

Presented to the London Java Community on the 11th October 2011.

  1. Understanding the Disruptor: A Beginner's Guide to Hardcore Concurrency
  2. Why is concurrency so difficult?
  3. Ordering

         Program Order:      Execution Order (maybe):
         int w = 10;         int x = 20;
         int x = 20;         int y = 30;
         int y = 30;         int b = x * y;
         int z = 40;         int w = 10;
         int a = w + z;      int z = 40;
         int b = x * y;      int a = w + z;
  4. Visibility
  5. Why should we care about the details?
  6. Increment a Counter

         static long foo = 0;

         private static void increment() {
             for (long l = 0; l < 500000000L; l++) {
                 foo++;
             }
         }
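  7. Using a Lock

         public static long foo = 0;
         // Lock is an interface, so a concrete implementation is needed;
         // ReentrantLock (java.util.concurrent.locks) is used here.
         public static Lock lock = new ReentrantLock();

         private static void increment() {
             for (long l = 0; l < 500000000L; l++) {
                 lock.lock();
                 try {
                     foo++;
                 } finally {
                     lock.unlock();
                 }
             }
         }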
  8. Using an AtomicLong

         static AtomicLong foo = new AtomicLong(0);

         private static void increment() {
             for (long l = 0; l < 500000000L; l++) {
                 foo.getAndIncrement();
             }
         }
  9. The Cost of Contention

      Increment a counter 500 000 000 times.
      ● One Thread            :     300 ms
  10. The Cost of Contention

      Increment a counter 500 000 000 times.
      ● One Thread            :     300 ms
      ● One Thread (volatile) :   4 700 ms (15x)
  11. The Cost of Contention

      Increment a counter 500 000 000 times.
      ● One Thread            :     300 ms
      ● One Thread (volatile) :   4 700 ms (15x)
      ● One Thread (Atomic)   :   5 700 ms (19x)
  12. The Cost of Contention

      Increment a counter 500 000 000 times.
      ● One Thread            :     300 ms
      ● One Thread (volatile) :   4 700 ms (15x)
      ● One Thread (Atomic)   :   5 700 ms (19x)
      ● One Thread (Lock)     :  10 000 ms (33x)
  13. The Cost of Contention

      Increment a counter 500 000 000 times.
      ● One Thread            :     300 ms
      ● One Thread (volatile) :   4 700 ms (15x)
      ● One Thread (Atomic)   :   5 700 ms (19x)
      ● One Thread (Lock)     :  10 000 ms (33x)
      ● Two Threads (Atomic)  :  30 000 ms (100x)
  14. The Cost of Contention

      Increment a counter 500 000 000 times.
      ● One Thread            :     300 ms
      ● One Thread (volatile) :   4 700 ms (15x)
      ● One Thread (Atomic)   :   5 700 ms (19x)
      ● One Thread (Lock)     :  10 000 ms (33x)
      ● Two Threads (Atomic)  :  30 000 ms (100x)
      ● Two Threads (Lock)    : 224 000 ms (746x)
                                ^^^^^^^^ ~4 minutes!!!
  15. Parallel v. Serial - String Splitting

      Guy Steele @ Strange Loop:
      http://www.infoq.com/presentations/Thinking-Parallel-Programming

      Scala Implementation and Brute Force version in Java:
      https://github.com/mikeb01/folklore/
  16. Performance Test

      Parallel (Scala): 440 ops/sec
      Serial (Java)   : 1768 ops/sec
  17. CPUs Are Getting Faster

      Single threaded string split on different CPUs
  18. What problem were we trying to solve?
  19. Classic Approach to the Problem
  20. The Problems We Found
  21. Why Queues Suck
  22. Why Queues Suck - Linked List
  23. Why Queues Suck - Linked List
  24. Contention Free Design
  25. Now our Pipeline Looks Like...
  26. How Fast Is It - Throughput
  27. How Fast Is It - Latency

                                      ABQ          Disruptor
      Min Latency (ns)                145          29
      Mean Latency (ns)               32 757       52
      99 Percentile Latency (ns)      2 097 152    128
      99.99 Percentile Latency (ns)   4 194 304    8 192
      Max Latency (ns)                5 069 086    175 567
  28. How does it all work?
  29. Ordering and Visibility

          private static final int SIZE = 32;
          private final Object[] data = new Object[SIZE];
          private volatile long sequence = -1;
          private long nextValue = -1;

          public void publish(Object value) {
              long index = ++nextValue;
              data[(int)(index % SIZE)] = value;
              sequence = index;
          }

          public Object get(long index) {
              if (index <= sequence) {
                  return data[(int)(index % SIZE)];
              }
              return null;
          }
  30. Ordering and Visibility - Store

          mov    $0x1,%ecx
          add    0x18(%rsi),%rcx        ;*ladd
          ;...
          lea    (%r12,%r8,8),%r11      ;*getfield data
          ;...
          mov    %r12b,(%r11,%r10,1)
          mov    %rcx,0x10(%rsi)
          lock addl $0x0,(%rsp)         ;*ladd
  31. Ordering and Visibility - Load

          mov    %eax,-0x6000(%rsp)
          push   %rbp
          sub    $0x20,%rsp             ;*synchronization entry
                                        ; - RingBuffer::get@-1 (line 17)
          mov    0x10(%rsi),%r10        ;*getfield sequence
                                        ; - RingBuffer::get@2 (line 17)
          cmp    %r10,%rdx
          jl     0x00007ff92505f22d     ;*iflt
                                        ; - RingBuffer::get@6 (line 17)
          mov    %edx,%r11d             ;*l2i
                                        ; - RingBuffer::get@14 (line 19)
  32. Look Ma No Memory Barrier

          AtomicLong sequence = new AtomicLong(-1);

          public void publish(Object value) {
              long index = ++nextValue;
              data[(int)(index % SIZE)] = value;
              sequence.lazySet(index);
          }
  33. False Sharing - Hidden Contention
  34. Cache Line Padding

          public class PaddedAtomicLong extends AtomicLong {
              public volatile long p1, p2, p3, p4, p5, p6 = 7L;

              //... lines omitted

              public long sumPaddingToPreventOptimisation() {
                  return p1 + p2 + p3 + p4 + p5 + p6;
              }
          }
  35. In Summary

      ● Concurrency is a tool
      ● Ordering and visibility are the key challenges
      ● For performance the details matter
      ● Don't believe everything you read
          ○ Come up with your own theories and test them!
  36. Q&A

      recruitment@lmax.com
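
The publish/get fragments on slides 29 and 32 can be tried out as one class. What follows is a minimal sketch, not code from the talk: the class name RingBufferSketch, the main method, and the "event-N" strings are made up for illustration. It assumes a single producer thread, and it publishes fewer events than SIZE because the slide code has no protection against the producer wrapping around and overwriting slots that have not been read yet.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical wrapper class around the fragments from slides 29 and 32.
public class RingBufferSketch {
    private static final int SIZE = 32;
    private final Object[] data = new Object[SIZE];

    // Highest index that is safe to read. Written only by the producer.
    private final AtomicLong sequence = new AtomicLong(-1);

    // Touched only by the single producer thread, so it is a plain field.
    private long nextValue = -1;

    public void publish(Object value) {
        long index = ++nextValue;
        data[(int) (index % SIZE)] = value; // write the slot first
        sequence.lazySet(index);            // then make it visible with an ordered store
    }

    public Object get(long index) {
        // Volatile read of sequence: if it has reached index, the slot
        // written before the matching lazySet is visible as well.
        if (index <= sequence.get()) {
            return data[(int) (index % SIZE)];
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        final RingBufferSketch buffer = new RingBufferSketch();
        final int count = 16; // assumption: kept below SIZE so nothing is overwritten

        Thread producer = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                buffer.publish("event-" + i);
            }
        });
        producer.start();

        // The main thread acts as the only consumer and spins until each
        // index has been published.
        for (long i = 0; i < count; i++) {
            Object value;
            while ((value = buffer.get(i)) == null) {
                Thread.yield();
            }
            System.out.println(value);
        }
        producer.join();
    }
}
```

Running main prints event-0 through event-15 in order. Replacing the AtomicLong with the volatile long from slide 29 gives the same output; the difference the slides point at is the cost of the store, not the result.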
