3. Ordering
Program Order:        Execution Order (maybe):
int w = 10;           int x = 20;
int x = 20;           int y = 30;
int y = 30;           int b = x * y;
int z = 40;
                      int w = 10;
int a = w + z;        int z = 40;
int b = x * y;        int a = w + z;
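A minimal sketch of why such a reordering is permitted: within a single thread, both orders leave a and b with the same values, so the compiler or CPU may choose either. The class and method names here are hypothetical and exist only for illustration.

```java
// Hypothetical illustration: both statement orders give the same single-threaded result.
class OrderingSketch {
    // Statements in program order, as written in the source.
    static int[] programOrder() {
        int w = 10;
        int x = 20;
        int y = 30;
        int z = 40;
        int a = w + z;
        int b = x * y;
        return new int[] {a, b};
    }

    // One possible execution order; data dependencies are still respected
    // (a needs w and z, b needs x and y).
    static int[] possibleExecutionOrder() {
        int x = 20;
        int y = 30;
        int b = x * y;
        int w = 10;
        int z = 40;
        int a = w + z;
        return new int[] {a, b};
    }

    public static void main(String[] args) {
        int[] p = programOrder();
        int[] e = possibleExecutionOrder();
        System.out.println("program order:   a=" + p[0] + " b=" + p[1]);
        System.out.println("execution order: a=" + e[0] + " b=" + e[1]);
    }
}
```

The difference only becomes observable when another thread reads these variables without synchronisation.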
6. Increment a Counter
static long foo = 0;
private static void increment() {
for (long l = 0; l < 500000000L; l++) {
foo++;
}
}
7. Using a Lock
public static long foo = 0;
public static Lock lock = new ReentrantLock();
private static void increment() {
for (long l = 0; l < 500000000L; l++) {
lock.lock();
try {
foo++;
} finally {
lock.unlock();
}
}
}
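A small harness, assuming java.util.concurrent.locks.ReentrantLock as the Lock implementation, shows the locked increment run from two threads. The class name, the runTwoThreads helper and the reduced iteration count are assumptions made for illustration, not part of the original benchmark.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical harness: two threads share one counter guarded by a ReentrantLock.
class LockedCounterSketch {
    private static final Lock lock = new ReentrantLock();
    private static long foo = 0;

    private static void increment(long iterations) {
        for (long l = 0; l < iterations; l++) {
            lock.lock();
            try {
                foo++;
            } finally {
                lock.unlock();
            }
        }
    }

    // Runs the loop on two threads and returns the final count.
    static long runTwoThreads(long iterationsPerThread) {
        foo = 0; // written before start(), so both threads see the reset
        Thread t1 = new Thread(() -> increment(iterationsPerThread));
        Thread t2 = new Thread(() -> increment(iterationsPerThread));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return foo; // join() makes the threads' writes visible here
    }

    public static void main(String[] args) {
        // Far fewer iterations than the 500 000 000 used in the talk.
        System.out.println(runTwoThreads(1_000_000L));
    }
}
```

Because every increment happens under the lock, no updates are lost and the result equals twice the per-thread iteration count.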
8. Using an AtomicLong
static AtomicLong foo = new AtomicLong(0);
private static void increment() {
for (long l = 0; l < 500000000L; l++) {
foo.getAndIncrement();
}
}
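The same experiment can be sketched with an AtomicLong shared by two threads. As before, the class name, helper method and iteration count are hypothetical choices for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical harness: two threads increment one shared AtomicLong.
class AtomicCounterSketch {
    private static final AtomicLong foo = new AtomicLong(0);

    private static void increment(long iterations) {
        for (long l = 0; l < iterations; l++) {
            foo.getAndIncrement(); // atomic read-modify-write, no lock needed
        }
    }

    // Runs the loop on two threads and returns the final count.
    static long runTwoThreads(long iterationsPerThread) {
        foo.set(0);
        Thread t1 = new Thread(() -> increment(iterationsPerThread));
        Thread t2 = new Thread(() -> increment(iterationsPerThread));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return foo.get();
    }

    public static void main(String[] args) {
        // Far fewer iterations than the 500 000 000 used in the talk.
        System.out.println(runTwoThreads(1_000_000L));
    }
}
```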
14. The Cost of Contention
Increment a counter 500 000 000 times.
● One Thread : 300 ms
● One Thread (volatile): 4 700 ms (15x)
● One Thread (Atomic) : 5 700 ms (19x)
● One Thread (Lock) : 10 000 ms (33x)
● Two Threads (Atomic) : 30 000 ms (100x)
● Two Threads (Lock) : 224 000 ms (746x), roughly 4 minutes!
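A rough sketch of how timings like these might be collected, using System.nanoTime around each loop. The class and method names are hypothetical, the iteration count is reduced, and a naive harness like this is at the mercy of JIT warm-up and loop optimisation, so its numbers should be treated as indicative only.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical timing harness; not the benchmark used for the figures above.
class ContentionTimingSketch {
    static long plain = 0;
    static final AtomicLong atomic = new AtomicLong(0);

    // Returns the elapsed wall-clock time of the task in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    static void incrementPlain(long iterations) {
        for (long l = 0; l < iterations; l++) {
            plain++;
        }
    }

    static void incrementAtomic(long iterations) {
        for (long l = 0; l < iterations; l++) {
            atomic.getAndIncrement();
        }
    }

    public static void main(String[] args) {
        long n = 50_000_000L; // assumption: ten times fewer iterations than the talk
        long plainMs = timeMillis(() -> incrementPlain(n));
        long atomicMs = timeMillis(() -> incrementAtomic(n));
        System.out.println("One Thread          : " + plainMs + " ms");
        System.out.println("One Thread (Atomic) : " + atomicMs + " ms");
    }
}
```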
15. Parallel v. Serial - String Splitting
Guy Steele @ Strange Loop:
http://www.infoq.com/presentations/Thinking-Parallel-Programming
Scala Implementation and Brute Force version in Java:
https://github.com/mikeb01/folklore/
29. Ordering and Visibility
private static final int SIZE = 32;
private final Object[] data = new Object[SIZE];
private volatile long sequence = -1;
private long nextValue = -1;
public void publish(Object value) {
long index = ++nextValue;
data[(int)(index % SIZE)] = value;
sequence = index;
}
public Object get(long index) {
if (index <= sequence) {
return data[(int)(index % SIZE)];
}
return null;
}
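To see the publish/get pair in action, the fields above can be wrapped in a class with one publishing thread and one reading thread. This is a sketch under assumptions: the class name and the publishAndConsume helper are hypothetical, there is a single publisher, and the number of items is capped at SIZE so no slot is overwritten before it is read.

```java
// Hypothetical wrapper around the publish/get code shown above.
class SequencedBufferSketch {
    private static final int SIZE = 32;
    private final Object[] data = new Object[SIZE];
    private volatile long sequence = -1;
    private long nextValue = -1; // touched only by the single publishing thread

    public void publish(Object value) {
        long index = ++nextValue;
        data[(int) (index % SIZE)] = value;
        sequence = index; // volatile write: the array store above becomes visible with it
    }

    public Object get(long index) {
        if (index <= sequence) { // volatile read pairs with the write in publish
            return data[(int) (index % SIZE)];
        }
        return null;
    }

    // One thread publishes count values while the calling thread spins reading them.
    static long[] publishAndConsume(int count) {
        if (count > SIZE) {
            throw new IllegalArgumentException("count must not exceed " + SIZE);
        }
        SequencedBufferSketch buffer = new SequencedBufferSketch();
        Thread producer = new Thread(() -> {
            for (long i = 0; i < count; i++) {
                buffer.publish(Long.valueOf(i * 10));
            }
        });
        producer.start();
        long[] seen = new long[count];
        for (int i = 0; i < count; i++) {
            Object v;
            while ((v = buffer.get(i)) == null) {
                Thread.yield(); // not published yet, try again
            }
            seen[i] = (Long) v;
        }
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        long[] seen = publishAndConsume(SIZE);
        System.out.println("last value read: " + seen[seen.length - 1]);
    }
}
```

The reader never sees a null or stale slot for an index at or below sequence, because the array store is ordered before the volatile write.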
30. Ordering and Visibility - Store
mov $0x1,%ecx
add 0x18(%rsi),%rcx ;*ladd
;...
lea (%r12,%r8,8),%r11 ;*getfield data
;...
mov %r12b,(%r11,%r10,1)
mov %rcx,0x10(%rsi)
lock addl $0x0,(%rsp) ;*ladd
32. Look Ma' No Memory Barrier
AtomicLong sequence = new AtomicLong(-1);
public void publish(Object value) {
long index = ++nextValue;
data[(int)(index % SIZE)] = value;
sequence.lazySet(index);
}
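A self-contained variant of the lazySet version, again as a sketch: the class name and the publishAndConsume helper are hypothetical, a single publisher is assumed, and the item count is capped at SIZE. AtomicLong.lazySet performs an ordered store without the full barrier of a volatile write, so the reader may observe the new sequence slightly later, but never before the array slot has been written.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical wrapper around the lazySet-based publish shown above.
class LazySetBufferSketch {
    private static final int SIZE = 32;
    private final Object[] data = new Object[SIZE];
    private final AtomicLong sequence = new AtomicLong(-1);
    private long nextValue = -1; // touched only by the single publishing thread

    public void publish(Object value) {
        long index = ++nextValue;
        data[(int) (index % SIZE)] = value;
        sequence.lazySet(index); // ordered store: the array write cannot move after it
    }

    public Object get(long index) {
        if (index <= sequence.get()) {
            return data[(int) (index % SIZE)];
        }
        return null;
    }

    // One thread publishes count values while the calling thread spins reading them.
    static Object[] publishAndConsume(int count) {
        if (count > SIZE) {
            throw new IllegalArgumentException("count must not exceed " + SIZE);
        }
        LazySetBufferSketch buffer = new LazySetBufferSketch();
        Thread producer = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                buffer.publish("item-" + i);
            }
        });
        producer.start();
        Object[] seen = new Object[count];
        for (int i = 0; i < count; i++) {
            Object v;
            while ((v = buffer.get(i)) == null) {
                Thread.yield(); // not visible yet, try again
            }
            seen[i] = v;
        }
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        Object[] seen = publishAndConsume(SIZE);
        System.out.println("last value read: " + seen[seen.length - 1]);
    }
}
```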
34. Cache Line Padding
public class PaddedAtomicLong extends AtomicLong {
public volatile long p1, p2, p3, p4, p5, p6 = 7L;
//... lines omitted
public long sumPaddingToPreventOptimisation() {
return p1 + p2 + p3 + p4 + p5 + p6;
}
}
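A compilable sketch of the padding idea, with two threads each updating their own padded counter. The class name, the constructor and the incrementSeparately helper are assumptions for illustration and do not reconstruct the omitted lines. The JVM is free to lay out fields as it sees fit, so the padding is a best effort to keep the two values on different cache lines, not a guarantee.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical padded counter modelled on the class above.
class PaddedCounterSketch extends AtomicLong {
    // Extra longs intended to push neighbouring objects onto another cache line.
    public volatile long p1, p2, p3, p4, p5, p6 = 7L;

    PaddedCounterSketch(long initialValue) {
        super(initialValue);
    }

    // Reading the padding fields discourages the JIT from eliminating them.
    public long sumPaddingToPreventOptimisation() {
        return p1 + p2 + p3 + p4 + p5 + p6;
    }

    // Two threads, each incrementing its own counter; returns both final values.
    static long[] incrementSeparately(long iterations) {
        PaddedCounterSketch a = new PaddedCounterSketch(0);
        PaddedCounterSketch b = new PaddedCounterSketch(0);
        Thread t1 = new Thread(() -> {
            for (long l = 0; l < iterations; l++) {
                a.getAndIncrement();
            }
        });
        Thread t2 = new Thread(() -> {
            for (long l = 0; l < iterations; l++) {
                b.getAndIncrement();
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] {a.get(), b.get()};
    }

    public static void main(String[] args) {
        long[] counts = incrementSeparately(1_000_000L);
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

Timing this against two plain AtomicLong instances is one way to test whether false sharing matters on a given machine.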
35. In Summary
● Concurrency is a tool
● Ordering and visibility are the key challenges
● For performance the details matter
● Don't believe everything you read
○ Come up with your own theories and test them!