Practical Solutions for Multicore Programming

Dr. Guy Korland

Process 1                Process 2
a = acc.get()
a = a + 100
                         b = acc.get()
                         b = b + 50
                         acc.set(b)
acc.set(a)
... Lost Update! ...
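
The interleaving above can be replayed deterministically in a single thread. This is a minimal sketch, assuming a hypothetical `Account` class whose `get`/`set` stand in for the slide's `acc`, and an assumed starting balance of 0.

```java
// Hypothetical account; get/set mirror the acc.get()/acc.set() calls on the slide.
final class Account {
    private long balance;
    long get() { return balance; }
    void set(long value) { balance = value; }
}

final class LostUpdateDemo {
    // Replays the slide's interleaving step by step and returns the final balance.
    static long replay() {
        Account acc = new Account();   // assumed initial balance: 0
        long a = acc.get();            // Process 1 reads 0
        a = a + 100;
        long b = acc.get();            // Process 2 also reads 0
        b = b + 50;
        acc.set(b);                    // Process 2 writes 50
        acc.set(a);                    // Process 1 overwrites it with 100
        return acc.get();
    }

    public static void main(String[] args) {
        // A serial execution would end at 150; this interleaving loses the +50.
        System.out.println("final balance = " + replay());
    }
}
```

The final balance is 100 rather than 150, because Process 1 computed its result from a value that Process 2 had already replaced.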

Process 1                Process 2
lock(A)                  lock(B)
….                       ….
lock(B)                  lock(A)
... DeadLock! ...
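
A deadlock of this kind can be observed without hanging the program by replacing the second `lock` with a timed `tryLock`. The sketch below assumes the classic opposite-order pattern (one process takes A then B, the other B then A); the class name, latch names, and the 100 ms timeout are illustrative choices, not part of the talk.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

final class DeadlockDemo {
    // Returns, for each of the two threads, whether it managed to get its second lock.
    static boolean[] run() {
        ReentrantLock lockA = new ReentrantLock();
        ReentrantLock lockB = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] gotSecond = new boolean[2];

        Thread p1 = new Thread(() -> attempt(lockA, lockB, bothHoldFirst, bothTried, gotSecond, 0));
        Thread p2 = new Thread(() -> attempt(lockB, lockA, bothHoldFirst, bothTried, gotSecond, 1));
        p1.start();
        p2.start();
        try {
            p1.join();
            p2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return gotSecond;
    }

    private static void attempt(ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                                boolean[] gotSecond, int id) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();                 // wait until the other thread holds its first lock
            boolean ok = second.tryLock(100, TimeUnit.MILLISECONDS);
            gotSecond[id] = ok;
            if (ok) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();                     // keep holding 'first' until the other thread has tried too
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("P1 got second lock: " + result[0] + ", P2 got second lock: " + result[1]);
    }
}
```

Neither thread obtains its second lock. With plain `lock()` calls in place of the timed attempt, both would wait forever.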

Process 1                Process 2
atomic{                  atomic{
a = acc.get()            b = acc.get()
a = a + 100              b = b + 50
acc.set(a)               acc.set(b)
}                        }

... WIIIII! ...
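
Java has no `atomic{}` block, so the following is only an analogy and not an STM. For a single variable, the same read, compute, commit-or-retry behavior can be sketched with the JDK's `AtomicLong.compareAndSet`; the class and method names are illustrative.

```java
import java.util.concurrent.atomic.AtomicLong;

final class AtomicUpdateDemo {
    // Read, compute, then commit only if nobody changed the value in between; otherwise retry.
    static void add(AtomicLong acc, long delta) {
        while (true) {
            long seen = acc.get();
            long updated = seen + delta;
            if (acc.compareAndSet(seen, updated)) {
                return;
            }
        }
    }

    // Two threads apply +100 and +50 repeatedly; no update may be lost.
    static long run(int rounds) {
        AtomicLong acc = new AtomicLong();
        Thread p1 = new Thread(() -> { for (int i = 0; i < rounds; i++) add(acc, 100); });
        Thread p2 = new Thread(() -> { for (int i = 0; i < rounds; i++) add(acc, 50); });
        p1.start();
        p2.start();
        try {
            p1.join();
            p2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acc.get();
    }

    public static void main(String[] args) {
        System.out.println(run(10_000));
    }
}
```

An STM generalizes this idea from one variable to an arbitrary set of reads and writes inside the block.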

Intel TSX
if(_xbegin()==-1) {
    if( !fallback_mutex.is_acquired() ) {
        sums[mygroup] += data[i];
    } else {
        _xabort(1);
    }
    _xend();
} else {
    fallback_mutex.acquire();
    sums[mygroup] += data[i];
    fallback_mutex.release();
}

● Limited to simple instructions.
● Requires fall-back.
● Relying on Cache Coherency.

We still need
Software Transactional Memory

DSTM2

Maurice Herlihy et al, A flexible framework … [OOPSLA06]

@atomic public interface INode{
    int getValue ();
    void setValue (int value );
}

Factory<INode> factory = Thread.makeFactory(INode.class );
final INode node = factory.create();
boolean result = Thread.doIt(new Callable<Boolean>() {
    public Boolean call () {
        node.setValue(value);
        return true;
    }
});

● Limited to Objects.
● Very intrusive.
● Doesn't support libraries.
● Bad performance (fork).

JVSTM

João Cachopo and António Rito-Silva, Versioned boxes as the
basis for memory transactions [SCOOL05]

public class Account{
    private VBox<Long> balance = new VBox<Long>();

    public @Atomic void withdraw(long amount) {
        balance.put(balance.get() - amount);
    }
}

● Doesn't support libraries.
● Less intrusive.
● Need to “Announce” shared fields.

Atom-Java

B. Hindman and D. Grossman. Atomicity via source-to-source
translation. [MSPC06]

public void update ( double value) {
    Atomic {
        commission += value;
    }
}

● Add a reserved word.
● Need pre-compilation.
● Doesn't support libraries.
● Even Less intrusive.

Deuce STM - API
G. Korland, N. Shavit and P. Felber, “Noninvasive Java
Concurrency with Deuce STM”, [MultiProg '10]

public class Bank{
    private double commission = 10;

    @Atomic
    public void transaction( Account ac1, Account ac2, double amount){
        ac1.balance -= (amount + commission);
        ac2.balance += amount;
    }
}

● No reserved words.
● Annotation based.
● No need for pre compilation.
● Supports external libraries.
● Extendable – research tool

Benchmarks

(Sun UltraSPARC T2 Plus – 2 x Quad x 8 HT)

Benchmarks

(Azul – Vega2 – 2 x 48)

Benchmark - the dark side
[Chart: y-axis 0 to 1.2, x-axis 1 to 10]

Overhead
● Contention – Retries, Aborts, Contention Manager …
● STM Algorithm – Data structures, optimistic, pessimistic…
● Semantic – Consistency model, Privatization…
● Instrumented Memory access – Linear overhead on every read/write

Static analysis Optimizations
1. Avoiding instrumentation of accesses to immutable and
transaction-local memory.
2. Avoiding lock acquisition and releases for thread-local memory.
3. Avoiding readset population in read-only transactions.

Novel Static analysis
Optimizations

Y. Afek, G. Korland, and A. Zilberstein,
“Lowering STM Overhead with Static Analysis”, LCPC'10

1. Reduce amount of instrumented memory reads using load
elimination.
2. Reduce amount of instrumented memory writes using scalar
promotion.
3. Avoid writeset lookups for memory not yet written to.
4. Avoid writeset record keeping for memory that will not be read.
5. Reduce false conflicts by Transaction re-scoping.
...
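
To make items 1 and 2 concrete, the sketch below counts instrumented accesses before and after applying load elimination and scalar promotion by hand. The `txRead...`/`txWrite...` helpers and their counters are hypothetical stand-ins for the barriers an STM inserts; they are not the Deuce API, and the transformation is only valid inside a transaction, where the STM guarantees a consistent view of the fields.

```java
final class InstrumentationDemo {
    static final class Shared {
        int scale = 3;
        int total = 0;
    }

    // Hypothetical barrier counters standing in for STM read/write instrumentation.
    static int reads;
    static int writes;

    static void reset() { reads = 0; writes = 0; }

    static int txReadScale(Shared s) { reads++; return s.scale; }
    static int txReadTotal(Shared s) { reads++; return s.total; }
    static void txWriteTotal(Shared s, int value) { writes++; s.total = value; }

    // Naive instrumentation: every field access in the loop goes through a barrier.
    static int naive(Shared s, int[] data) {
        for (int x : data) {
            txWriteTotal(s, txReadTotal(s) + x * txReadScale(s));
        }
        return s.total;   // result read directly, left uncounted for the comparison
    }

    // After load elimination (scale is read once) and scalar promotion
    // (total lives in a local variable and is written back once).
    static int optimized(Shared s, int[] data) {
        int scale = txReadScale(s);
        int total = txReadTotal(s);
        for (int x : data) {
            total += x * scale;
        }
        txWriteTotal(s, total);
        return total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        reset();
        int first = naive(new Shared(), data);
        System.out.println("naive: result=" + first + " reads=" + reads + " writes=" + writes);
        reset();
        int second = optimized(new Shared(), data);
        System.out.println("optimized: result=" + second + " reads=" + reads + " writes=" + writes);
    }
}
```

For four elements the naive version performs 8 instrumented reads and 4 instrumented writes, while the transformed version performs 2 reads and 1 write for the same result.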

We still need
Fine-Grained
Concurrent Data Structures

e.g. Pool
[Diagram: producers P1, P2, …, Pn call Put(x), Put(y), Put(z) on a shared pool; consumers C1, C2, …, Cn call Get( ).]

Java - pools
1. SynchronousQueue/Stack - pairing up function without buffering.
Producers and consumers wait for one another.
2. LinkedBlockingQueue - Producers put their value and leave,
Consumers wait for a value to become available.
3. ConcurrentLinkedQueue - Producers put their value and leave,
Consumers return null if the pool is empty.

● Limited Scalability.
● Pool doesn't need LIFO/FIFO.
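
The three behaviors can be checked directly against the `java.util.concurrent` classes. The sketch below uses only standard JDK calls; the wrapper class, method names, and the 50 ms wait cap are illustrative.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;

final class PoolBehaviorDemo {
    // SynchronousQueue has no buffer: a non-blocking offer fails unless a consumer is already waiting.
    static boolean offerWithoutConsumer() {
        return new SynchronousQueue<String>().offer("x");
    }

    // LinkedBlockingQueue buffers: the producer puts its value and leaves, the consumer finds it later.
    static String bufferedHandoff() {
        LinkedBlockingQueue<String> pool = new LinkedBlockingQueue<>();
        pool.offer("x");
        return pool.poll();
    }

    // On an empty LinkedBlockingQueue the consumer waits; here the wait is capped at 50 ms.
    static String timedWaitOnEmpty() {
        try {
            return new LinkedBlockingQueue<String>().poll(50, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    // ConcurrentLinkedQueue never blocks: an empty pool yields null immediately.
    static String pollEmptyConcurrentQueue() {
        return new ConcurrentLinkedQueue<String>().poll();
    }

    public static void main(String[] args) {
        System.out.println("SynchronousQueue offer accepted: " + offerWithoutConsumer());
        System.out.println("LinkedBlockingQueue handoff: " + bufferedHandoff());
        System.out.println("LinkedBlockingQueue timed wait on empty: " + timedWaitOnEmpty());
        System.out.println("ConcurrentLinkedQueue poll on empty: " + pollEmptyConcurrentQueue());
    }
}
```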

ED-Tree
Scalable Producer-Consumer Pools Based on Elimination-Diffraction Trees
(Y. Afek, G. Korland, M. Natanzon, N. Shavit)

● Merge:
● Diffracting-Tree Structure (Shavit and Zemach)
● Elimination-Tree Structure (Shavit and Touitou)
● LinkedBlockingQueue

Do we really need Linearizability?

The solution:
Relax the Linearizability Requirements
Y. Afek, G. Korland, and A. Yanovsky,
“Quasi-Linearizability: relaxed consistency for improved concurrency”,
OPODIS'10

e.g. Task Queue
[Diagram: Task Producers add tasks at the Tail; Task Consumers remove tasks from the Head.]

K-Quasi Task Queue
[Diagram: task queue with a window of k tasks at the Head; each Consumer takes a task from that window.]
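
One loose, sequential reading of the relaxed queue is sketched below: `dequeue` may return any of the k oldest items instead of strictly the oldest. This is an illustration of the relaxed semantics only. It is not thread-safe, it is not the concurrent algorithm from the paper, and the class name and the seeded random choice are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sequential sketch only: dequeue may return any of the k oldest items.
final class KQuasiQueue<T> {
    private final List<T> items = new ArrayList<>();
    private final int k;
    private final Random random;

    KQuasiQueue(int k, long seed) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be >= 1");
        }
        this.k = k;
        this.random = new Random(seed);
    }

    void enqueue(T item) {
        items.add(item);
    }

    // Removes a random element from the window holding the k oldest items; null when empty.
    T dequeue() {
        if (items.isEmpty()) {
            return null;
        }
        int window = Math.min(k, items.size());
        return items.remove(random.nextInt(window));
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        KQuasiQueue<Integer> queue = new KQuasiQueue<>(3, 42L);
        for (int i = 0; i < 10; i++) {
            queue.enqueue(i);
        }
        while (queue.size() > 0) {
            System.out.print(queue.dequeue() + " ");
        }
        System.out.println();
    }
}
```

With k = 1 the sketch degenerates to a strict FIFO queue. Larger k gives consumers more candidates to choose from, which is what lets a concurrent implementation reduce contention at the head.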

Quasi Linearizable Definition

H’ = 1 2 3 4 5 6
H  = 4 1 2 3 5 6
Distance 3
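
The distance in this example can be computed as the largest number of positions any single element moved between the two orderings. This is a minimal sketch of that reading of the slide, with illustrative names; the paper's formal definition covers more than this one-line measure.

```java
import java.util.Arrays;
import java.util.List;

final class HistoryDistance {
    // Largest number of positions any single element moved between the two orderings.
    static int distance(List<Integer> reference, List<Integer> observed) {
        if (reference.size() != observed.size()) {
            throw new IllegalArgumentException("histories must have the same length");
        }
        int max = 0;
        for (int i = 0; i < reference.size(); i++) {
            int j = observed.indexOf(reference.get(i));
            if (j < 0) {
                throw new IllegalArgumentException("element missing from observed history");
            }
            max = Math.max(max, Math.abs(i - j));
        }
        return max;
    }

    public static void main(String[] args) {
        List<Integer> hPrime = Arrays.asList(1, 2, 3, 4, 5, 6);
        List<Integer> h = Arrays.asList(4, 1, 2, 3, 5, 6);
        System.out.println("distance = " + distance(hPrime, h));
    }
}
```

Element 4 moves three positions (from fourth to first) while elements 1, 2, and 3 each move one, so the maximum is 3.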

More motivation...
● Statistical Counter
● ID generator
● Web Cache
