LLVM Greedy Register Allocation
Kai
hsiangkai@gmail.com
Outline
• Introduction to Register Allocation Problem
• LLVM Register Allocation Template Method
• LLVM Basic Register Allocation
• LLVM Greedy Register Allocation
Introduction to Register Allocation
• Definition
  • Register allocation is the problem of mapping program variables to either machine registers or memory addresses.
• Best solution
  • Minimise the number of loads/stores from/to memory
  • Finding the optimal allocation is NP-complete
int main()
{
    int i, j;
    int answer;
    for (i = 1; i < 10; i++)
        for (j = 1; j < 10; j++) {
            answer = i * j;
        }
    return 0;
}
_main:
@ BB#0:
    sub sp, #16
    movs r0, #0
    str r0, [sp, #12]
    movs r0, #1
    str r0, [sp, #8]
    b LBB0_2
LBB0_1:
    adds r1, #1
    str r1, [sp, #8]
LBB0_2:
    ldr r1, [sp, #8]
    cmp r1, #9
    bgt LBB0_6
@ BB#3:
    str r0, [sp, #4]
    b LBB0_5
LBB0_4:
    ldr r2, [sp, #4]
    muls r1, r2, r1
    str r1, [sp]
    ldr r1, [sp, #4]
    adds r1, #1
Graph Coloring
• For an arbitrary graph G, a coloring of G assigns a color to each node in G so that no two adjacent nodes have the same color.
[Figure: a 2-colorable graph and a 3-colorable graph]
Graph Coloring for RA
• Node: a live interval
• Edge: two live intervals interfere
• Color: a physical register
• Find an optimal coloring for the graph
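The node/edge/color mapping above can be sketched as a first-fit coloring. This is a minimal illustration, assuming live intervals are numbered 0..n-1 and interference is given as an adjacency list; `colorGreedy` and its parameters are hypothetical names, and first-fit is a simple heuristic, not an optimal coloring and not what LLVM's allocators do.

```cpp
#include <cstddef>
#include <vector>

// Returns, for each live interval (node), the index of the physical register
// (color) it received, or -1 if every register was taken by a neighbour and
// the interval would have to be spilled.
std::vector<int> colorGreedy(const std::vector<std::vector<int>>& interferes,
                             int numPhysRegs) {
  std::vector<int> color(interferes.size(), -1);
  for (std::size_t node = 0; node < interferes.size(); ++node) {
    // Mark the registers already used by interfering intervals.
    std::vector<bool> used(numPhysRegs, false);
    for (int other : interferes[node])
      if (color[other] >= 0)
        used[color[other]] = true;
    // Take the first register that no neighbour uses.
    for (int r = 0; r < numPhysRegs; ++r)
      if (!used[r]) {
        color[node] = r;
        break;
      }
  }
  return color;
}
```

With three mutually interfering intervals and only two registers, the third interval gets -1, which is the case the later slides handle by eviction, splitting, or spilling.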
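[Figure: control-flow graph with blocks B0, B1, B2, B3, containing the code below]
    …
    a0 = …
    b0 = …
    … = b0
    d0 = …
    c0 = …
    …
    d1 = c0
    … = a0
    … = d1
[Figure: the same control-flow graph with each variable replaced by its live interval]
    …
    LIa = …
    LIb = …
    … = LIb
    LIc = …
    …
    LId = LIc
    … = LIa
    … = LId
[Figure: interference graph over the nodes LIa, LIb, LIc, LId]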
LLVM Register Allocation
• Basic
  • Provides a minimal implementation of the basic register allocator.
• Greedy
  • Global live range splitting.
• Fast
  • Allocates registers one basic block at a time.
• PBQP
  • Partitioned Boolean Quadratic Programming (PBQP) based register allocator for LLVM.
Template Method
• Define the skeleton of an algorithm in an operation,
deferring some steps to subclasses.
LLVM Register Allocation Template Method
• seedLiveRegs: enqueue all LiveIntervals into the queue Q
• allocatePhysRegs: dequeue one LiveInterval and call selectOrSplit on it (the step customised by each new RA algorithm)
  • physical register is available: assign the physical register
  • split live interval: enqueue the split LiveIntervals
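// seedLiveRegs: enqueue the live interval of every virtual register,
// skipping registers that have no non-debug uses or defs.
for (unsigned i = 0, e = MRI->getNumVirtRegs(); i != e; ++i) {
  unsigned Reg = TargetRegisterInfo::index2VirtReg(i);
  if (MRI->reg_nodbg_empty(Reg))
    continue;
  enqueue(&LIS->getInterval(Reg));
}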
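The template-method structure can be sketched in a few lines. This is a toy model under stated assumptions: `Interval`, `AllocBase`, and `FirstFit` are hypothetical stand-ins (not LLVM classes), and only the names allocatePhysRegs, enqueue, dequeue, and selectOrSplit come from the slides. The base class fixes the seed/dequeue/assign/re-enqueue skeleton, and a subclass supplies the queue order and the selection policy.

```cpp
#include <deque>
#include <vector>

// Hypothetical stand-in for a live interval.
struct Interval {
  int id;
  int physReg;  // -1 while unassigned
};

class AllocBase {
public:
  virtual ~AllocBase() {}

  // The fixed skeleton: seed the queue, then drain it.
  void allocatePhysRegs(std::vector<Interval>& all) {
    for (Interval& li : all)
      enqueue(&li);  // corresponds to seedLiveRegs
    while (Interval* li = dequeue()) {
      std::vector<Interval*> splitProducts;
      int reg = selectOrSplit(*li, splitProducts);
      if (reg >= 0)
        li->physReg = reg;  // a physical register is available
      for (Interval* s : splitProducts)
        enqueue(s);  // split products are processed again
    }
  }

protected:
  // The steps each allocator customises.
  virtual void enqueue(Interval* li) = 0;
  virtual Interval* dequeue() = 0;
  virtual int selectOrSplit(Interval& li,
                            std::vector<Interval*>& splitProducts) = 0;
};

// Toy subclass: FIFO order, hands out registers until none are left,
// and never splits.
class FirstFit : public AllocBase {
  std::deque<Interval*> q;
  int nextFree = 0;
  int numRegs;

public:
  explicit FirstFit(int n) : numRegs(n) {}

protected:
  void enqueue(Interval* li) override { q.push_back(li); }
  Interval* dequeue() override {
    if (q.empty())
      return nullptr;
    Interval* li = q.front();
    q.pop_front();
    return li;
  }
  int selectOrSplit(Interval&, std::vector<Interval*>&) override {
    return nextFree < numRegs ? nextFree++ : -1;
  }
};
```

Swapping the queue for a priority queue and changing selectOrSplit is all a new allocator needs to do; the driver loop stays the same.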
Basic Register Allocation
LLVM Basic Register Allocation
• Calculate the LiveInterval weight
• seedLiveRegs: enqueue all LiveIntervals into a priority Q ordered by spill cost
• allocatePhysRegs: dequeue one LiveInterval and call RABasic::selectOrSplit on it (the step customised by the RABasic algorithm)
  • physical register is available: assign the physical register
  • split live interval: update LiveInterval.weight (spill cost) and enqueue the split LiveIntervals
struct CompSpillWeight {
  bool operator()(LiveInterval *A, LiveInterval *B) const {
    return A->weight < B->weight;
  }
};
1. Assign physical registers to the live intervals with the highest spill cost first.
2. If no physical register is free for the current live interval, pick the live interval with the highest spill cost among the current one and its interferences, and give it the physical register.
3. Spill the live intervals that remain unassigned.
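The ordering in step 1 falls out of the comparator: a `std::priority_queue` with a "less than" comparison on weight pops the largest weight first. The sketch below assumes a simplified, hypothetical `LiveInterval` with only an id and a weight; `dequeueOrder` is an illustrative helper, not an LLVM function.

```cpp
#include <queue>
#include <vector>

// Simplified stand-in for LLVM's LiveInterval (assumption: only the
// fields needed to show the queue order).
struct LiveInterval {
  int id;
  float weight;  // spill cost
};

// Same comparison as on the slide: "less" on weight makes the queue a
// max-heap, so the interval with the highest spill cost is on top.
struct CompSpillWeight {
  bool operator()(const LiveInterval* A, const LiveInterval* B) const {
    return A->weight < B->weight;
  }
};

// Returns the interval ids in the order the allocator would dequeue them.
std::vector<int> dequeueOrder(std::vector<LiveInterval>& intervals) {
  std::priority_queue<LiveInterval*, std::vector<LiveInterval*>,
                      CompSpillWeight> Q;
  for (LiveInterval& li : intervals)
    Q.push(&li);
  std::vector<int> order;
  while (!Q.empty()) {
    order.push_back(Q.top()->id);
    Q.pop();
  }
  return order;
}
```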
LiveInterval Weight
• Weight for one instruction with the register
  • weight = (isDef + isUse) * (Block Frequency / Entry Frequency)
  • loop induction variable: weight *= 3
• For all instructions with the register
  • totalWeight += weight
• Hint: totalWeight *= 1.01
• Re-materializable: totalWeight *= 0.5
• LiveInterval.weight = totalWeight / size of LiveInterval
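The formulas above can be combined into one function. This is a sketch that follows the slide's arithmetic only; `InstrRef`, `liveIntervalWeight`, and the way frequencies and the interval size are passed in are assumptions made for illustration, not LLVM's interface.

```cpp
#include <vector>

// One instruction that reads or writes the register (hypothetical record).
struct InstrRef {
  bool isDef;
  bool isUse;
  double blockFreq;       // frequency of the containing block
  bool isLoopInduction;   // register is a loop induction variable here
};

double liveIntervalWeight(const std::vector<InstrRef>& refs, double entryFreq,
                          bool hasHint, bool isRematerializable, double size) {
  double totalWeight = 0.0;
  for (const InstrRef& r : refs) {
    // weight = (isDef + isUse) * (Block Frequency / Entry Frequency)
    double weight =
        (int(r.isDef) + int(r.isUse)) * (r.blockFreq / entryFreq);
    if (r.isLoopInduction)
      weight *= 3;
    totalWeight += weight;
  }
  if (hasHint)
    totalWeight *= 1.01;
  if (isRematerializable)
    totalWeight *= 0.5;
  // Normalise by the size of the live interval.
  return totalWeight / size;
}
```

For example, a def in the entry block (frequency 1) plus a use in a block with frequency 8 gives totalWeight = 9; with an interval size of 3 the weight is 3, and marking the register re-materializable halves it to 1.5.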
Greedy Register Allocation
• Example (assign physical registers by length)
  [Figure: live intervals V1 to V5 placed over the registers Q0 (D0, D1) and Q1 (D2, D3)]
• No physical register for V1
• Evict V2 (evict the live interval with the lower spill cost)
  [Figure: the same layout, now also showing the stack]
• Split V2
  [Figure: V2 split into V2a, V2b, and V2c; the final frame also shows the stack]
Greedy RA Stages
• RS_New: created
• RS_Assign: enqueue
• RS_Split: need to split
• RS_Split2: used for split products that may not be making progress
• RS_Spill: need to spill
• RS_Done: assigned a physical register or created by spill
RS_Split2
• Live intervals created by a split are enqueued and processed again.
• There is a risk of creating infinite loops.
[Figure: a live interval before and after splitting, with the stage labels RS_New and RS_Split2]
    … = vreg1 …
    … = vreg1 …
    … = vreg1 …
    vreg2 = COPY vreg1
    … = vreg2 …
    vreg3 = COPY vreg1
    … = vreg3 …
    … = vreg3 …
Greedy Register Allocation
selectOrSplit(d):
1. Try to assign a physical register. If a register is found, done.
2. Try to evict interference to find a better register. If a register is found, done.
3. If stage < RS_Split, enter the RS_Split stage.
4. If stage >= RS_Done or the live interval is unspillable, try last chance recoloring: pick a physical register, evict all interference, and recurse into selectOrSplit(d+1).
5. If the stage is RS_Split or RS_Split2, split.
6. Otherwise, spill.
Last Chance Recoloring
• Try to assign a physical register to Live Interval by
evicting all its interferences.
• The recoloring process may recursively use the
last chance recoloring. Therefore, when a virtual
register has been assigned a color by this
mechanism, it is marked as Fixed.
vA can use {R1, R2    }
vB can use {    R2, R3}
vC can use {R1        }

selectOrSplit(d)      selectOrSplit(d + 1)
vA => R1              vA => R2
vB => R2              vB => R3
vC => fails           vC => R1 (fixed)
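The vA/vB/vC example can be replayed with a small backtracking sketch. It is a toy model under loud assumptions: every virtual register interferes with every other one, `ToyAllocator` and `selectOrRecolor` are hypothetical names, and the depth limit of 5 is arbitrary. It only illustrates the idea of evicting an interference, marking the newly colored register as fixed, and recursing with depth + 1; LLVM's implementation is considerably more involved.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

struct ToyAllocator {
  std::map<std::string, std::vector<int>> allowed;  // vreg -> usable physregs
  std::map<std::string, int> assigned;              // vreg -> physreg
  std::set<std::string> fixed;                      // colored by recoloring
  unsigned maxDepth = 5;                            // arbitrary recursion cap

  // Name of the vreg currently holding `reg`, or "" if the register is free.
  std::string holderOf(int reg) const {
    for (const auto& kv : assigned)
      if (kv.second == reg)
        return kv.first;
    return "";
  }

  // Plain assignment: take the first allowed register that is free.
  bool tryAssign(const std::string& v) {
    for (int r : allowed.at(v))
      if (holderOf(r).empty()) {
        assigned[v] = r;
        return true;
      }
    return false;
  }

  // Assign v, evicting and recursively recoloring a holder if needed.
  bool selectOrRecolor(const std::string& v, unsigned depth) {
    if (tryAssign(v))
      return true;
    if (depth >= maxDepth)
      return false;
    for (int r : allowed.at(v)) {
      std::string victim = holderOf(r);
      if (victim.empty() || fixed.count(victim))
        continue;  // fixed registers may not be evicted again
      std::map<std::string, int> savedAssigned = assigned;
      std::set<std::string> savedFixed = fixed;
      assigned.erase(victim);
      assigned[v] = r;
      fixed.insert(v);
      if (selectOrRecolor(victim, depth + 1))
        return true;
      assigned = savedAssigned;  // roll back and try the next register
      fixed = savedFixed;
    }
    return false;
  }
};
```

Starting from vA => R1 and vB => R2, calling `selectOrRecolor("vC", 0)` fixes vC on R1, pushes vA to R2, and pushes vB to R3, matching the right-hand column of the example.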
How to Split?
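• Is the stage RS_Spill or beyond? Yes: spill.
• Otherwise, is the live interval inside one BB?
  • Yes: tryLocalSplit. On success, done. Otherwise tryInstructionSplit. On success, done; otherwise spill.
  • No: if the stage is less than RS_Split2, tryRegionSplit. On success, done. If the stage is not less than RS_Split2, or tryRegionSplit fails, tryBlockSplit. On success, done; otherwise spill.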
tryLocalSplit
• Try to split virtual register interval into smaller
intervals inside its only basic block.
• calculate gap weights
• adjust the split region
Calculate Gap Weights
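[Figure: a VirtReg live interval with one define followed by four uses]
• NumGaps = 4: one gap between each pair of consecutive instructions (define, use, use, use, use).
• If a physical register is occupied by a VirtReg, every gap that the VirtReg's live interval overlaps gets that interval's LI.weight; the other gaps stay 0.
• If there is a fixed physical register live range, every gap it overlaps gets huge_valf; the other gaps keep their weight (LI.weight or 0).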
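A sketch of the gap-weight rule, assuming instructions are identified by integer positions and interference is given as half-open ranges `[start, end)`; `Interference`, `calcGapWeightsSketch`, and `kFixedInterference` are hypothetical names. A gap takes the largest weight among the ranges overlapping it, and a fixed physical register range is modelled by using `HUGE_VALF` as its weight.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Weight used for a fixed physical register range (huge_valf on the slide).
const float kFixedInterference = HUGE_VALF;

struct Interference {
  int start;     // first position covered
  int end;       // one past the last position covered
  float weight;  // LI.weight of the interfering VirtReg, or kFixedInterference
};

// uses: sorted positions of the define and uses; gap g lies between
// uses[g] and uses[g + 1].
std::vector<float> calcGapWeightsSketch(const std::vector<int>& uses,
                                        const std::vector<Interference>& intfs) {
  std::vector<float> gaps(uses.empty() ? 0 : uses.size() - 1, 0.0f);
  for (const Interference& I : intfs)
    for (std::size_t g = 0; g < gaps.size(); ++g)
      if (I.start < uses[g + 1] && I.end > uses[g])  // range overlaps gap g
        gaps[g] = std::max(gaps[g], I.weight);
  return gaps;
}
```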
Adjust Split Region
• Start with SplitBefore = 0 and SplitAfter = 1.
• Is the normalised spill weight > max gap?
  • Yes: if Diff > BestDiff, record BestBefore = SplitBefore and BestAfter = SplitAfter; then SplitAfter++
  • No: SplitBefore++
• normalised spill weight = spill cost / distance
  = (#gap * block_freq) / distance(SplitBefore, SplitAfter)
• The resulting split region runs from BestBefore to BestAfter.
[Figure: the intervals produced by the split, labelled RS_New (or RS_Split2) and RS_New]
Go through all physical registers.
Find the most critical range.
tryRegionSplit
• Use a Hopfield network to choose where to split.
• The network is guaranteed to converge to a local minimum, which is not necessarily the optimal split.
Hopfield Network
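\[
a^{(t)}_{s \times 1} =
\begin{cases}
p_{s \times 1} & : t = 0 \\
S\left(W_{s \times s} \times a^{(t-1)}_{s \times 1} + b_{s \times 1}\right) & : t \ge 1
\end{cases}
\]
\[
S(x) =
\begin{cases}
+1 & : x \ge \theta \\
-1 & : x < \theta
\end{cases}
\]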
tryRegionSplit
1. For every physical register, construct Hopfield Network
• Initialize border constraints
• Initialize Hopfield Network nodes according to
border constraints
• Add links to Hopfield Network and iterate
2. Get the best candidate
3. Do region split
Initialize Border Constraints
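• No Interference.
Entry = LiveIn ? PrefReg : DontCare;
Exit = LiveOut ? PrefReg : DontCare;
enum BorderConstraint {
  DontCare,  // Block doesn't care / variable not live.
  PrefReg,   // Block entry/exit prefers a register.
  PrefSpill, // Block entry/exit prefers a stack slot.
  PrefBoth,  // Block entry prefers both register and stack.
  MustSpill  // A register is impossible, variable must be spilled.
};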
Initialize Border Constraints
• There are Interferences.
[Figure: six cases that place the interference relative to FirstInstr and LastInstr of the block. Depending on where the interference falls, the border constraint becomes MustSpill, becomes PrefSpill, or stays PrefReg/DontCare.]
Edge Bundle
[Figure: control-flow graph of BB #0 to BB #6]
// Join the outgoing bundle with the ingoing bundles of all successors.
for (MachineBasicBlock::const_succ_iterator SI = MBB.succ_begin(),
       SE = MBB.succ_end(); SI != SE; ++SI)
  EC.join(OutE, 2 * (*SI)->getNumber());
EC:
Index  (block, side)  initial  after join  after compress
0      (BB#0, in)     0        0           0
1      (BB#0, out)    1        1           1
2      (BB#1, in)     2        1           1
3      (BB#1, out)    3        3           2
4      (BB#2, in)     4        3           2
5      (BB#2, out)    5        5           3
6      (BB#3, in)     6        5           3
7      (BB#3, out)    7        7           4
8      (BB#4, in)     8        7           4
9      (BB#4, out)    9        5           3
10     (BB#5, in)     10       7           4
11     (BB#5, out)    11       1           1
12     (BB#6, in)     12       3           2
13     (BB#6, out)    13       13          5
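void join(unsigned a, unsigned b) {
  unsigned eca = EC[a];
  unsigned ecb = EC[b];
  // Walk both chains towards their leaders; the larger leader is
  // repointed at the smaller one, which joins the two classes.
  while (eca != ecb) {
    if (eca < ecb) {
      EC[b] = eca;
      b = ecb;
      ecb = EC[b];
    } else {
      EC[a] = ecb;
      a = eca;
      eca = EC[a];
    }
  }
}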
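The EC table can be reproduced with a small equivalence-class sketch. `IntEqClassesSketch` and `edgeBundles` are hypothetical names; `join` follows the slide, while the `compress` step (renumbering leaders densely in index order) and the encoding "in = 2 * block, out = 2 * block + 1" are assumptions read off the table. The CFG edges used in the usage note are also inferred from the table, not stated on the slide.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

class IntEqClassesSketch {
  std::vector<unsigned> EC;
  unsigned NumClasses = 0;

public:
  // Every index starts in its own class.
  explicit IntEqClassesSketch(unsigned N) : EC(N) {
    for (unsigned i = 0; i != N; ++i)
      EC[i] = i;
  }

  // Same loop as on the slide: the larger leader is repointed at the smaller.
  void join(unsigned a, unsigned b) {
    unsigned eca = EC[a];
    unsigned ecb = EC[b];
    while (eca != ecb) {
      if (eca < ecb) {
        EC[b] = eca;
        b = ecb;
        ecb = EC[b];
      } else {
        EC[a] = ecb;
        a = eca;
        eca = EC[a];
      }
    }
  }

  // Renumber classes densely. Leaders always have a smaller index than
  // their members, so EC[EC[i]] is already renumbered when i is visited.
  void compress() {
    for (std::size_t i = 0; i != EC.size(); ++i)
      EC[i] = (EC[i] == i) ? NumClasses++ : EC[EC[i]];
  }

  unsigned operator[](std::size_t i) const { return EC[i]; }
};

// edges: (from block, to block). Returns the bundle number for each
// (block, in) at index 2 * block and (block, out) at index 2 * block + 1.
std::vector<unsigned> edgeBundles(
    unsigned numBlocks,
    const std::vector<std::pair<unsigned, unsigned>>& edges) {
  IntEqClassesSketch EC(2 * numBlocks);
  for (const auto& e : edges)
    EC.join(2 * e.first + 1, 2 * e.second);  // out(from) joins in(to)
  EC.compress();
  std::vector<unsigned> bundles(2 * numBlocks);
  for (std::size_t i = 0; i != bundles.size(); ++i)
    bundles[i] = EC[i];
  return bundles;
}
```

Calling `edgeBundles(7, ...)` with the edges 0→1, 1→2, 1→6, 2→3, 3→4, 3→5, 4→3, 5→1 yields the bundle numbers 0 1 1 2 2 3 3 4 4 3 4 1 2 5, the same as the last column of the EC table.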
Edge Bundle
[Figure: control-flow graph of BB #0 to BB #6]
Blocks:
Bundle #0: BB#0
Bundle #1: BB#0, BB#1, BB#5
Bundle #2: BB#1, BB#2, BB#6
Bundle #3: BB#2, BB#3, BB#4
Bundle #4: BB#3, BB#4, BB#5
Bundle #5: BB#6
Initialize Hopfield Network Node
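• Update BiasN and BiasP according to the BorderConstraint
[Figure: block BB #n (freq) containing "… = Y op …", between Bundle ib and Bundle ob]
• Entry constraint PrefReg: Bundle ib gets BiasP += freq
• Exit constraint PrefSpill: Bundle ob gets BiasN += freq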
void addBias(BlockFrequency freq, BorderConstraint direction) {
  switch (direction) {
  default:
    break;
  case PrefReg:
    BiasP += freq;
    break;
  case PrefSpill:
    BiasN += freq;
    break;
  case MustSpill:
    BiasN = BlockFrequency::getMaxFrequency(); // (uint64_t)-1ULL
    break;
  }
}
Add Links to Hopfield Network
• Add weight to links
• For a live-through block BB #n (freq), Bundle ib gets the link (freq, ob) and Bundle ob gets the link (freq, ib)
void addLink(unsigned b, BlockFrequency w) {
  // Update cached sum.
  SumLinkWeights += w;
  // There can be multiple links to the same bundle, add them up.
  for (LinkVector::iterator I = Links.begin(), E = Links.end(); I != E; ++I)
    if (I->second == b) {
      I->first += w;
      return;
    }
  // This must be the first link to b.
  Links.push_back(std::make_pair(w, b));
}
Update Hopfield Network
[Figure: node Bundle X (BiasN, BiasP, Value = 0) linked to Bundle A (Value = -1), Bundle B (Value = 1), Bundle C (Value = 1), and Bundle D (Value = 1)]
Links of Bundle X: (freqA, A) (freqB, B) (freqC, C) (freqD, D)
SumN = BiasN + freqA
SumP = BiasP + freqB + freqC + freqD
if (SumN >= SumP + Threshold)
  Value = -1;
else if (SumP >= SumN + Threshold)
  Value = 1;
else
  Value = 0;
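The node update can be sketched as a small struct. The names BiasN, BiasP, Value, Links, SumN, SumP, and Threshold come from the slides; the `Node` layout, the plain `std::uint64_t` frequencies, and passing the whole node vector to `update` are simplifying assumptions, not LLVM's actual types.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct Node {
  std::uint64_t BiasN = 0;  // pull towards spilling (Value = -1)
  std::uint64_t BiasP = 0;  // pull towards a register (Value = +1)
  int Value = 0;
  std::vector<std::pair<std::uint64_t, unsigned>> Links;  // (weight, bundle)

  // Add weight w to the link to bundle b, merging repeated links.
  void addLink(unsigned b, std::uint64_t w) {
    for (std::size_t i = 0; i != Links.size(); ++i)
      if (Links[i].second == b) {
        Links[i].first += w;
        return;
      }
    Links.push_back(std::make_pair(w, b));
  }

  // Recompute Value from the biases and the neighbours' current values.
  // Returns true if the value changed.
  bool update(const std::vector<Node>& nodes, std::uint64_t Threshold) {
    std::uint64_t SumN = BiasN;
    std::uint64_t SumP = BiasP;
    for (std::size_t i = 0; i != Links.size(); ++i) {
      int v = nodes[Links[i].second].Value;
      if (v == -1)
        SumN += Links[i].first;
      else if (v == 1)
        SumP += Links[i].first;
    }
    int Before = Value;
    if (SumN >= SumP + Threshold)
      Value = -1;
    else if (SumP >= SumN + Threshold)
      Value = 1;
    else
      Value = 0;
    return Before != Value;
  }
};
```

Iterating `update` over all nodes until no call returns true is one way to run the network to a fixed point; the iteration order and stopping rule here are left to the caller.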
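\[
a^{(t)}_{s \times 1} =
\begin{cases}
p_{s \times 1} & : t = 0 \\
S\left(W_{s \times s} \times a^{(t-1)}_{s \times 1} + b_{s \times 1}\right) & : t \ge 1
\end{cases}
\]
\[
\begin{bmatrix}
\cdots & & & & \\
\cdots & & & & \\
\cdots & & & & \\
\cdots & & & & \\
F_A & F_B & F_C & F_D & 0
\end{bmatrix}
\times
\begin{bmatrix}
-1 \\ 1 \\ 1 \\ 1 \\ 0
\end{bmatrix}
+
\begin{bmatrix}
\vdots \\ Bias_p - Bias_n
\end{bmatrix}
\]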
Region Split
• splitLiveThroughBlock
• splitRegInBlock
• splitRegOutBlock
splitLiveThroughBlock
• Bundle ib Value == 1, Bundle ob Value != 1 (live through, LiveOut on stack): the new interval runs from Start to the first non-PHI
• Bundle ib Value != 1, Bundle ob Value == 1 (live through, LiveIn on stack): the new interval runs from the last split point to End
• Bundle ib Value == 1, Bundle ob Value == 1 (live through, no interference): the new interval runs from Start to End
• Bundle ib Value == 1, Bundle ob Value == 1 (live through, non-overlapping interference): two new intervals
  [Figure: the two new intervals drawn against Interference.first() and Interference.last()]
• Bundle ib Value == 1, Bundle ob Value == 1 (live through, overlapping interference): two new intervals
  [Figure: the two new intervals drawn against Interference.first() and Interference.last()]
splitRegInBlock
• Bundle ib Value == 1, no LiveOut, interference after kill
  [Figure: one new interval beginning at Start]
• Bundle ib Value == 1, Bundle ob Value != 1, LiveOut on stack, interference after last use
  [Figure: two variants, each with a new interval beginning at Start, drawn against Interference.first(), LastInstr, and the last split point]
• Bundle ib Value == 1, Bundle ob Value != 1, LiveOut on stack, interference overlapping uses
  [Figure: two variants with new intervals beginning at Start, drawn against Interference.first(), LastInstr, and the last split point]
splitRegOutBlock
• Bundle ob Value == 1, no LiveIn, interference before def
  [Figure: one new interval ending at End]
• Bundle ib Value != 1, Bundle ob Value == 1, live through, interference before def
  [Figure: a new interval ending at End, drawn against Interference.last() and FirstInstr]
• Bundle ib Value != 1, Bundle ob Value == 1, live through, interference overlapping uses
  [Figure: new intervals ending at End, drawn against Interference.last(), FirstInstr, and the last split point]

 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
Boni García
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
e20449
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 

Recently uploaded (20)

Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptxText-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
 
Launch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in MinutesLaunch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in Minutes
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
 
Graphic Design Crash Course for beginners
Graphic Design Crash Course for beginnersGraphic Design Crash Course for beginners
Graphic Design Crash Course for beginners
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 

LLVM Register Allocation (2nd Version)

  • 2. Outline • Introduction to Register Allocation Problem • LLVM Register Allocation Template Method • LLVM Basic Register Allocation • LLVM Greedy Register Allocation
  • 3. Introduction to Register Allocation • Definition • Register allocation is the problem of mapping program variables to either machine registers or memory addresses. • Best solution • minimise the number of loads/stores from/to memory • NP-complete
  • 4. int main() { int i, j; int answer; for (i = 1; i < 10; i++) for (j = 1; j < 10; j++) { answer = i * j; } return 0; } _main: @ BB#0: sub sp, #16 movs r0, #0 str r0, [sp, #12] movs r0, #1 str r0, [sp, #8] b LBB0_2 LBB0_1: adds r1, #1 str r1, [sp, #8] LBB0_2: ldr r1, [sp, #8] cmp r1, #9 bgt LBB0_6 @ BB#3: str r0, [sp, #4] b LBB0_5 LBB0_4: ldr r2, [sp, #4] muls r1, r2, r1 str r1, [sp] ldr r1, [sp, #4] adds r1, #1
  • 5. Graph Coloring • For an arbitrary graph G; a coloring of G assigns a color to each node in G so that no pair of adjacent nodes have the same color. 2-colorable 3-colorable
  • 6. Graph Coloring for RA • Node: Live interval • Edge: Two live intervals interfere • Color: Physical register • Find an optimal colouring for the graph
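The mapping on this slide can be illustrated with a small greedy colouring routine. This is only a sketch: `greedyColour`, the adjacency-list input and the use of `-1` for "no register left" are assumptions made for the example, and greedy colouring is a heuristic that does not find an optimal colouring in general.

```cpp
#include <cassert>
#include <vector>

// Greedy colouring sketch: nodes are live intervals, colours are physical
// registers numbered 0..NumRegs-1. A result of -1 means "no register left",
// i.e. the interval would have to be spilled or split.
std::vector<int> greedyColour(const std::vector<std::vector<unsigned>> &Adj,
                              unsigned NumRegs) {
  std::vector<int> Colour(Adj.size(), -1);
  for (unsigned Node = 0; Node < Adj.size(); ++Node) {
    // Mark the registers already taken by coloured neighbours.
    std::vector<bool> Used(NumRegs, false);
    for (unsigned Neighbour : Adj[Node]) {
      assert(Neighbour < Adj.size() && "edge points outside the graph");
      if (Colour[Neighbour] >= 0)
        Used[Colour[Neighbour]] = true;
    }
    // Take the first register that no interfering interval uses.
    for (unsigned Reg = 0; Reg < NumRegs; ++Reg)
      if (!Used[Reg]) {
        Colour[Node] = static_cast<int>(Reg);
        break;
      }
  }
  return Colour;
}
```

With three mutually interfering intervals and only two registers, the third interval is left uncoloured, which is the situation where an allocator has to split or spill.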
  • 7. … a0 = … b0 = … … = b0 d0 = … c0 = … … d1 = c0 … = a0 … = d1 B0 B1 B2 B3 … LIa = … LIb = … … = LIb LIc = … … LId = LIc … = LIa … = LId B0 B1 B2 B3
  • 8. LIa LIb LIc LId … LIa = … LIb = … … = LIb LIc = … … LId = LIc … = LIa … = LId B0 B1 B2 B3
  • 9. LLVM Register Allocation • Basic • Provide a minimal implementation of the basic register allocator • Greedy • Global live range splitting. • Fast • This register allocator allocates registers to a basic block at a time. • PBQP • Partitioned Boolean Quadratic Programming (PBQP) based register allocator for LLVM
  • 10. Template Method • Define the skeleton of an algorithm in an operation, deferring some steps to subclasses.
  • 11. LLVM Register Allocation Template Method Enqueue All LiveInterval selectOrSplit for One LiveInterval Assign the Physical Register Enqueue Split LiveInterval dequeue physical register is available split live interval allocatePhysRegs enqueue seedLiveRegs Q customised by new RA algorithm for (unsigned i = 0, e = MRI->getNumVirtRegs(); i != e; ++i) { unsigned Reg = TargetRegisterInfo::index2VirtReg(i); if (MRI->reg_nodbg_empty(Reg)) continue; enqueue(&LIS->getInterval(Reg)); }
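The driver loop on this slide can be sketched as a Template Method class. Everything below is a simplified illustration, not LLVM's `RegAllocBase`: `Interval`, `RegAllocSketch`, `ToyAlloc` and the rule "split anything heavier than 1.0" are hypothetical names and behaviour chosen only to show the fixed skeleton and the customisable hooks.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical stand-in for a live interval.
struct Interval {
  unsigned Id;
  float Weight;
};

// The skeleton is fixed here; subclasses customise the queue and selectOrSplit.
class RegAllocSketch {
public:
  virtual ~RegAllocSketch() = default;

  // Returns the ids of the intervals that received a register, in order.
  std::vector<unsigned> allocatePhysRegs(const std::vector<Interval> &All) {
    for (const Interval &LI : All) // corresponds to seeding the queue
      enqueue(LI);
    std::vector<unsigned> Assigned;
    while (!queueEmpty()) {
      Interval LI = dequeue();
      std::vector<Interval> SplitProducts;
      if (selectOrSplit(LI, SplitProducts))
        Assigned.push_back(LI.Id); // a physical register was available
      for (const Interval &S : SplitProducts)
        enqueue(S); // split products are processed again
    }
    return Assigned;
  }

protected:
  virtual void enqueue(const Interval &LI) = 0;
  virtual Interval dequeue() = 0;
  virtual bool queueEmpty() const = 0;
  virtual bool selectOrSplit(const Interval &LI,
                             std::vector<Interval> &SplitProducts) = 0;
};

// Toy subclass: FIFO queue, and a made-up rule that splits heavy intervals.
class ToyAlloc : public RegAllocSketch {
  std::queue<Interval> Queue;

protected:
  void enqueue(const Interval &LI) override { Queue.push(LI); }
  Interval dequeue() override {
    assert(!Queue.empty());
    Interval LI = Queue.front();
    Queue.pop();
    return LI;
  }
  bool queueEmpty() const override { return Queue.empty(); }
  bool selectOrSplit(const Interval &LI,
                     std::vector<Interval> &SplitProducts) override {
    if (LI.Weight <= 1.0f)
      return true; // pretend a register is free
    SplitProducts.push_back({LI.Id * 10 + 1, LI.Weight / 2});
    SplitProducts.push_back({LI.Id * 10 + 2, LI.Weight / 2});
    return false;
  }
};
```

Calling `ToyAlloc().allocatePhysRegs({{1, 2.0f}, {2, 0.5f}})` splits interval 1 into two products that re-enter the queue behind interval 2, which mirrors the "Enqueue Split LiveInterval" edge in the diagram.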
  • 13. LLVM Basic Register Allocation Calculate LiveInterval Weight Enqueue All LiveInterval RABasic::selectOrSplit Assign the Physical Register Enqueue Split LiveInterval dequeue physical register is available split live interval update LiveInterval.weight (spill cost) allocatePhysRegs enqueue seedLiveRegs priority Q (spill cost) customised by RABasic algorithm struct CompSpillWeight { bool operator()(LiveInterval *A, LiveInterval *B) const { return A->weight < B->weight; } }; 1. Assign physical registers to the Live Interval with the highest spill cost first. 2. If no physical register is available for the current Live Interval, compare its spill cost with those of its interferences and give the physical register to whichever has the highest spill cost. 3. Spill the unassigned Live Intervals.
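The effect of `CompSpillWeight` on the dequeue order can be checked with a standard priority queue. This is a minimal sketch: `LiveIntervalSketch` and `dequeueOrder` are hypothetical stand-ins, and only the comparator is taken from the slide.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical stand-in for llvm::LiveInterval: a register number and a weight.
struct LiveIntervalSketch {
  unsigned Reg;
  float weight;
};

// Same ordering as on the slide: the heaviest interval ends up on top.
struct CompSpillWeight {
  bool operator()(const LiveIntervalSketch *A,
                  const LiveIntervalSketch *B) const {
    return A->weight < B->weight;
  }
};

// Returns the register numbers in the order the allocator would dequeue them.
std::vector<unsigned> dequeueOrder(std::vector<LiveIntervalSketch> &Intervals) {
  std::priority_queue<LiveIntervalSketch *, std::vector<LiveIntervalSketch *>,
                      CompSpillWeight>
      Queue;
  for (LiveIntervalSketch &LI : Intervals)
    Queue.push(&LI);
  std::vector<unsigned> Order;
  while (!Queue.empty()) {
    Order.push_back(Queue.top()->Reg);
    Queue.pop();
  }
  assert(Order.size() == Intervals.size());
  return Order;
}
```

Because `std::priority_queue` is a max-heap with respect to its comparator, a "less than on weight" comparator yields the highest spill cost first, which is step 1 of the list.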
  • 14. LiveInterval Weight • Weight for one instruction with the register • weight = (isDef + isUse) * (Block Frequency / Entry Frequency) • loop induction variable: weight *= 3 • For all instructions with the register • totalWeight += weight • Hint: totalWeight *= 1.01 • Re-materializable: totalWeight *= 0.5 • LiveInterval.weight = totalWeight / size of LiveInterval
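The weight formula on this slide can be written out as a small function. It is a sketch that follows the slide's bullets literally; `UseSite`, `intervalWeight` and the parameter names are assumptions for the example, and the real LLVM computation has more cases than shown here.

```cpp
#include <cassert>
#include <vector>

// One instruction that mentions the register (hypothetical record).
struct UseSite {
  bool IsDef;
  bool IsUse;
  double BlockFreq;     // frequency of the containing basic block
  bool IsLoopInduction; // the slide's "loop induction variable" case
};

double intervalWeight(const std::vector<UseSite> &Sites, double EntryFreq,
                      bool HasHint, bool IsRematerializable, double Size) {
  assert(EntryFreq > 0 && Size > 0);
  double Total = 0;
  for (const UseSite &S : Sites) {
    // weight = (isDef + isUse) * (Block Frequency / Entry Frequency)
    double W = (S.IsDef + S.IsUse) * (S.BlockFreq / EntryFreq);
    if (S.IsLoopInduction)
      W *= 3;
    Total += W;
  }
  if (HasHint)
    Total *= 1.01; // slightly prefer intervals with a register hint
  if (IsRematerializable)
    Total *= 0.5;  // cheap to recompute, so cheaper to spill
  return Total / Size; // normalise by the size of the live interval
}
```

For a definition in the entry block and one use in a block that runs ten times as often, with an interval of size 4, this gives (1 + 10) / 4 = 2.75; marking the interval re-materializable halves that value.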
  • 16. • Example (assign physical registers by length) Q0 D0 D1 Q1 D2 D3 V1 V2 V3 V4 V5
  • 18. • No physical register for V1 Q0 D0 D1 Q1 D2 D3 V1 V2 V3 V4 V5
  • 19. • Evict V2 (evict Live Interval with lower spill cost) Q0 D0 D1 Q1 D2 D3 V1 V2 V3 V4 V5 stack
  • 20. • Split V2 Q0 D0 D1 Q1 D2 D3 V1 V2b V3 V4 V5 V2a V2c
  • 21. • Split V2 Q0 D0 D1 Q1 D2 D3 V1 V2b V3 V4 V5 V2a V2c stack
  • 22. Greedy RA Stages • RS_New: created • RS_Assign: enqueue • RS_Split: need to split • RS_Split2 • used for split products that may not be making progress • RS_Spill: need to spill • RS_Done: assigned a physical register or created by spill
  • 23. RS_Split2 • The live intervals created by split will enqueue to process again. • There is a risk of creating infinite loops. … = vreg1 … … = vreg1 … … = vreg1 … vreg2 = COPY vreg1 … = vreg2 … vreg3 = COPY vreg1 … = vreg3 … … = vreg3 … RS_New RS_Split2
  • 24. Greedy Register Allocation try to assign physical register try to evict to find better register enter RS_Split stage try last chance recoloring split spill pick a physical register and evict all interference found register stage >= RS_Done or Live Interval is unspillable stage < RS_Split selectOrSplit(d+1) selectOrSplit(d) stage is RS_Split or RS_Split2
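The order of the decisions in this flow chart can be sketched as a pure function over the stage and the outcome of each attempt. The stage names come from the previous slides; `Action`, `Probe` and `nextAction` are hypothetical, and the sketch follows the slide's arrows rather than every condition in LLVM's actual `selectOrSplit`.

```cpp
#include <cassert>

// Stages in the order listed on the "Greedy RA Stages" slide.
enum LiveRangeStage { RS_New, RS_Assign, RS_Split, RS_Split2, RS_Spill, RS_Done };

enum class Action { Assign, Evict, Requeue, LastChanceRecolor, Split, Spill };

// Hypothetical summary of what each attempt would report for one interval.
struct Probe {
  bool HasFreeReg; // "try to assign physical register" succeeds
  bool CanEvict;   // "try to evict to find better register" succeeds
  bool Spillable;  // the live interval may be spilled
  bool CanSplit;   // one of the split strategies succeeds
};

Action nextAction(LiveRangeStage Stage, const Probe &P) {
  assert(Stage >= RS_New && Stage <= RS_Done);
  if (P.HasFreeReg)
    return Action::Assign;
  if (P.CanEvict)
    return Action::Evict;
  if (Stage < RS_Split)
    return Action::Requeue; // enter RS_Split and wait for the next visit
  if (Stage >= RS_Done || !P.Spillable)
    return Action::LastChanceRecolor;
  if (P.CanSplit)
    return Action::Split;
  return Action::Spill;
}
```

The point of the `Requeue` step is that an interval is not split or spilled the first time allocation fails; it is deferred so that evictions of other intervals get a chance to free a register first.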
  • 25. Last Chance Recoloring • Try to assign a physical register to Live Interval by evicting all its interferences. • The recoloring process may recursively use the last chance recoloring. Therefore, when a virtual register has been assigned a color by this mechanism, it is marked as Fixed. vA can use {R1, R2 } vB can use { R2, R3} vC can use {R1 } vA => R1 vB => R2 vC => fails vA => R2 vB => R3 vC => R1 (fixed) selectOrSplit(d) selectOrSplit(d + 1)
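The vA/vB/vC example can be replayed with a small backtracking routine. This is a sketch under strong assumptions: all virtual registers are taken to interfere with each other, `State`, `ownerOf`, `selectOrRecolor` and the depth bound `MaxDepth` are names invented for the example, and LLVM's real implementation works on interference queries rather than a global owner map.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

struct State {
  std::map<std::string, std::vector<std::string>> Allowed; // allocation order
  std::map<std::string, std::string> Assigned;             // vreg -> physreg
  std::set<std::string> Fixed; // coloured by recoloring, must not be evicted
};

// Assumption: every pair of virtual registers interferes, so each physical
// register has at most one owner. Returns "" when the register is free.
static std::string ownerOf(const State &S, const std::string &Reg) {
  for (const auto &KV : S.Assigned)
    if (KV.second == Reg)
      return KV.first;
  return "";
}

bool selectOrRecolor(State &S, const std::string &V, unsigned Depth,
                     unsigned MaxDepth = 5) {
  assert(S.Allowed.count(V) && "unknown virtual register");
  // Normal path: take a free register if there is one.
  for (const std::string &Reg : S.Allowed.at(V))
    if (ownerOf(S, Reg).empty()) {
      S.Assigned[V] = Reg;
      return true;
    }
  if (Depth >= MaxDepth)
    return false;
  // Last chance: take a register from its owner and re-allocate the owner.
  for (const std::string &Reg : S.Allowed.at(V)) {
    std::string Owner = ownerOf(S, Reg);
    if (S.Fixed.count(Owner))
      continue; // fixed colours may not be changed
    auto SavedAssigned = S.Assigned;
    auto SavedFixed = S.Fixed;
    S.Assigned.erase(Owner);
    S.Assigned[V] = Reg;
    S.Fixed.insert(V);
    if (selectOrRecolor(S, Owner, Depth + 1, MaxDepth))
      return true;
    S.Assigned = SavedAssigned; // roll back and try the next register
    S.Fixed = SavedFixed;
  }
  return false;
}
```

Starting from vA => R1 and vB => R2, allocating vC takes R1 and fixes it, vA then cannot go back to R1 and takes R2 from vB at depth 1, and vB finds R3 free at depth 2, which reproduces the final assignment on the slide.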
  • 26. How to Split? is stage beyond RS_Spill? is in one BB? tryLocalSplit tryInstructionSplit No Yes tryRegionSplit is stage less than RS_Split2? No spill Yes success? No success? spill No tryBlockSplit Yes No success? No success? spill No done Yes Yes done Yes Yes
  • 27. tryLocalSplit • Try to split virtual register interval into smaller intervals inside its only basic block. • calculate gap weights • adjust the split region
  • 28. Calculate Gap Weights NumGaps = 4 define use use use use
  • 29. Calculate Gap Weights LI.weight VirtReg Live Interval If there is a physical register occupied by VirtReg. 0 0 define use use use use
  • 30. Calculate Gap Weights LI.weight physical Live Interval If there is a fixed physical register. 0 0 huge_valf define use use use use
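The gap-weight rule from these three slides can be sketched for a single candidate physical register. The integer slot positions, `InterferenceSketch` and `calcGapWeights` are assumptions made for the example; the sketch only encodes what the slides state: a gap with no interference has weight 0, a gap overlapped by another virtual register gets that interval's weight, and a gap overlapped by a fixed physical register use gets `HUGE_VALF`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// An interval that already occupies the candidate physical register.
struct InterferenceSketch {
  int Start, End;    // occupied slot range [Start, End)
  float Weight;      // spill weight of the interfering virtual register
  bool FixedPhysReg; // true if this is a fixed physical register use
};

// Uses holds the sorted slot positions of the def and uses inside the block;
// N instructions give N - 1 gaps between neighbouring instructions.
std::vector<float> calcGapWeights(const std::vector<int> &Uses,
                                  const std::vector<InterferenceSketch> &Intfs) {
  assert(Uses.size() >= 2 && std::is_sorted(Uses.begin(), Uses.end()));
  std::vector<float> GapWeight(Uses.size() - 1, 0.0f);
  for (std::size_t Gap = 0; Gap + 1 < Uses.size(); ++Gap)
    for (const InterferenceSketch &I : Intfs) {
      bool Overlaps = I.Start < Uses[Gap + 1] && I.End > Uses[Gap];
      if (!Overlaps)
        continue;
      float W = I.FixedPhysReg ? HUGE_VALF : I.Weight;
      GapWeight[Gap] = std::max(GapWeight[Gap], W);
    }
  return GapWeight;
}
```

With one def and four uses there are four gaps, matching `NumGaps = 4` on the first of these slides.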
  • 31. Adjust Split Region SplitAfter = 1 SplitBefore = 0 normalise spill weight > max gap if Diff > BestDiff: BestBefore = SplitBefore BestAfter = SplitAfter SplitAfter++ SplitBefore++ Yes No normalise spill weight = spill cost / distance = (#gap * block_freq) / distance(SplitBefore, SplitAfter)
  • 32. Adjust Split Region BestAfter BestBefore normalise spill weight > max gap if Diff > BestDiff: BestBefore = SplitBefore BestAfter = SplitAfter SplitAfter++ SplitBefore++ Yes No normalise spill weight = spill cost / distance = (#gap * block_freq) / distance(SplitBefore, SplitAfter) RS_New (or RS_Split2) RS_New Go through all physical registers. Find the most critical range.
  • 33. tryRegionSplit • Use Hopfield Network to find optimal splits. • Guaranteed to converge to a local minimum.
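  • 34. Hopfield Network $a(t)_{s \times 1} = \begin{cases} p_{s \times 1} & : t = 0 \\ S(W_{s \times s} \times a(t-1)_{s \times 1} + b_{s \times 1}) & : t \ge 1 \end{cases}$ $S(x) = \begin{cases} +1 & : x \ge \theta \\ -1 & : x < \theta \end{cases}$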
  • 35. tryRegionSplit 1. For every physical register, construct Hopfield Network • Initialize border constraints • Initialize Hopfield Network nodes according to border constraints • Add links to Hopfield Network and iterate 2. Get the best candidate 3. Do region split
  • 36. Initialize Border Constraints • No Interference. LiveIn ? PrefReg : DontCare; LiveOut ? PrefReg : DontCare; enum BorderConstraint { DontCare, PrefReg, PrefSpill, PrefBoth, MustSpill };
  • 37. Initialize Border Constraints • There are Interferences. MustSpill PrefSpill FirstInstr LastInstr PrefReg/DontCare FirstInstr LastInstr FirstInstr LastInstr MustSpill FirstInstr LastInstr FirstInstr LastInstr FirstInstr LastInstr PrefSpill PrefReg/DontCare
  • 38. Edge Bundle BB #0 BB #1 BB #3 BB #2 BB #4 BB #5 BB #6 // Join the outgoing bundle with the ingoing bundles of all successors. for (MachineBasicBlock::const_succ_iterator SI = MBB.succ_begin(), SE = MBB.succ_end(); SI != SE; ++SI) EC.join(OutE, 2 * (*SI)->getNumber()); EC: (BB#0, in) Bundle #0: 0 0 0 (BB#0, out) Bundle #1: 1 1 1 (BB#1, in) Bundle #2: 2 1 1 (BB#1, out) Bundle #3: 3 3 2 (BB#2, in) Bundle #4: 4 3 2 (BB#2, out) Bundle #5: 5 5 3 (BB#3, in) Bundle #6: 6 5 3 (BB#3, out) Bundle #7: 7 7 4 (BB#4, in) Bundle #8: 8 7 4 (BB#4, out) Bundle #9: 9 5 3 (BB#5, in) Bundle #10: 10 7 4 (BB#5, out) Bundle #11: 11 11 -> 1 1 (BB#6, in) Bundle #12: 12 3 2 (BB#6, out) Bundle #13: 13 13 5 void join(unsigned a, unsigned b) { unsigned eca = EC[a]; unsigned ecb = EC[b]; while (eca != ecb) if (eca < ecb) EC[b] = eca, b = ecb, ecb = EC[b]; else EC[a] = ecb, a = eca, eca = EC[a]; }
  • 39. Edge Bundle BB #0 BB #1 BB #3 BB #2 BB #4 BB #5 BB #6 Blocks: Bundle #0: BB#0 Bundle #1: BB#0, BB#1, BB#5 Bundle #2: BB#1, BB#2, BB#6 Bundle #3: BB#2, BB#3, BB#4 Bundle #4: BB#3, BB#4, BB#5 Bundle #5: BB#6 Bundle #6: Bundle #7: Bundle #8: Bundle #9: Bundle #10: Bundle #11: Bundle #12: Bundle #13: EC: (BB#0, in) Bundle #0: 0 0 0 (BB#0, out) Bundle #1: 1 1 1 (BB#1, in) Bundle #2: 2 1 1 (BB#1, out) Bundle #3: 3 3 2 (BB#2, in) Bundle #4: 4 3 2 (BB#2, out) Bundle #5: 5 5 3 (BB#3, in) Bundle #6: 6 5 3 (BB#3, out) Bundle #7: 7 7 4 (BB#4, in) Bundle #8: 8 7 4 (BB#4, out) Bundle #9: 9 5 3 (BB#5, in) Bundle #10: 10 7 4 (BB#5, out) Bundle #11: 11 1 1 (BB#6, in) Bundle #12: 12 3 2 (BB#6, out) Bundle #13: 13 13 5
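The three columns of the EC table (initial index, leader after `join`, bundle number after compression) can be reproduced with a small equivalence-class structure. The `join` body is the one shown on the previous slide; `EqClasses`, its `compress` step and `computeBundles` are written for this sketch, and the CFG edge list used in the example is inferred from the table rather than stated in the slides.

```cpp
#include <cassert>
#include <utility>
#include <vector>

class EqClasses {
  std::vector<unsigned> EC;
  unsigned NumClasses = 0;

public:
  explicit EqClasses(unsigned N) : EC(N) {
    for (unsigned i = 0; i != N; ++i)
      EC[i] = i; // every element starts as its own leader
  }

  // Same logic as the slide: the smaller leader always wins.
  void join(unsigned a, unsigned b) {
    unsigned eca = EC[a];
    unsigned ecb = EC[b];
    while (eca != ecb)
      if (eca < ecb)
        EC[b] = eca, b = ecb, ecb = EC[b];
      else
        EC[a] = ecb, a = eca, eca = EC[a];
  }

  // Renumber the classes densely, in order of their leaders.
  void compress() {
    for (unsigned i = 0, e = EC.size(); i != e; ++i)
      EC[i] = (EC[i] == i) ? NumClasses++ : EC[EC[i]];
  }

  unsigned operator[](unsigned i) const { return EC[i]; }
};

// Index 2*N is the incoming bundle of block N, 2*N+1 the outgoing bundle.
std::vector<unsigned>
computeBundles(unsigned NumBlocks,
               const std::vector<std::pair<unsigned, unsigned>> &Edges) {
  EqClasses EC(2 * NumBlocks);
  for (const auto &E : Edges) {
    assert(E.first < NumBlocks && E.second < NumBlocks);
    EC.join(2 * E.first + 1, 2 * E.second); // out(pred) with in(succ)
  }
  EC.compress();
  std::vector<unsigned> Bundles(2 * NumBlocks);
  for (unsigned i = 0; i != 2 * NumBlocks; ++i)
    Bundles[i] = EC[i];
  return Bundles;
}
```

Assuming the edges 0→1, 1→2, 1→6, 2→3, 3→4, 3→5, 4→3 and 5→1, `computeBundles(7, Edges)` yields 0 1 1 2 2 3 3 4 4 3 4 1 2 5, which is the last column of the table: six bundles, with for example bundle #1 shared by the exit of BB#0, the entry of BB#1 and the exit of BB#5.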
  • 46. Initialize Hopfield Network Node • update BiasN, BiasP according to BorderConstraint BB #n (freq) … = Y op … PrefReg PrefSpill Bundle ib BiasP += freq Bundle ob BiasN += freq void addBias(BlockFrequency freq, BorderConstraint direction) { switch (direction) { default: break; case PrefReg: BiasP += freq; break; case PrefSpill: BiasN += freq; break; case MustSpill: BiasN = BlockFrequency::getMaxFrequency(); // (uint64_t)-1ULL break; } }
  • 47. Add Links to Hopfield Network • add weight to links Live Through BB #n (freq) Bundle ib Bundle ob void addLink(unsigned b, BlockFrequency w) { // Update cached sum. SumLinkWeights += w; // There can be multiple links to the same bundle, add them up. for (LinkVector::iterator I = Links.begin(), E = Links.end(); I != E; ++I) if (I->second == b) { I->first += w; return; } // This must be the first link to b. Links.push_back(std::make_pair(w, b)); } (freq, ob) (freq, ib)
  • 48. Update Hopfield Network Bundle X BiasN BiasP Value = 0 Bundle A Value = -1 Bundle B Value = 1 Bundle C Value = 1 Bundle D Value = 1 SumN = BiasN + freqA SumP = BiasP + freqB + freqC + freqD (freqA, A) (freqB, B) (freqC, C) (freqD, D) if (SumN >= SumP + Threshold) Value = -1; else if (SumP >= SumN + Threshold) Value = 1; else Value = 0; $a(t)_{s \times 1} = \begin{cases} p_{s \times 1} & : t = 0 \\ S(W_{s \times s} \times a(t-1)_{s \times 1} + b_{s \times 1}) & : t \ge 1 \end{cases}$ $\begin{bmatrix} \cdots & \cdots & \cdots & \cdots & \\ F_A & F_B & F_C & F_D & 0 \end{bmatrix} \times \begin{bmatrix} -1 \\ 1 \\ 1 \\ 1 \\ 0 \end{bmatrix} + \begin{bmatrix} \vdots \\ Bias_P - Bias_N \end{bmatrix}$
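The update rule for a single bundle can be sketched as follows. `NodeSketch` is a hypothetical simplification that uses plain integers for frequencies and takes `Threshold` as a parameter; only the `SumN`/`SumP` comparison is taken from the slide.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

struct NodeSketch {
  uint64_t BiasN = 0; // pull towards "spill" (Value = -1)
  uint64_t BiasP = 0; // pull towards "register" (Value = +1)
  int Value = 0;      // -1, 0 or +1
  std::vector<std::pair<uint64_t, unsigned>> Links; // (weight, neighbour index)

  // Recompute Value from the biases and the neighbours' current values.
  // Returns true if Value changed, so a driver can iterate until stable.
  bool update(const std::vector<NodeSketch> &Nodes, uint64_t Threshold) {
    uint64_t SumN = BiasN;
    uint64_t SumP = BiasP;
    for (const auto &L : Links) {
      assert(L.second < Nodes.size());
      int V = Nodes[L.second].Value;
      if (V == -1)
        SumN += L.first;
      else if (V == 1)
        SumP += L.first;
    }
    int Before = Value;
    if (SumN >= SumP + Threshold)
      Value = -1;
    else if (SumP >= SumN + Threshold)
      Value = 1;
    else
      Value = 0;
    return Before != Value;
  }
};
```

For the slide's situation (one neighbour at -1, three at +1), bundle X moves to +1 as soon as the positive side outweighs the negative side by at least the threshold; a larger threshold leaves it undecided at 0.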
  • 49. Region Split • splitLiveThroughBlock • splitRegInBlock • splitRegOutBlock
  • 50. splitLiveThroughBlock Bundle ib Value == 1 Bundle ob Value != 1 Live Through LiveOut on Stack first non-PHI Start New Int Bundle ib Value != 1 Bundle ob Value == 1 Live Through LiveIn on Stack last split point End New Int Live Through No Interference Bundle ib Value == 1 Bundle ob Value == 1 End New Int Start
  • 51. splitLiveThroughBlock Bundle ib Value == 1 Bundle ob Value == 1 LiveThrough Non-overlapping interference New Int Interference.first() Interference.last() New Int Bundle ib Value == 1 Bundle ob Value == 1 LiveThrough Overlapping interference New Int Interference.first() Interference.last() New Int
  • 52. splitRegInBlock Bundle ib Value == 1 No LiveOut Interference after kill Start New Int Bundle ib Value == 1 Bundle ob Value != 1 LiveOut on Stack Interference after last use LiveOut on Stack Interference after last use Interference.first() LastInstr LastInstr last split point New Int Start Bundle ib Value == 1 Bundle ob Value != 1 LastInstr last split point New Int Start Interference.first() Interference.first()
  • 53. splitRegInBlock Bundle ib Value == 1 LiveOut on Stack Interference overlapping uses Start New Int Bundle ib Value == 1 Interference.first() LastInstr last split point New Int Start New Int Interference.first() LastInstr last split point New Int Bundle ob Value != 1 Bundle ob Value != 1 LiveOut on Stack Interference overlapping uses
  • 54. splitRegOutBlock No LiveIn Interference before def End New Int Bundle ib Value != 1 Bundle ob Value == 1 Live Through Interference before def Live Through Interference overlapping uses Interference.last() FirstInstr Bundle ib Value != 1 Bundle ob Value == 1 Bundle ob Value == 1 End New Int Interference.last() FirstInstr last split point End New Int Interference.last() FirstInstr New Int