SlideShare a Scribd company logo
Part II: LLVM Intermediate Representation
陳韋任 (Chen Wei-Ren)
chenwj@iis.sinica.edu.tw
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
April 10, 2013
Outline
LLVM IR and IR Builder
From LLVM IR to Machine Code
Machine Code Layer
C Source Code
int fib(int x) {
if (x <= 2)
return 1;
return fib(x - 1) + fib(x - 2);
}
$ clang -emit-llvm -O0 -S fib.c -o fib.ll
C → LLVM IR
define i32 @fib(i32 %x) nounwind uwtable {
entry:
...
br i1 %cmp, label %if.then, label %if.end
if.then: ; preds = %entry
store i32 1, i32* %retval
br label %return
if.end: ; preds = %entry
...
store i32 %add, i32* %retval
br label %return
return: ; preds = %if.end, %if.then
%3 = load i32* %retval
ret i32 %3
}
LLVM IR 1/2
LLVM defines its own IR to represent code in the compiler. LLVM
IR has following characteristics:
SSA form: Every definition creates a new variable.
Infinite virtual registers
Phi node: Determine the value of the variable comes from
which control flow.
LLVM IR has three different forms:
in-memory compiler IR
on-disk bitcode representation (*.bc)
on-disk human readable assembly (*.ll)
LLVM IR 2/2
IR Builder 1/3
LLVM IRBuilder is a class template that can be used as a
convenient way to create LLVM instructions.
Module: The top level container of all other LLVM IR. It can
contains global variables, functions, etc...
Function: A function consists of a list of basic blocks and a
list of arguments.
Basic block: A container of instructions that can be executed
sequentially.
LLVMContext: A container of global state in LLVM.
IR Builder 2/3
IR Builder 3/3
// Create LLVMContext for latter use.
LLVMContext Context;
// Create some module to put our function into it.
Module *M = new Module("test", Context);
// Create function inside the module
Function *Add1F =
cast<Function>(M->getOrInsertFunction("add1", Type::getInt32Ty(Context),
Type::getInt32Ty(Context),
(Type *)0));
// Add a basic block to the function.
BasicBlock *BB = BasicBlock::Create(Context, "EntryBlock", Add1F);
// Create a basic block builder. The builder will append instructions
// to the basic block ’BB’.
IRBuilder<> builder(BB);
// ... prepare operands for Add instrcution. ...
// Create the add instruction, inserting it into the end of BB.
Value *Add = builder.CreateAdd(One, ArgX);
Let’s start with a simple HelloWorld!
We will stick with LLVM 3.2 Release!
Step 1. Create Module
# include "llvm / LLVMContext.h"
# include "llvm / Module.h"
# include "llvm / IRBuilder.h"
int main()
{
llvm::LLVMContext& context = llvm::getGlobalContext();
llvm::Module* module = new llvm::Module("top", context);
llvm::IRBuilder<> builder(context);
module->dump( );
}
; ModuleID = ’top’
Step 2. Create Function
...
llvm::FunctionType *funcType =
llvm::FunctionType::get(builder.getInt32Ty(), false);
llvm::Function *mainFunc =
llvm::Function::Create(funcType,
llvm::Function::ExternalLinkage,
"main", module);
module->dump( );
}
; ModuleID = ’top’
declare i32 @main()
Step 3. Create Basic Block
...
llvm::BasicBlock *entry =
llvm::BasicBlock::Create(context, "entrypoint", mainFunc);
builder.SetInsertPoint(entry);
module->dump( );
}
; ModuleID = ’top’
define i32 @main() {
entrypoint:
}
Step 4. Add ”hello world” into the Module
...
llvm::Value *helloWorld
= builder.CreateGlobalStringPtr("hello world!n");
module->dump( );
}
; ModuleID = ’top’
@0 = private unnamed_addr constant [14 x i8] c"hello world!0A00"
define void @main() {
entrypoint:
}
Step 5. Declare ”puts” method
...
std::vector<llvm::Type *> putsArgs;
putsArgs.push_back(builder.getInt8Ty()->getPointerTo());
llvm::ArrayRef<llvm::Type*> argsRef(putsArgs);
llvm::FunctionType *putsType =
llvm::FunctionType::get(builder.getInt32Ty(), argsRef, false);
llvm::Constant *putsFunc
= module->getOrInsertFunction("puts", putsType);
module->dump();
}
; ModuleID = ’top’
@0 = private unnamed_addr constant [14 x i8] c"hello world!0A00"
define void @main() {
entrypoint:
}
declare i32 @puts(i8*)
Step 6. Call ”puts” with ”hello world”
...
builder.CreateCall(putsFunc, helloWorld);
builder.CreateRetVoid();
module->dump();
}
; ModuleID = ’top’
@0 = private unnamed_addr constant [14 x i8] c"hello world!0A00"
define void @main() {
entrypoint:
%0 = call i32 @puts(i8* getelementptr inbounds ([14 x i8]* @0, i32 0,
ret void
}
declare i32 @puts(i8*)
Lowering Flow Overview
LLVM → DAG
We first translate a sequence of LLVM IR into a SelectionDAG to
make instruction selection easier. DAG also make optimization
efficiently, CSE (Common Subexpression Elimination), for example.
For each use and def in the LLVM IR, an SDNode (SelectionDAG
Node) is created. The SDNode could be intrinsic SDNode or
target defined SDNode.
$ llc -view-dag-combine1-dags fib.ll
$ dot -Tpng dag.fib.dot -o dag.fib.png
LLVM → DAG
Legalize
Target machine might not support some of LLVM types or
operations, we need to legalize previous DAG.
$ llc -view-dag-combine2-dags fib.ll
IR DAG → Machine DAG
The instruction selector match IR SDNode into MachineSDNode.
MachineSDNode represents everything that will be needed to
construct a MachineInstr (MI).
# displays the DAG before instruction selection
$ llc -view-isel-dags fib.ll
# displays the DAG before instruction scheduling
$ llc -view-sched-dags fib.ll
Scheduling and Formation
Machine DAG are deconstructed and their instruction are turned
into list of MachineInstr (MI). The scheduler has to decide in what
order the instructions (MI) are emitted.
$ llc -print-machineinstrs fib.ll
SSA-based Machine Code Optimization
Ths list of MachineInstr is still in SSA form. We can perform
target-specific low-level optimization before doing register
allocation. For example, LICM (Loop Invariant Code Motion).
addMachineSSAOptimization (lib/CodeGen/Passes.cpp)
Register Allocation
All virtual registers are eliminated at this stage. The register
allocator assigns physical register to each virtual register if possible.
After register allocation, the list is not SSA form anymore.
Prologue/Epilogue Code Insertion
We insert prologue/epilogue code sequence into function
beginning/end respectively. All abstract stack frame references are
turned into memory address relative to the stack/frame pointer.
What We Have Now?
C → LLVM IR → IR DAG → Machine DAG → MachineInstr (MI)
Code Emission
This stage lowers MachineInstr (MI) to MCInst (MC).
MachineInstr (MI): Abstraction of target instruction, consists
of opcode and operand only. We have MachineFunction and
MachineBasicBlock as containers for MI.
MCInst (MC): Abstraction of object file. No function, global
variable ...etc, but only label, directive and instruction.
Machine Code (MC) Layer
MC layer works at the level of abstraction of object file.
$ llc -show-mc-inst fib.ll
$ llc -show-mc-encoding fib.ll
MC Layer Flow
What Is The Role of MC?
Assembler Relaxation
Assmbler can adjust the instructions for two purposes:
Correctness: If the operand of instruction is out of bound,
either expand the instruction or replace it by a longer one.
Optimization: Replace expensive instruction with a cheaper
one.
This process is called relaxation.
Assembler Relaxation Example 1/4
extern int bar(int);
int foo(int num) {
int t1 = num * 17;
if (t1 > bar(num))
return t1 + bar(num);
return num * bar(num);
}
$ clang -S -O2 relax.c
$ clang -c relax.s
$ objdump -d relax.o
Assembler Relaxation Example 2/4
0000000000000000 <foo>:
1a: e8 00 00 00 00 callq 1f <foo+0x1f>
1f: 45 39 f7 cmp %r14d,%r15d
22: 7e 05 jle 29 <foo+0x29>
24: 44 01 f8 add %r15d,%eax
27: eb 03 jmp 2c <foo+0x2c>
29: 0f af c3 imul %ebx,%eax
2c: 48 83 c4 08 add $0x8,%rsp
30: 5b pop %rbx
31: 41 5e pop %r14
33: 41 5f pop %r15
35: 5d pop %rbp
36: c3 retq
Assembler Relaxation Example 3/4
callq bar
cmpl %r14d, %r15d
jle .LBB0_2
.fill 124, 1, 0x90 # nop
# BB#1:
addl %r15d, %eax
jmp .LBB0_3
.LBB0_2:
imull %ebx, %eax
.LBB0_3:
addq $8, %rsp
popq %rbx
popq %r14
popq %r15
popq %rbp
ret
Assembler Relaxation Example 4/4
0000000000000000 <foo>:
1a: e8 00 00 00 00 callq 1f <foo+0x1f>
1f: 45 39 f7 cmp %r14d,%r15d
22: 0f 8e 81 00 00 00 jle a9 <foo+0xa9>
28: 90 nop
a3: 90 nop
a4: 44 01 f8 add %r15d,%eax
a7: eb 03 jmp ac <foo+0xac>
a9: 0f af c3 imul %ebx,%eax
ac: 48 83 c4 08 add $0x8,%rsp
b0: 5b pop %rbx
b1: 41 5e pop %r14
b3: 41 5f pop %r15
b5: 5d pop %rbp
b6: c3 retq
Section and Fragment
In order to make relaxation process easier, LLVM keeps instructions
as a linked list of Fragment not a byte array.
MCAssembler::layoutSectionOnce (lib/MC/MCAssembler.cpp)
Reference
LLVM - Another Toolchain Platform
Design and Implementation of a TriCore Backend for the
LLVM Compiler Framework
Life of an instruction in LLVM
Assembler relaxation
Instruction Relaxation
Intro to the LLVM MC Project
The LLVM Assembler & Machine Code Infrastructure
Howto: Implementing LLVM Integrated Assembler
The Architecture of Open Source Applications - Chapter
LLVM
The LLVM Target-Independent Code Generator
The End
Q & A
You can download the manuscript and related material from here.

More Related Content

What's hot

Linux Initialization Process (2)
Linux Initialization Process (2)Linux Initialization Process (2)
Linux Initialization Process (2)
shimosawa
 
FISL XIV - The ELF File Format and the Linux Loader
FISL XIV - The ELF File Format and the Linux LoaderFISL XIV - The ELF File Format and the Linux Loader
FISL XIV - The ELF File Format and the Linux Loader
John Tortugo
 
COSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem portingCOSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem porting
Eric Lin
 
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
Brendan Gregg
 
Function Level Analysis of Linux NVMe Driver
Function Level Analysis of Linux NVMe DriverFunction Level Analysis of Linux NVMe Driver
Function Level Analysis of Linux NVMe Driver
인구 강
 
U boot porting guide for SoC
U boot porting guide for SoCU boot porting guide for SoC
U boot porting guide for SoC
Macpaul Lin
 
malloc & vmalloc in Linux
malloc & vmalloc in Linuxmalloc & vmalloc in Linux
malloc & vmalloc in Linux
Adrian Huang
 
Instruction Combine in LLVM
Instruction Combine in LLVMInstruction Combine in LLVM
Instruction Combine in LLVM
Wang Hsiangkai
 
Gcc porting
Gcc portingGcc porting
Gcc porting
Shiva Chen
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimization
Wei Lin
 
GCC LTO
GCC LTOGCC LTO
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at Netflix
Brendan Gregg
 
llvm basic porting for risc v
llvm basic porting for risc vllvm basic porting for risc v
llvm basic porting for risc v
Tsung-Chun Lin
 
Linux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use CasesLinux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use Cases
Kernel TLV
 
Linux Kernel Image
Linux Kernel ImageLinux Kernel Image
Linux Kernel Image
艾鍗科技
 
[143] Modern C++ 무조건 써야 해?
[143] Modern C++ 무조건 써야 해?[143] Modern C++ 무조건 써야 해?
[143] Modern C++ 무조건 써야 해?
NAVER D2
 
U-Boot Porting on New Hardware
U-Boot Porting on New HardwareU-Boot Porting on New Hardware
U-Boot Porting on New Hardware
RuggedBoardGroup
 
Something About Dynamic Linking
Something About Dynamic LinkingSomething About Dynamic Linking
Something About Dynamic Linking
Wang Hsiangkai
 
LLVM Register Allocation
LLVM Register AllocationLLVM Register Allocation
LLVM Register Allocation
Wang Hsiangkai
 
Static partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-VStatic partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-V
RISC-V International
 

What's hot (20)

Linux Initialization Process (2)
Linux Initialization Process (2)Linux Initialization Process (2)
Linux Initialization Process (2)
 
FISL XIV - The ELF File Format and the Linux Loader
FISL XIV - The ELF File Format and the Linux LoaderFISL XIV - The ELF File Format and the Linux Loader
FISL XIV - The ELF File Format and the Linux Loader
 
COSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem portingCOSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem porting
 
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
 
Function Level Analysis of Linux NVMe Driver
Function Level Analysis of Linux NVMe DriverFunction Level Analysis of Linux NVMe Driver
Function Level Analysis of Linux NVMe Driver
 
U boot porting guide for SoC
U boot porting guide for SoCU boot porting guide for SoC
U boot porting guide for SoC
 
malloc & vmalloc in Linux
malloc & vmalloc in Linuxmalloc & vmalloc in Linux
malloc & vmalloc in Linux
 
Instruction Combine in LLVM
Instruction Combine in LLVMInstruction Combine in LLVM
Instruction Combine in LLVM
 
Gcc porting
Gcc portingGcc porting
Gcc porting
 
Memory Optimization
Memory OptimizationMemory Optimization
Memory Optimization
 
GCC LTO
GCC LTOGCC LTO
GCC LTO
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at Netflix
 
llvm basic porting for risc v
llvm basic porting for risc vllvm basic porting for risc v
llvm basic porting for risc v
 
Linux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use CasesLinux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use Cases
 
Linux Kernel Image
Linux Kernel ImageLinux Kernel Image
Linux Kernel Image
 
[143] Modern C++ 무조건 써야 해?
[143] Modern C++ 무조건 써야 해?[143] Modern C++ 무조건 써야 해?
[143] Modern C++ 무조건 써야 해?
 
U-Boot Porting on New Hardware
U-Boot Porting on New HardwareU-Boot Porting on New Hardware
U-Boot Porting on New Hardware
 
Something About Dynamic Linking
Something About Dynamic LinkingSomething About Dynamic Linking
Something About Dynamic Linking
 
LLVM Register Allocation
LLVM Register AllocationLLVM Register Allocation
LLVM Register Allocation
 
Static partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-VStatic partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-V
 

Viewers also liked

LibBest Library Information System
LibBest Library Information SystemLibBest Library Information System
LibBest Library Information System
Neo Chen
 
非關技術: 淺談如何參與社區開發
非關技術: 淺談如何參與社區開發非關技術: 淺談如何參與社區開發
非關技術: 淺談如何參與社區開發
Wei-Ren Chen
 
Introduction to llvm
Introduction to llvmIntroduction to llvm
Introduction to llvm
Tao He
 
圖書館趨勢觀察
圖書館趨勢觀察圖書館趨勢觀察
圖書館趨勢觀察
Ted Lin (林泰宏)
 
開放街圖: 集合群眾之力的製圖 (OpenStreetMap: A crowdsoucing map )
開放街圖: 集合群眾之力的製圖 (OpenStreetMap: A crowdsoucing map )開放街圖: 集合群眾之力的製圖 (OpenStreetMap: A crowdsoucing map )
開放街圖: 集合群眾之力的製圖 (OpenStreetMap: A crowdsoucing map )
Dongpo Deng
 
千里之行 Begin with a Single Step
千里之行 Begin with a Single Step千里之行 Begin with a Single Step
千里之行 Begin with a Single Step
Calvin C. Yu
 
大學與​信息安全​
大學與​信息安全​大學與​信息安全​
大學與​信息安全​
Chuan Lin
 

Viewers also liked (7)

LibBest Library Information System
LibBest Library Information SystemLibBest Library Information System
LibBest Library Information System
 
非關技術: 淺談如何參與社區開發
非關技術: 淺談如何參與社區開發非關技術: 淺談如何參與社區開發
非關技術: 淺談如何參與社區開發
 
Introduction to llvm
Introduction to llvmIntroduction to llvm
Introduction to llvm
 
圖書館趨勢觀察
圖書館趨勢觀察圖書館趨勢觀察
圖書館趨勢觀察
 
開放街圖: 集合群眾之力的製圖 (OpenStreetMap: A crowdsoucing map )
開放街圖: 集合群眾之力的製圖 (OpenStreetMap: A crowdsoucing map )開放街圖: 集合群眾之力的製圖 (OpenStreetMap: A crowdsoucing map )
開放街圖: 集合群眾之力的製圖 (OpenStreetMap: A crowdsoucing map )
 
千里之行 Begin with a Single Step
千里之行 Begin with a Single Step千里之行 Begin with a Single Step
千里之行 Begin with a Single Step
 
大學與​信息安全​
大學與​信息安全​大學與​信息安全​
大學與​信息安全​
 

Similar to Part II: LLVM Intermediate Representation

Boosting Developer Productivity with Clang
Boosting Developer Productivity with ClangBoosting Developer Productivity with Clang
Boosting Developer Productivity with Clang
Samsung Open Source Group
 
Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...
Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...
Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...
Vincenzo Iozzo
 
Improving Java performance at JBCNConf 2015
Improving Java performance at JBCNConf 2015Improving Java performance at JBCNConf 2015
Improving Java performance at JBCNConf 2015
Raimon Ràfols
 
Adding a BOLT pass
Adding a BOLT passAdding a BOLT pass
Adding a BOLT pass
Amir42407
 
Bytes in the Machine: Inside the CPython interpreter
Bytes in the Machine: Inside the CPython interpreterBytes in the Machine: Inside the CPython interpreter
Bytes in the Machine: Inside the CPython interpreter
akaptur
 
The true story_of_hello_world
The true story_of_hello_worldThe true story_of_hello_world
The true story_of_hello_world
fantasy zheng
 
Improving Android Performance at Mobiconf 2014
Improving Android Performance at Mobiconf 2014Improving Android Performance at Mobiconf 2014
Improving Android Performance at Mobiconf 2014
Raimon Ràfols
 
JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?
Doug Hawkins
 
Как работает LLVM бэкенд в C#. Егор Богатов ➠ CoreHard Autumn 2019
Как работает LLVM бэкенд в C#. Егор Богатов ➠ CoreHard Autumn 2019Как работает LLVM бэкенд в C#. Егор Богатов ➠ CoreHard Autumn 2019
Как работает LLVM бэкенд в C#. Егор Богатов ➠ CoreHard Autumn 2019
corehard_by
 
LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
LAS16-501: Introduction to LLVM - Projects, Components, Integration, InternalsLAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
Linaro
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Marina Kolpakova
 
NYU hacknight, april 6, 2016
NYU hacknight, april 6, 2016NYU hacknight, april 6, 2016
NYU hacknight, april 6, 2016
Mikhail Sosonkin
 
EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5
PRADEEP
 
Clean code slide
Clean code slideClean code slide
Clean code slide
Anh Huan Miu
 
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2
PVS-Studio
 
HKG15-207: Advanced Toolchain Usage Part 3
HKG15-207: Advanced Toolchain Usage Part 3HKG15-207: Advanced Toolchain Usage Part 3
HKG15-207: Advanced Toolchain Usage Part 3
Linaro
 
Intel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correctionIntel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correction
PVS-Studio
 
Intel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correctionIntel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correction
Andrey Karpov
 
Top 10 bugs in C++ open source projects, checked in 2016
Top 10 bugs in C++ open source projects, checked in 2016Top 10 bugs in C++ open source projects, checked in 2016
Top 10 bugs in C++ open source projects, checked in 2016
PVS-Studio
 
SFO15-500: VIXL
SFO15-500: VIXLSFO15-500: VIXL
SFO15-500: VIXL
Linaro
 

Similar to Part II: LLVM Intermediate Representation (20)

Boosting Developer Productivity with Clang
Boosting Developer Productivity with ClangBoosting Developer Productivity with Clang
Boosting Developer Productivity with Clang
 
Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...
Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...
Post Exploitation Bliss: Loading Meterpreter on a Factory iPhone, Black Hat U...
 
Improving Java performance at JBCNConf 2015
Improving Java performance at JBCNConf 2015Improving Java performance at JBCNConf 2015
Improving Java performance at JBCNConf 2015
 
Adding a BOLT pass
Adding a BOLT passAdding a BOLT pass
Adding a BOLT pass
 
Bytes in the Machine: Inside the CPython interpreter
Bytes in the Machine: Inside the CPython interpreterBytes in the Machine: Inside the CPython interpreter
Bytes in the Machine: Inside the CPython interpreter
 
The true story_of_hello_world
The true story_of_hello_worldThe true story_of_hello_world
The true story_of_hello_world
 
Improving Android Performance at Mobiconf 2014
Improving Android Performance at Mobiconf 2014Improving Android Performance at Mobiconf 2014
Improving Android Performance at Mobiconf 2014
 
JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?
 
Как работает LLVM бэкенд в C#. Егор Богатов ➠ CoreHard Autumn 2019
Как работает LLVM бэкенд в C#. Егор Богатов ➠ CoreHard Autumn 2019Как работает LLVM бэкенд в C#. Егор Богатов ➠ CoreHard Autumn 2019
Как работает LLVM бэкенд в C#. Егор Богатов ➠ CoreHard Autumn 2019
 
LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
LAS16-501: Introduction to LLVM - Projects, Components, Integration, InternalsLAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
 
NYU hacknight, april 6, 2016
NYU hacknight, april 6, 2016NYU hacknight, april 6, 2016
NYU hacknight, april 6, 2016
 
EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5
 
Clean code slide
Clean code slideClean code slide
Clean code slide
 
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2
Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2
 
HKG15-207: Advanced Toolchain Usage Part 3
HKG15-207: Advanced Toolchain Usage Part 3HKG15-207: Advanced Toolchain Usage Part 3
HKG15-207: Advanced Toolchain Usage Part 3
 
Intel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correctionIntel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correction
 
Intel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correctionIntel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correction
 
Top 10 bugs in C++ open source projects, checked in 2016
Top 10 bugs in C++ open source projects, checked in 2016Top 10 bugs in C++ open source projects, checked in 2016
Top 10 bugs in C++ open source projects, checked in 2016
 
SFO15-500: VIXL
SFO15-500: VIXLSFO15-500: VIXL
SFO15-500: VIXL
 

Recently uploaded

PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
Pixlogix Infotech
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 

Recently uploaded (20)

PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 

Part II: LLVM Intermediate Representation

  • 1. Part II: LLVM Intermediate Representation 陳韋任 (Chen Wei-Ren) chenwj@iis.sinica.edu.tw Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) April 10, 2013
  • 2. Outline LLVM IR and IR Builder From LLVM IR to Machine Code Machine Code Layer
  • 3. C Source Code int fib(int x) { if (x <= 2) return 1; return fib(x - 1) + fib(x - 2); } $ clang -emit-llvm -O0 -S fib.c -o fib.ll
  • 4. C → LLVM IR define i32 @fib(i32 %x) nounwind uwtable { entry: ... br i1 %cmp, label %if.then, label %if.end if.then: ; preds = %entry store i32 1, i32* %retval br label %return if.end: ; preds = %entry ... store i32 %add, i32* %retval br label %return return: ; preds = %if.end, %if.then %3 = load i32* %retval ret i32 %3 }
  • 5. LLVM IR 1/2 LLVM defines its own IR to represent code in the compiler. LLVM IR has following characteristics: SSA form: Every definition creates a new variable. Infinite virtual registers Phi node: Determine the value of the variable comes from which control flow. LLVM IR has three different forms: in-memory compiler IR on-disk bitcode representation (*.bc) on-disk human readable assembly (*.ll)
  • 7. IR Builder 1/3 LLVM IRBuilder is a class template that can be used as a convenient way to create LLVM instructions. Module: The top level container of all other LLVM IR. It can contains global variables, functions, etc... Function: A function consists of a list of basic blocks and a list of arguments. Basic block: A container of instructions that can be executed sequentially. LLVMContext: A container of global state in LLVM.
  • 9. IR Builder 3/3 // Create LLVMContext for latter use. LLVMContext Context; // Create some module to put our function into it. Module *M = new Module("test", Context); // Create function inside the module Function *Add1F = cast<Function>(M->getOrInsertFunction("add1", Type::getInt32Ty(Context), Type::getInt32Ty(Context), (Type *)0)); // Add a basic block to the function. BasicBlock *BB = BasicBlock::Create(Context, "EntryBlock", Add1F); // Create a basic block builder. The builder will append instructions // to the basic block ’BB’. IRBuilder<> builder(BB); // ... prepare operands for Add instrcution. ... // Create the add instruction, inserting it into the end of BB. Value *Add = builder.CreateAdd(One, ArgX);
  • 10. Let’s start with a simple HelloWorld! We will stick with LLVM 3.2 Release!
  • 11. Step 1. Create Module # include "llvm / LLVMContext.h" # include "llvm / Module.h" # include "llvm / IRBuilder.h" int main() { llvm::LLVMContext& context = llvm::getGlobalContext(); llvm::Module* module = new llvm::Module("top", context); llvm::IRBuilder<> builder(context); module->dump( ); } ; ModuleID = ’top’
  • 12. Step 2. Create Function ... llvm::FunctionType *funcType = llvm::FunctionType::get(builder.getInt32Ty(), false); llvm::Function *mainFunc = llvm::Function::Create(funcType, llvm::Function::ExternalLinkage, "main", module); module->dump( ); } ; ModuleID = ’top’ declare i32 @main()
  • 13. Step 3. Create Basic Block ... llvm::BasicBlock *entry = llvm::BasicBlock::Create(context, "entrypoint", mainFunc); builder.SetInsertPoint(entry); module->dump( ); } ; ModuleID = ’top’ define i32 @main() { entrypoint: }
  • 14. Step 4. Add ”hello world” into the Module ... llvm::Value *helloWorld = builder.CreateGlobalStringPtr("hello world!n"); module->dump( ); } ; ModuleID = ’top’ @0 = private unnamed_addr constant [14 x i8] c"hello world!0A00" define void @main() { entrypoint: }
  • 15. Step 5. Declare ”puts” method ... std::vector<llvm::Type *> putsArgs; putsArgs.push_back(builder.getInt8Ty()->getPointerTo()); llvm::ArrayRef<llvm::Type*> argsRef(putsArgs); llvm::FunctionType *putsType = llvm::FunctionType::get(builder.getInt32Ty(), argsRef, false); llvm::Constant *putsFunc = module->getOrInsertFunction("puts", putsType); module->dump(); } ; ModuleID = ’top’ @0 = private unnamed_addr constant [14 x i8] c"hello world!0A00" define void @main() { entrypoint: } declare i32 @puts(i8*)
  • 16. Step 6. Call ”puts” with ”hello world” ... builder.CreateCall(putsFunc, helloWorld); builder.CreateRetVoid(); module->dump(); } ; ModuleID = ’top’ @0 = private unnamed_addr constant [14 x i8] c"hello world!0A00" define void @main() { entrypoint: %0 = call i32 @puts(i8* getelementptr inbounds ([14 x i8]* @0, i32 0, ret void } declare i32 @puts(i8*)
  • 18. LLVM → DAG We first translate a sequence of LLVM IR into a SelectionDAG to make instruction selection easier. DAG also make optimization efficiently, CSE (Common Subexpression Elimination), for example. For each use and def in the LLVM IR, an SDNode (SelectionDAG Node) is created. The SDNode could be intrinsic SDNode or target defined SDNode. $ llc -view-dag-combine1-dags fib.ll $ dot -Tpng dag.fib.dot -o dag.fib.png
  • 20. Legalize Target machine might not support some of LLVM types or operations, we need to legalize previous DAG. $ llc -view-dag-combine2-dags fib.ll
  • 21. IR DAG → Machine DAG The instruction selector match IR SDNode into MachineSDNode. MachineSDNode represents everything that will be needed to construct a MachineInstr (MI). # displays the DAG before instruction selection $ llc -view-isel-dags fib.ll # displays the DAG before instruction scheduling $ llc -view-sched-dags fib.ll
  • 22. Scheduling and Formation Machine DAG are deconstructed and their instruction are turned into list of MachineInstr (MI). The scheduler has to decide in what order the instructions (MI) are emitted. $ llc -print-machineinstrs fib.ll
  • 23. SSA-based Machine Code Optimization Ths list of MachineInstr is still in SSA form. We can perform target-specific low-level optimization before doing register allocation. For example, LICM (Loop Invariant Code Motion). addMachineSSAOptimization (lib/CodeGen/Passes.cpp)
  • 24. Register Allocation All virtual registers are eliminated at this stage. The register allocator assigns physical register to each virtual register if possible. After register allocation, the list is not SSA form anymore.
  • 25. Prologue/Epilogue Code Insertion We insert prologue/epilogue code sequence into function beginning/end respectively. All abstract stack frame references are turned into memory address relative to the stack/frame pointer.
  • 26. What We Have Now? C → LLVM IR → IR DAG → Machine DAG → MachineInstr (MI)
  • 27. Code Emission This stage lowers MachineInstr (MI) to MCInst (MC). MachineInstr (MI): Abstraction of target instruction, consists of opcode and operand only. We have MachineFunction and MachineBasicBlock as containers for MI. MCInst (MC): Abstraction of object file. No function, global variable ...etc, but only label, directive and instruction.
  • 28. Machine Code (MC) Layer MC layer works at the level of abstraction of object file. $ llc -show-mc-inst fib.ll $ llc -show-mc-encoding fib.ll
  • 30. What Is The Role of MC?
  • 31. Assembler Relaxation Assmbler can adjust the instructions for two purposes: Correctness: If the operand of instruction is out of bound, either expand the instruction or replace it by a longer one. Optimization: Replace expensive instruction with a cheaper one. This process is called relaxation.
  • 32. Assembler Relaxation Example 1/4 extern int bar(int); int foo(int num) { int t1 = num * 17; if (t1 > bar(num)) return t1 + bar(num); return num * bar(num); } $ clang -S -O2 relax.c $ clang -c relax.s $ objdump -d relax.o
  • 33. Assembler Relaxation Example 2/4 0000000000000000 <foo>: 1a: e8 00 00 00 00 callq 1f <foo+0x1f> 1f: 45 39 f7 cmp %r14d,%r15d 22: 7e 05 jle 29 <foo+0x29> 24: 44 01 f8 add %r15d,%eax 27: eb 03 jmp 2c <foo+0x2c> 29: 0f af c3 imul %ebx,%eax 2c: 48 83 c4 08 add $0x8,%rsp 30: 5b pop %rbx 31: 41 5e pop %r14 33: 41 5f pop %r15 35: 5d pop %rbp 36: c3 retq
  • 34. Assembler Relaxation Example 3/4 callq bar cmpl %r14d, %r15d jle .LBB0_2 .fill 124, 1, 0x90 # nop # BB#1: addl %r15d, %eax jmp .LBB0_3 .LBB0_2: imull %ebx, %eax .LBB0_3: addq $8, %rsp popq %rbx popq %r14 popq %r15 popq %rbp ret
  • 35. Assembler Relaxation Example 4/4 0000000000000000 <foo>: 1a: e8 00 00 00 00 callq 1f <foo+0x1f> 1f: 45 39 f7 cmp %r14d,%r15d 22: 0f 8e 81 00 00 00 jle a9 <foo+0xa9> 28: 90 nop a3: 90 nop a4: 44 01 f8 add %r15d,%eax a7: eb 03 jmp ac <foo+0xac> a9: 0f af c3 imul %ebx,%eax ac: 48 83 c4 08 add $0x8,%rsp b0: 5b pop %rbx b1: 41 5e pop %r14 b3: 41 5f pop %r15 b5: 5d pop %rbp b6: c3 retq
  • 36. Section and Fragment In order to make relaxation process easier, LLVM keeps instructions as a linked list of Fragment not a byte array. MCAssembler::layoutSectionOnce (lib/MC/MCAssembler.cpp)
  • 37. Reference LLVM - Another Toolchain Platform Design and Implementation of a TriCore Backend for the LLVM Compiler Framework Life of an instruction in LLVM Assembler relaxation Instruction Relaxation Intro to the LLVM MC Project The LLVM Assembler & Machine Code Infrastructure Howto: Implementing LLVM Integrated Assembler The Architecture of Open Source Applications - Chapter LLVM The LLVM Target-Independent Code Generator
  • 39. You can download the manuscript and related material from here.