SlideShare a Scribd company logo
Opaque Pointers
Are Coming
Nikita Popov @ LLVM CGO 2022
About Me
● Sr. Software Engineer on Platform Tools team at Red Hat
● I maintain the LLVM Compile-Time Tracker
2
ptr
One PointerType to Rule Them All
i8*
i32*
[0 x i32]*
i8* (i8*)*
%ty*
3
ptr
… apart from address spaces
i8*
i32*
i8 addrspace(1)*
i32 addrspace(1)*
ptr addrspace(1)
4
Why?
5
Pointer element types carry no semantics
define void @test(i32* %p)
6
Pointer element types carry no semantics
define void @test(i32* %p)
⇒ Does not imply that %p is 4-byte aligned
define void @test(i32* aligned 4 %p)
7
Pointer element types carry no semantics
define void @test(i32* %p)
⇒ Does not imply that %p is 4-byte aligned
define void @test(i32* aligned 4 %p)
⇒ Does not imply that %p is 4-byte dereferenceable
define void @test(i32* dereferenceable(4) %p)
8
Pointer element types carry no semantics
define void @test(i32* %p)
⇒ Does not imply that %p is 4-byte aligned
define void @test(i32* aligned 4 %p)
⇒ Does not imply that %p is 4-byte dereferenceable
define void @test(i32* dereferenceable(4) %p)
⇒ Does not imply any aliasing semantics (TBAA metadata does)
9
Pointer element types carry no semantics
define void @test(i32* %p)
⇒ Does not imply that %p is 4-byte aligned
define void @test(i32* aligned 4 %p)
⇒ Does not imply that %p is 4-byte dereferenceable
define void @test(i32* dereferenceable(4) %p)
⇒ Does not imply any aliasing semantics (TBAA metadata does)
⇒ Does not imply it will be accessed as i32!
10
Pointers can be arbitrarily bitcasted
define i64 @test(double* byval(double) %p) {
%p.p0i8 = bitcast double* %p to i8**
store i8* null, i8** %p.p0i8
%p.i64 = bitcast double* %p to i64*
%x = load i64, i64** %p.i64
ret i64 %x
}
11
Only types at certain uses matter
define i64 @test(double* byval(double) %p) {
%p.p0i8 = bitcast double* %p to i8**
store i8* null, i8** %p.p0i8
%p.i64 = bitcast double* %p to i64*
%x = load i64, i64** %p.i64
ret i64 %x
}
12
Only types at certain uses matter
define i64 @test(ptr byval(double) %p) {
store ptr 0.0, ptr %p
%x = load i64, ptr %p
ret i64 %x
}
13
Why?
● Memory usage: Don’t need to store bitcasts
● Compile-time: Don’t need to skip bitcasts in optimizations
14
Compile-Time Improvements (CTMark)
Link to data
Disclaimer: There may be differences in
optimization behavior.
15
Compile-Time Improvements (rustc)
Link to data
Disclaimer: There may be differences in
optimization behavior.
16
Max-RSS Improvements (rustc)
Link to data
Disclaimer: There may be differences in
optimization behavior.
17
Why?
● Memory usage: Don’t need to store bitcasts
● Compile-time: Don’t need to skip bitcasts in optimizations
● Performance:
○ Optimizations should ignore pointer bitcasts
18
Why?
● Memory usage: Don’t need to store bitcasts
● Compile-time: Don’t need to skip bitcasts in optimizations
● Performance:
○ Optimizations should ignore pointer bitcasts
○ …and many do (e.g. cost models says they’re free)
19
Why?
● Memory usage: Don’t need to store bitcasts
● Compile-time: Don’t need to skip bitcasts in optimizations
● Performance:
○ Optimizations should ignore pointer bitcasts
○ …and many do (e.g. cost models says they’re free)
○ …but many don’t (e.g. limited instruction/use walks)
20
Why?
● Memory usage: Don’t need to store bitcasts
● Compile-time: Don’t need to skip bitcasts in optimizations
● Performance:
○ Bitcasts can’t affect optimization if they don’t exist
21
Equivalence modulo pointer type
define i32* @test(i8** %p) {
store i8* null, i8** %p
%p.i32 = bitcast i8** %p to i32**
%v = load i32*, i32** %p.i32
ret i32* %v
}
EarlyCSE can’t optimize this!
(But full GVN can.)
22
Equivalence modulo pointer type
define ptr @test(ptr %p) {
store ptr null, ptr %p
%v = load ptr, ptr %p
ret ptr %v
}
EarlyCSE can optimize this!
23
Equivalence modulo pointer type
define ptr @test(ptr %p) {
store ptr null, ptr %p
%v = load ptr, ptr %p
ret ptr %v
}
; RUN: opt -S -early-cse < %s
define ptr @test(ptr %p) {
store ptr null, ptr %p
ret ptr null
}
24
Why?
● Memory usage: Don’t need to store bitcasts
● Compile-time: Don’t need to skip bitcasts in optimizations
● Performance:
○ Bitcasts can’t affect optimization if they don’t exist
○ Pointer element type difference cannot prevent CSE / forwarding / etc.
25
Offset-based reasoning
define internal i32 @add({ i32, i32 }* %p) {
%p0 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 0
%v0 = load i32, i32* %p0
%p1 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 1
%v1 = load i32, i32* %p1
%add = add i32 %v0, %v1
ret i32 %add
}
define i32 @caller({ i32, i32 }* %p) {
%res = call i32 @add({ i32, i32 }* %p)
ret i32 %res
}
26
Offset-based reasoning
; RUN: opt -S -argpromotion < %s
define internal i32 @add(i32 %p.0.val, i32 %p.4.val) {
%add = add i32 %p.0.val, %p.4.val
ret i32 %add
}
define i32 @caller({ i32, i32 }* %p) {
%1 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 0
%p.val = load i32, i32* %1, align 4
%2 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 1
%p.val1 = load i32, i32* %2, align 4
%res = call i32 @add(i32 %p.val, i32 %p.val1)
ret i32 %res
}
27
Offset-based reasoning
; RUN: opt -S -argpromotion < %s
define internal i32 @add(i32 %p.0.val, i32 %p.4.val) {
%add = add i32 %p.0.val, %p.4.val
ret i32 %add
}
define i32 @caller({ i32, i32 }* %p) {
%1 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 0
%p.val = load i32, i32* %1, align 4
%2 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 1
%p.val1 = load i32, i32* %2, align 4
%res = call i32 @add(i32 %p.val, i32 %p.val1)
ret i32 %res
}
Used to be based on GEP indices
28
Getelementptr index ambiguity
; Equivalent despite different indices:
getelementptr { [1 x i32], i32 }, ptr %p, i64 0
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 0
29
Getelementptr index ambiguity
; Equivalent despite different indices:
getelementptr { [1 x i32], i32 }, ptr %p, i64 0
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 0
; Equivalent despite different indices:
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1
30
Getelementptr index ambiguity
; Equivalent despite different indices:
getelementptr { [1 x i32], i32 }, ptr %p, i64 0
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 0
; Equivalent despite different indices:
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1
⇒ Requires careful restriction to ensure uniqueness
⇒ Can’t support bitcasts
31
Getelementptr index ambiguity
; Equivalent despite different indices:
getelementptr { [1 x i32], i32 }, ptr %p, i64 0
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 0
; Equivalent despite different indices:
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1
⇒ Requires careful restriction to ensure uniqueness
⇒ Can’t support bitcasts
⇒ Very hard to ensure correctness with opaque pointers
32
Offset-based reasoning
define internal i32 @add({ i32, i32 }* %p) {
%p0 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 0
%v0 = load i32, i32* %p0 ← Load of i32 at offset 0
%p1 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 1
%v1 = load i32, i32* %p1 ← Load of i32 at offset 4
%add = add i32 %v0, %v1
ret i32 %add
}
define i32 @caller({ i32, i32 }* %p) {
%res = call i32 @add({ i32, i32 }* %p)
ret i32 %res
}
33
Offset-based reasoning
define internal i32 @add({ i32, i32 }* %p) {
%p0 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 0
%v0 = load i32, i32* %p0 ← Load of i32 at offset 0
%p1 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 1
%v1 = load i32, i32* %p1 ← Load of i32 at offset 4
%add = add i32 %v0, %v1
ret i32 %add
}
define i32 @caller({ i32, i32 }* %p) {
%res = call i32 @add({ i32, i32 }* %p)
ret i32 %res
}
Derive “struct type” from access pattern,
rather than IR type information
34
Why?
● Memory usage: Don’t need to store bitcasts
● Compile-time: Don’t need to skip bitcasts in optimizations
● Performance:
○ Bitcasts can’t affect optimization if they don’t exist
○ Pointer element type difference cannot prevent CSE / forwarding / etc.
○ Opaque pointers require/encourage generic offset-based reasoning
35
Why?
● Memory usage: Don’t need to store bitcasts
● Compile-time: Don’t need to skip bitcasts in optimizations
● Performance: …
● Implementation simplification:
○ Don’t need to insert bitcasts all over the place
36
Type-System recursion
%ty = type { %ty* }
⇒ This requires struct types to be mutable
37
Type-System recursion
%ty = type { %ty* }
⇒ This requires struct types to be mutable
%ty = type { ptr }
⇒ All types can be immutable
38
Why?
● Memory usage: Don’t need to store bitcasts
● Compile-time: Don’t need to skip bitcasts in optimizations
● Performance: …
● Implementation simplification:
○ Don’t need to insert bitcasts all over the place
○ Removes recursion from the type system
39
Why?
● Memory usage: Don’t need to store bitcasts
● Compile-time: Don’t need to skip bitcasts in optimizations
● Performance: …
● Implementation simplification:
○ Don’t need to insert bitcasts all over the place
○ Removes recursion from the type system
○ Enables follow-up IR changes to remove more types
40
How?
41
IR changes
Add explicit type where semantically relevant.
load i32* %p
load i32, i32* %p
getelementptr i32* %p, i64 1
getelementptr i32, i32* %p, i64 1
define void @test(i32* byval %p)
define void @test(i32* byval(i32) %p)
42
Type::getPointerElementType()
43
Type::getPointerElementType()
44
Code changes
Use value types:
Load->getPointerOperandType()->getPointerElementType()
⇒ Load->getType()
Store->getPointerOperandType()->getPointerElementType()
⇒ Store->getValueOperand()->getType()
Global->getType()->getPointerElementType()
⇒ Global->getValueType()
Call->getType()->getPointerElementType()
⇒ Call->getFunctionType()
45
Migration helpers
assert(Load->getType() ==
Load->getPointerOperandType()->getPointerElementType());
⇒
assert(cast<PointerType>(Load->getPointerOperandType())
->isOpaqueOrPointeeTypeEquals(Load->getType()));
46
Code changes
PointerType::get(ElemTy, AS) still works in opaque pointer mode!
The element type is simply ignored.
47
Code changes
PointerType::get(ElemTy, AS) still works in opaque pointer mode!
The element type is simply ignored.
⇒ As long as getPointerElementType() is not called,
code usually “just works” in opaque pointer mode.
48
Pointer equality does not imply access type equality
define ptr @test(ptr %p) {
store i32 0, ptr %p
%v = load i64, ptr %p
ret ptr %v
}
49
Pointer equality does not imply access type equality
define ptr @test(ptr %p) {
store i32 0, ptr %p
%v = load i64, ptr %p
ret ptr %v
}
Need to explicitly check that load type == store type.
Not implied by same pointer operand anymore!
50
Frontends
Need to track pointer element types in their own structures now – can’t rely on
LLVM PointerType!
51
Frontends
Need to track pointer element types in their own structures now – can’t rely on
LLVM PointerType!
Clang: Address, LValue, RValue store pointer element type now.
52
Opaque pointer mode
Automatically enabled if you use ptr in IR or bitcode.
53
Opaque pointer mode
Automatically enabled if you use ptr in IR or bitcode.
Manually enabled with -opaque-pointers.
define i32* @test(i32* %p)
ret i32* %p
}
54
Opaque pointer mode
Automatically enabled if you use ptr in IR or bitcode.
Manually enabled with -opaque-pointers.
define i32* @test(i32* %p)
ret i32* %p
}
; RUN: opt -S -opaque-pointers < %s
define ptr @test(ptr %p)
ret ptr %p
}
55
Opaque pointer mode
Automatically enabled if you use ptr in IR or bitcode.
Manually enabled with -opaque-pointers.
Upgrading (very old) bitcode to opaque pointers is supported!
56
Migration
● Proposed by David Blaikie in 2015
57
Migration
● Proposed by David Blaikie in 2015
● Massive migration scope, with many hundreds of direct and indirect pointer
element type uses across the code base.
○ Some trivial to remove.
○ Some require IR changes or full transform rewrites.
58
Migration
● Proposed by David Blaikie in 2015
● Massive migration scope, with many hundreds of direct and indirect pointer
element type uses across the code base.
○ Some trivial to remove.
○ Some require IR changes or full transform rewrites.
● Opaque pointer type ptr only introduced in 2021 by Arthur Eubanks
59
Migration
● Proposed by David Blaikie in 2015
● Massive migration scope, with many hundreds of direct and indirect pointer
element type uses across the code base.
○ Some trivial to remove.
○ Some require IR changes or full transform rewrites.
● Opaque pointer type ptr only introduced in 2021 by Arthur Eubanks
● Now (end of March 2022): All pointer element type accesses in LLVM and
Clang eradicated.
60
Migration
● Proposed by David Blaikie in 2015
● Massive migration scope, with many hundreds of direct and indirect pointer
element type uses across the code base.
○ Some trivial to remove.
○ Some require IR changes or full transform rewrites.
● Opaque pointer type ptr only introduced in 2021 by Arthur Eubanks
● Now (end of March 2022): All pointer element type accesses in LLVM and
Clang eradicated.
● Next step: Enable opaque pointers by default.
○ Caveat: Requires updating ~7k tests.
61
Migration
● Proposed by David Blaikie in 2015
● Massive migration scope, with many hundreds of direct and indirect pointer
element type uses across the code base.
○ Some trivial to remove.
○ Some require IR changes or full transform rewrites.
● Opaque pointer type ptr only introduced in 2021 by Arthur Eubanks
● Now (end of March 2022): All pointer element type accesses in LLVM and
Clang eradicated.
● Next step: Enable opaque pointers by default.
○ Caveat: Requires updating ~7k tests.
● Typed pointers expected to be removed after LLVM 15 branch.
62
Future: Type-less GEP
All of these are equivalent:
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1
63
Future: Type-less GEP
All of these are equivalent:
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1
getelementptr i32, ptr %p, i64 1
getelementptr i8, ptr %p, i64 4
64
Future: Type-less GEP
All of these are equivalent:
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1
getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1
getelementptr i32, ptr %p, i64 1
getelementptr i8, ptr %p, i64 4
Offset-based algorithms will realize these are equivalent, but…
65
Future: Type-less GEP
Nothing in the -O3 pipeline realizes that %p1 and %p2 can be CSEd:
define void @test(ptr %p) {
%p1 = getelementptr i8, ptr %p, i64 4
%p2 = getelementptr i32, ptr %p, i64 1
call void @use(ptr %p1, ptr %p2)
ret void
}
66
Future: Type-less GEP
Offset-based GEP makes these trivially equivalent:
define void @test(ptr %p) {
%p1 = getelementptr ptr %p, i64 4
%p2 = getelementptr ptr %p, i64 4
call void @use(ptr %p1, ptr %p2)
ret void
}
67
Future: Type-less GEP
How far should we go?
%p.idx = getelementptr ptr %p, 4 * i64 %idx
; or
%off = shl i64 %idx, 2
%p.idx = getelementptr ptr %p, i64 %off
68
The End
● Docs: https://llvm.org/docs/OpaquePointers.html
● Reach me at:
○ npopov@redhat.com
○ https://twitter.com/nikita_ppv
69

More Related Content

What's hot

Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)
Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)
Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)
Shinya Takamaeda-Y
 
Qemu Pcie
Qemu PcieQemu Pcie
Profiling the ACPICA Namespace and Event Handing
Profiling the ACPICA Namespace and Event HandingProfiling the ACPICA Namespace and Event Handing
Profiling the ACPICA Namespace and Event Handing
SUSE Labs Taipei
 
Part-1 : Mastering microcontroller with embedded driver development
Part-1 : Mastering microcontroller with embedded driver development Part-1 : Mastering microcontroller with embedded driver development
Part-1 : Mastering microcontroller with embedded driver development
FastBit Embedded Brain Academy
 
[COSCUP 2020] How to use llvm frontend library-libtooling
[COSCUP 2020] How to use llvm frontend library-libtooling[COSCUP 2020] How to use llvm frontend library-libtooling
[COSCUP 2020] How to use llvm frontend library-libtooling
Douglas Chen
 
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMUSFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
Linaro
 
Kernel Recipes 2015: Representing device-tree peripherals in ACPI
Kernel Recipes 2015: Representing device-tree peripherals in ACPIKernel Recipes 2015: Representing device-tree peripherals in ACPI
Kernel Recipes 2015: Representing device-tree peripherals in ACPI
Anne Nicolas
 
eBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to UserspaceeBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to Userspace
SUSE Labs Taipei
 
Embedded C
Embedded CEmbedded C
Virtual machine and javascript engine
Virtual machine and javascript engineVirtual machine and javascript engine
Virtual machine and javascript engine
Duoyi Wu
 
verilog code
verilog codeverilog code
verilog code
Mantra VLSI
 
Glibc内存管理ptmalloc源代码分析4
Glibc内存管理ptmalloc源代码分析4Glibc内存管理ptmalloc源代码分析4
Glibc内存管理ptmalloc源代码分析4hans511002
 
JVM Mechanics: Understanding the JIT's Tricks
JVM Mechanics: Understanding the JIT's TricksJVM Mechanics: Understanding the JIT's Tricks
JVM Mechanics: Understanding the JIT's Tricks
Doug Hawkins
 
[嵌入式系統] 嵌入式系統進階
[嵌入式系統] 嵌入式系統進階[嵌入式系統] 嵌入式系統進階
[嵌入式系統] 嵌入式系統進階
Simen Li
 
Linux kernel debugging
Linux kernel debuggingLinux kernel debugging
Linux kernel debugging
libfetion
 
Actor Model and C++: what, why and how?
Actor Model and C++: what, why and how?Actor Model and C++: what, why and how?
Actor Model and C++: what, why and how?
Yauheni Akhotnikau
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
Brendan Gregg
 
Embedded_Linux_Booting
Embedded_Linux_BootingEmbedded_Linux_Booting
Embedded_Linux_Booting
Rashila Rr
 
[143] Modern C++ 무조건 써야 해?
[143] Modern C++ 무조건 써야 해?[143] Modern C++ 무조건 써야 해?
[143] Modern C++ 무조건 써야 해?
NAVER D2
 
from Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Worksfrom Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Works
Zhen Wei
 

What's hot (20)

Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)
Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)
Debian Linux on Zynq (Xilinx ARM-SoC FPGA) Setup Flow (Vivado 2015.4)
 
Qemu Pcie
Qemu PcieQemu Pcie
Qemu Pcie
 
Profiling the ACPICA Namespace and Event Handing
Profiling the ACPICA Namespace and Event HandingProfiling the ACPICA Namespace and Event Handing
Profiling the ACPICA Namespace and Event Handing
 
Part-1 : Mastering microcontroller with embedded driver development
Part-1 : Mastering microcontroller with embedded driver development Part-1 : Mastering microcontroller with embedded driver development
Part-1 : Mastering microcontroller with embedded driver development
 
[COSCUP 2020] How to use llvm frontend library-libtooling
[COSCUP 2020] How to use llvm frontend library-libtooling[COSCUP 2020] How to use llvm frontend library-libtooling
[COSCUP 2020] How to use llvm frontend library-libtooling
 
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMUSFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
 
Kernel Recipes 2015: Representing device-tree peripherals in ACPI
Kernel Recipes 2015: Representing device-tree peripherals in ACPIKernel Recipes 2015: Representing device-tree peripherals in ACPI
Kernel Recipes 2015: Representing device-tree peripherals in ACPI
 
eBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to UserspaceeBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to Userspace
 
Embedded C
Embedded CEmbedded C
Embedded C
 
Virtual machine and javascript engine
Virtual machine and javascript engineVirtual machine and javascript engine
Virtual machine and javascript engine
 
verilog code
verilog codeverilog code
verilog code
 
Glibc内存管理ptmalloc源代码分析4
Glibc内存管理ptmalloc源代码分析4Glibc内存管理ptmalloc源代码分析4
Glibc内存管理ptmalloc源代码分析4
 
JVM Mechanics: Understanding the JIT's Tricks
JVM Mechanics: Understanding the JIT's TricksJVM Mechanics: Understanding the JIT's Tricks
JVM Mechanics: Understanding the JIT's Tricks
 
[嵌入式系統] 嵌入式系統進階
[嵌入式系統] 嵌入式系統進階[嵌入式系統] 嵌入式系統進階
[嵌入式系統] 嵌入式系統進階
 
Linux kernel debugging
Linux kernel debuggingLinux kernel debugging
Linux kernel debugging
 
Actor Model and C++: what, why and how?
Actor Model and C++: what, why and how?Actor Model and C++: what, why and how?
Actor Model and C++: what, why and how?
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
 
Embedded_Linux_Booting
Embedded_Linux_BootingEmbedded_Linux_Booting
Embedded_Linux_Booting
 
[143] Modern C++ 무조건 써야 해?
[143] Modern C++ 무조건 써야 해?[143] Modern C++ 무조건 써야 해?
[143] Modern C++ 무조건 써야 해?
 
from Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Worksfrom Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Works
 

Similar to Opaque Pointers Are Coming

LSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF Observability
Brendan Gregg
 
Exploiting the Linux Kernel via Intel's SYSRET Implementation
Exploiting the Linux Kernel via Intel's SYSRET ImplementationExploiting the Linux Kernel via Intel's SYSRET Implementation
Exploiting the Linux Kernel via Intel's SYSRET Implementation
nkslides
 
Optimizing Parallel Reduction in CUDA : NOTES
Optimizing Parallel Reduction in CUDA : NOTESOptimizing Parallel Reduction in CUDA : NOTES
Optimizing Parallel Reduction in CUDA : NOTES
Subhajit Sahu
 
Reduction
ReductionReduction
Reduction
Wei Shen
 
Boosting Developer Productivity with Clang
Boosting Developer Productivity with ClangBoosting Developer Productivity with Clang
Boosting Developer Productivity with Clang
Samsung Open Source Group
 
Comparing On-The-Fly Accelerating Packages: Numba, TensorFlow, Dask, etc
Comparing On-The-Fly Accelerating Packages: Numba, TensorFlow, Dask, etcComparing On-The-Fly Accelerating Packages: Numba, TensorFlow, Dask, etc
Comparing On-The-Fly Accelerating Packages: Numba, TensorFlow, Dask, etc
Yukio Okuda
 
Fedor Polyakov - Optimizing computer vision problems on mobile platforms
Fedor Polyakov - Optimizing computer vision problems on mobile platforms Fedor Polyakov - Optimizing computer vision problems on mobile platforms
Fedor Polyakov - Optimizing computer vision problems on mobile platforms
Eastern European Computer Vision Conference
 
pointers CP Lecture.ppt
pointers CP Lecture.pptpointers CP Lecture.ppt
pointers CP Lecture.ppt
EC42ShaikhAmaan
 
0-Slot11-12-Pointers.pdf
0-Slot11-12-Pointers.pdf0-Slot11-12-Pointers.pdf
0-Slot11-12-Pointers.pdf
ssusere19c741
 
Python Objects
Python ObjectsPython Objects
Python Objects
Quintagroup
 
SimpleArray between Python and C++
SimpleArray between Python and C++SimpleArray between Python and C++
SimpleArray between Python and C++
Yung-Yu Chen
 
LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
LAS16-501: Introduction to LLVM - Projects, Components, Integration, InternalsLAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
Linaro
 
TensorFlow Quantization Tour
TensorFlow Quantization TourTensorFlow Quantization Tour
TensorFlow Quantization Tour
KSuzukiii
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Marina Kolpakova
 
1. C Basics for Data Structures Bridge Course
1. C Basics for Data Structures   Bridge Course1. C Basics for Data Structures   Bridge Course
MuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CMuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for C
Susumu Tokumoto
 
Specializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network StackSpecializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network Stack
Kernel TLV
 
Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987
乐群 陈
 
A taste of GlobalISel
A taste of GlobalISelA taste of GlobalISel
A taste of GlobalISel
Igalia
 
C++_notes.pdf
C++_notes.pdfC++_notes.pdf
C++_notes.pdf
HimanshuSharma997566
 

Similar to Opaque Pointers Are Coming (20)

LSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF Observability
 
Exploiting the Linux Kernel via Intel's SYSRET Implementation
Exploiting the Linux Kernel via Intel's SYSRET ImplementationExploiting the Linux Kernel via Intel's SYSRET Implementation
Exploiting the Linux Kernel via Intel's SYSRET Implementation
 
Optimizing Parallel Reduction in CUDA : NOTES
Optimizing Parallel Reduction in CUDA : NOTESOptimizing Parallel Reduction in CUDA : NOTES
Optimizing Parallel Reduction in CUDA : NOTES
 
Reduction
ReductionReduction
Reduction
 
Boosting Developer Productivity with Clang
Boosting Developer Productivity with ClangBoosting Developer Productivity with Clang
Boosting Developer Productivity with Clang
 
Comparing On-The-Fly Accelerating Packages: Numba, TensorFlow, Dask, etc
Comparing On-The-Fly Accelerating Packages: Numba, TensorFlow, Dask, etcComparing On-The-Fly Accelerating Packages: Numba, TensorFlow, Dask, etc
Comparing On-The-Fly Accelerating Packages: Numba, TensorFlow, Dask, etc
 
Fedor Polyakov - Optimizing computer vision problems on mobile platforms
Fedor Polyakov - Optimizing computer vision problems on mobile platforms Fedor Polyakov - Optimizing computer vision problems on mobile platforms
Fedor Polyakov - Optimizing computer vision problems on mobile platforms
 
pointers CP Lecture.ppt
pointers CP Lecture.pptpointers CP Lecture.ppt
pointers CP Lecture.ppt
 
0-Slot11-12-Pointers.pdf
0-Slot11-12-Pointers.pdf0-Slot11-12-Pointers.pdf
0-Slot11-12-Pointers.pdf
 
Python Objects
Python ObjectsPython Objects
Python Objects
 
SimpleArray between Python and C++
SimpleArray between Python and C++SimpleArray between Python and C++
SimpleArray between Python and C++
 
LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
LAS16-501: Introduction to LLVM - Projects, Components, Integration, InternalsLAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
LAS16-501: Introduction to LLVM - Projects, Components, Integration, Internals
 
TensorFlow Quantization Tour
TensorFlow Quantization TourTensorFlow Quantization Tour
TensorFlow Quantization Tour
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
 
1. C Basics for Data Structures Bridge Course
1. C Basics for Data Structures   Bridge Course1. C Basics for Data Structures   Bridge Course
1. C Basics for Data Structures Bridge Course
 
MuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CMuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for C
 
Specializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network StackSpecializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network Stack
 
Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987Compiler2016 by abcdabcd987
Compiler2016 by abcdabcd987
 
A taste of GlobalISel
A taste of GlobalISelA taste of GlobalISel
A taste of GlobalISel
 
C++_notes.pdf
C++_notes.pdfC++_notes.pdf
C++_notes.pdf
 

More from Nikita Popov

A whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizerA whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizer
Nikita Popov
 
What's new in PHP 8.0?
What's new in PHP 8.0?What's new in PHP 8.0?
What's new in PHP 8.0?
Nikita Popov
 
Just-In-Time Compiler in PHP 8
Just-In-Time Compiler in PHP 8Just-In-Time Compiler in PHP 8
Just-In-Time Compiler in PHP 8
Nikita Popov
 
What's new in PHP 8.0?
What's new in PHP 8.0?What's new in PHP 8.0?
What's new in PHP 8.0?
Nikita Popov
 
PHP Performance Trivia
PHP Performance TriviaPHP Performance Trivia
PHP Performance Trivia
Nikita Popov
 
Typed Properties and more: What's coming in PHP 7.4?
Typed Properties and more: What's coming in PHP 7.4?Typed Properties and more: What's coming in PHP 7.4?
Typed Properties and more: What's coming in PHP 7.4?
Nikita Popov
 
Static Optimization of PHP bytecode (PHPSC 2017)
Static Optimization of PHP bytecode (PHPSC 2017)Static Optimization of PHP bytecode (PHPSC 2017)
Static Optimization of PHP bytecode (PHPSC 2017)
Nikita Popov
 
PHP Language Trivia
PHP Language TriviaPHP Language Trivia
PHP Language Trivia
Nikita Popov
 
PHP 7 – What changed internally? (Forum PHP 2015)
PHP 7 – What changed internally? (Forum PHP 2015)PHP 7 – What changed internally? (Forum PHP 2015)
PHP 7 – What changed internally? (Forum PHP 2015)
Nikita Popov
 
PHP 7 – What changed internally? (PHP Barcelona 2015)
PHP 7 – What changed internally? (PHP Barcelona 2015)PHP 7 – What changed internally? (PHP Barcelona 2015)
PHP 7 – What changed internally? (PHP Barcelona 2015)
Nikita Popov
 
PHP 7 – What changed internally?
PHP 7 – What changed internally?PHP 7 – What changed internally?
PHP 7 – What changed internally?
Nikita Popov
 

More from Nikita Popov (11)

A whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizerA whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizer
 
What's new in PHP 8.0?
What's new in PHP 8.0?What's new in PHP 8.0?
What's new in PHP 8.0?
 
Just-In-Time Compiler in PHP 8
Just-In-Time Compiler in PHP 8Just-In-Time Compiler in PHP 8
Just-In-Time Compiler in PHP 8
 
What's new in PHP 8.0?
What's new in PHP 8.0?What's new in PHP 8.0?
What's new in PHP 8.0?
 
PHP Performance Trivia
PHP Performance TriviaPHP Performance Trivia
PHP Performance Trivia
 
Typed Properties and more: What's coming in PHP 7.4?
Typed Properties and more: What's coming in PHP 7.4?Typed Properties and more: What's coming in PHP 7.4?
Typed Properties and more: What's coming in PHP 7.4?
 
Static Optimization of PHP bytecode (PHPSC 2017)
Static Optimization of PHP bytecode (PHPSC 2017)Static Optimization of PHP bytecode (PHPSC 2017)
Static Optimization of PHP bytecode (PHPSC 2017)
 
PHP Language Trivia
PHP Language TriviaPHP Language Trivia
PHP Language Trivia
 
PHP 7 – What changed internally? (Forum PHP 2015)
PHP 7 – What changed internally? (Forum PHP 2015)PHP 7 – What changed internally? (Forum PHP 2015)
PHP 7 – What changed internally? (Forum PHP 2015)
 
PHP 7 – What changed internally? (PHP Barcelona 2015)
PHP 7 – What changed internally? (PHP Barcelona 2015)PHP 7 – What changed internally? (PHP Barcelona 2015)
PHP 7 – What changed internally? (PHP Barcelona 2015)
 
PHP 7 – What changed internally?
PHP 7 – What changed internally?PHP 7 – What changed internally?
PHP 7 – What changed internally?
 

Recently uploaded

Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
Microservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we workMicroservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we work
Sven Peters
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Julian Hyde
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
Green Software Development
 
Kubernetes at Scale: Going Multi-Cluster with Istio
Kubernetes at Scale:  Going Multi-Cluster  with IstioKubernetes at Scale:  Going Multi-Cluster  with Istio
Kubernetes at Scale: Going Multi-Cluster with Istio
Severalnines
 
Project Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdfProject Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdf
Karya Keeper
 
WWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders AustinWWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders Austin
Patrick Weigel
 
Modelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - AmsterdamModelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - Amsterdam
Alberto Brandolini
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
Alina Yurenko
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
Quickdice ERP
 
ppt on the brain chip neuralink.pptx
ppt  on   the brain  chip neuralink.pptxppt  on   the brain  chip neuralink.pptx
ppt on the brain chip neuralink.pptx
Reetu63
 
Quarter 3 SLRP grade 9.. gshajsbhhaheabh
Quarter 3 SLRP grade 9.. gshajsbhhaheabhQuarter 3 SLRP grade 9.. gshajsbhhaheabh
Quarter 3 SLRP grade 9.. gshajsbhhaheabh
aisafed42
 
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s EcosystemUI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
Peter Muessig
 
What’s New in Odoo 17 – A Complete Roadmap
What’s New in Odoo 17 – A Complete RoadmapWhat’s New in Odoo 17 – A Complete Roadmap
What’s New in Odoo 17 – A Complete Roadmap
Envertis Software Solutions
 
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLESINTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
anfaltahir1010
 
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
kalichargn70th171
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
The Third Creative Media
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid
 
Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !
Marcin Chrost
 
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
kalichargn70th171
 

Recently uploaded (20)

Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
Microservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we workMicroservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we work
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
 
Kubernetes at Scale: Going Multi-Cluster with Istio
Kubernetes at Scale:  Going Multi-Cluster  with IstioKubernetes at Scale:  Going Multi-Cluster  with Istio
Kubernetes at Scale: Going Multi-Cluster with Istio
 
Project Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdfProject Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdf
 
WWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders AustinWWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders Austin
 
Modelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - AmsterdamModelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - Amsterdam
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
 
ppt on the brain chip neuralink.pptx
ppt  on   the brain  chip neuralink.pptxppt  on   the brain  chip neuralink.pptx
ppt on the brain chip neuralink.pptx
 
Quarter 3 SLRP grade 9.. gshajsbhhaheabh
Quarter 3 SLRP grade 9.. gshajsbhhaheabhQuarter 3 SLRP grade 9.. gshajsbhhaheabh
Quarter 3 SLRP grade 9.. gshajsbhhaheabh
 
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s EcosystemUI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
 
What’s New in Odoo 17 – A Complete Roadmap
What’s New in Odoo 17 – A Complete RoadmapWhat’s New in Odoo 17 – A Complete Roadmap
What’s New in Odoo 17 – A Complete Roadmap
 
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLESINTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
 
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
 
Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !
 
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
 

Opaque Pointers Are Coming

  • 1. Opaque Pointers Are Coming Nikita Popov @ LLVM CGO 2022
  • 2. About Me ● Sr. Software Engineer on Platform Tools team at Red Hat ● I maintain the LLVM Compile-Time Tracker 2
  • 3. ptr One PointerType to Rule Them All i8* i32* [0 x i32]* i8* (i8*)* %ty* 3
  • 4. ptr … apart from address spaces i8* i32* i8 addrspace(1)* i32 addrspace(1)* ptr addrspace(1) 4
  • 6. Pointer element types carry no semantics define void @test(i32* %p) 6
  • 7. Pointer element types carry no semantics define void @test(i32* %p) ⇒ Does not imply that %p is 4-byte aligned define void @test(i32* aligned 4 %p) 7
  • 8. Pointer element types carry no semantics define void @test(i32* %p) ⇒ Does not imply that %p is 4-byte aligned define void @test(i32* aligned 4 %p) ⇒ Does not imply that %p is 4-byte dereferenceable define void @test(i32* dereferenceable(4) %p) 8
  • 9. Pointer element types carry no semantics define void @test(i32* %p) ⇒ Does not imply that %p is 4-byte aligned define void @test(i32* aligned 4 %p) ⇒ Does not imply that %p is 4-byte dereferenceable define void @test(i32* dereferenceable(4) %p) ⇒ Does not imply any aliasing semantics (TBAA metadata does) 9
  • 10. Pointer element types carry no semantics define void @test(i32* %p) ⇒ Does not imply that %p is 4-byte aligned define void @test(i32* aligned 4 %p) ⇒ Does not imply that %p is 4-byte dereferenceable define void @test(i32* dereferenceable(4) %p) ⇒ Does not imply any aliasing semantics (TBAA metadata does) ⇒ Does not imply it will be accessed as i32! 10
  • 11. Pointers can be arbitrarily bitcasted define i64 @test(double* byval(double) %p) { %p.p0i8 = bitcast double* %p to i8** store i8* null, i8** %p.p0i8 %p.i64 = bitcast double* %p to i64* %x = load i64, i64** %p.i64 ret i64 %x } 11
  • 12. Only types at certain uses matter define i64 @test(double* byval(double) %p) { %p.p0i8 = bitcast double* %p to i8** store i8* null, i8** %p.p0i8 %p.i64 = bitcast double* %p to i64* %x = load i64, i64** %p.i64 ret i64 %x } 12
  • 13. Only types at certain uses matter define i64 @test(ptr byval(double) %p) { store ptr 0.0, ptr %p %x = load i64, ptr %p ret i64 %x } 13
  • 14. Why? ● Memory usage: Don’t need to store bitcasts ● Compile-time: Don’t need to skip bitcasts in optimizations 14
  • 15. Compile-Time Improvements (CTMark) Link to data Disclaimer: There may be differences in optimization behavior. 15
  • 16. Compile-Time Improvements (rustc) Link to data Disclaimer: There may be differences in optimization behavior. 16
  • 17. Max-RSS Improvements (rustc) Link to data Disclaimer: There may be differences in optimization behavior. 17
  • 18. Why? ● Memory usage: Don’t need to store bitcasts ● Compile-time: Don’t need to skip bitcasts in optimizations ● Performance: ○ Optimizations should ignore pointer bitcasts 18
  • 19. Why? ● Memory usage: Don’t need to store bitcasts ● Compile-time: Don’t need to skip bitcasts in optimizations ● Performance: ○ Optimizations should ignore pointer bitcasts ○ …and many do (e.g. cost models says they’re free) 19
  • 20. Why? ● Memory usage: Don’t need to store bitcasts ● Compile-time: Don’t need to skip bitcasts in optimizations ● Performance: ○ Optimizations should ignore pointer bitcasts ○ …and many do (e.g. cost models says they’re free) ○ …but many don’t (e.g. limited instruction/use walks) 20
  • 21. Why? ● Memory usage: Don’t need to store bitcasts ● Compile-time: Don’t need to skip bitcasts in optimizations ● Performance: ○ Bitcasts can’t affect optimization if they don’t exist 21
  • 22. Equivalence modulo pointer type define i32* @test(i8** %p) { store i8* null, i8** %p %p.i32 = bitcast i8** %p to i32** %v = load i32*, i32** %p.i32 ret i32* %v } EarlyCSE can’t optimize this! (But full GVN can.) 22
  • 23. Equivalence modulo pointer type define ptr @test(ptr %p) { store ptr null, ptr %p %v = load ptr, ptr %p ret ptr %v } EarlyCSE can optimize this! 23
  • 24. Equivalence modulo pointer type define ptr @test(ptr %p) { store ptr null, ptr %p %v = load ptr, ptr %p ret ptr %v } ; RUN: opt -S -early-cse < %s define ptr @test(ptr %p) { store ptr null, ptr %p ret ptr null } 24
  • 25. Why? ● Memory usage: Don’t need to store bitcasts ● Compile-time: Don’t need to skip bitcasts in optimizations ● Performance: ○ Bitcasts can’t affect optimization if they don’t exist ○ Pointer element type difference cannot prevent CSE / forwarding / etc. 25
  • 26. Offset-based reasoning define internal i32 @add({ i32, i32 }* %p) { %p0 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 0 %v0 = load i32, i32* %p0 %p1 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 1 %v1 = load i32, i32* %p1 %add = add i32 %v0, %v1 ret i32 %add } define i32 @caller({ i32, i32 }* %p) { %res = call i32 @add({ i32, i32 }* %p) ret i32 %res } 26
  • 27. Offset-based reasoning ; RUN: opt -S -argpromotion < %s define internal i32 @add(i32 %p.0.val, i32 %p.4.val) { %add = add i32 %p.0.val, %p.4.val ret i32 %add } define i32 @caller({ i32, i32 }* %p) { %1 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 0 %p.val = load i32, i32* %1, align 4 %2 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 1 %p.val1 = load i32, i32* %2, align 4 %res = call i32 @add(i32 %p.val, i32 %p.val1) ret i32 %res } 27
  • 28. Offset-based reasoning ; RUN: opt -S -argpromotion < %s define internal i32 @add(i32 %p.0.val, i32 %p.4.val) { %add = add i32 %p.0.val, %p.4.val ret i32 %add } define i32 @caller({ i32, i32 }* %p) { %1 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 0 %p.val = load i32, i32* %1, align 4 %2 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 1 %p.val1 = load i32, i32* %2, align 4 %res = call i32 @add(i32 %p.val, i32 %p.val1) ret i32 %res } Used to be based on GEP indices 28
  • 29. Getelementptr index ambiguity ; Equivalent despite different indices: getelementptr { [1 x i32], i32 }, ptr %p, i64 0 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 0 29
  • 30. Getelementptr index ambiguity ; Equivalent despite different indices: getelementptr { [1 x i32], i32 }, ptr %p, i64 0 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 0 ; Equivalent despite different indices: getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1 30
  • 31. Getelementptr index ambiguity ; Equivalent despite different indices: getelementptr { [1 x i32], i32 }, ptr %p, i64 0 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 0 ; Equivalent despite different indices: getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1 ⇒ Requires careful restriction to ensure uniqueness ⇒ Can’t support bitcasts 31
  • 32. Getelementptr index ambiguity ; Equivalent despite different indices: getelementptr { [1 x i32], i32 }, ptr %p, i64 0 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 0 ; Equivalent despite different indices: getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1 ⇒ Requires careful restriction to ensure uniqueness ⇒ Can’t support bitcasts ⇒ Very hard to ensure correctness with opaque pointers 32
  • 33. Offset-based reasoning define internal i32 @add({ i32, i32 }* %p) { %p0 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 0 %v0 = load i32, i32* %p0 ← Load of i32 at offset 0 %p1 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 1 %v1 = load i32, i32* %p1 ← Load of i32 at offset 4 %add = add i32 %v0, %v1 ret i32 %add } define i32 @caller({ i32, i32 }* %p) { %res = call i32 @add({ i32, i32 }* %p) ret i32 %res } 33
  • 34. Offset-based reasoning define internal i32 @add({ i32, i32 }* %p) { %p0 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 0 %v0 = load i32, i32* %p0 ← Load of i32 at offset 0 %p1 = getelementptr { i32, i32 }, { i32, i32 }* %p, i64 0, i32 1 %v1 = load i32, i32* %p1 ← Load of i32 at offset 4 %add = add i32 %v0, %v1 ret i32 %add } define i32 @caller({ i32, i32 }* %p) { %res = call i32 @add({ i32, i32 }* %p) ret i32 %res } Derive “struct type” from access pattern, rather than IR type information 34
  • 35. Why? ● Memory usage: Don’t need to store bitcasts ● Compile-time: Don’t need to skip bitcasts in optimizations ● Performance: ○ Bitcasts can’t affect optimization if they don’t exist ○ Pointer element type difference cannot prevent CSE / forwarding / etc. ○ Opaque pointers require/encourage generic offset-based reasoning 35
  • 36. Why? ● Memory usage: Don’t need to store bitcasts ● Compile-time: Don’t need to skip bitcasts in optimizations ● Performance: … ● Implementation simplification: ○ Don’t need to insert bitcasts all over the place 36
  • 37. Type-System recursion %ty = type { %ty* } ⇒ This requires struct types to be mutable 37
  • 38. Type-System recursion %ty = type { %ty* } ⇒ This requires struct types to be mutable %ty = type { ptr } ⇒ All types can be immutable 38
  • 39. Why? ● Memory usage: Don’t need to store bitcasts ● Compile-time: Don’t need to skip bitcasts in optimizations ● Performance: … ● Implementation simplification: ○ Don’t need to insert bitcasts all over the place ○ Removes recursion from the type system 39
  • 40. Why? ● Memory usage: Don’t need to store bitcasts ● Compile-time: Don’t need to skip bitcasts in optimizations ● Performance: … ● Implementation simplification: ○ Don’t need to insert bitcasts all over the place ○ Removes recursion from the type system ○ Enables follow-up IR changes to remove more types 40
  • 42. IR changes Add explicit type where semantically relevant. load i32* %p load i32, i32* %p getelementptr i32* %p, i64 1 getelementptr i32, i32* %p, i64 1 define void @test(i32* byval %p) define void @test(i32* byval(i32) %p) 42
  • 45. Code changes Use value types: Load->getPointerOperandType()->getPointerElementType() ⇒ Load->getType() Store->getPointerOperandType()->getPointerElementType() ⇒ Store->getValueOperand()->getType() Global->getType()->getPointerElementType() ⇒ Global->getValueType() Call->getType()->getPointerElementType() ⇒ Call->getFunctionType() 45
  • 47. Code changes PointerType::get(ElemTy, AS) still works in opaque pointer mode! The element type is simply ignored. 47
  • 48. Code changes PointerType::get(ElemTy, AS) still works in opaque pointer mode! The element type is simply ignored. ⇒ As long as getPointerElementType() is not called, code usually “just works” in opaque pointer mode. 48
  • 49. Pointer equality does not imply access type equality define ptr @test(ptr %p) { store i32 0, ptr %p %v = load i64, ptr %p ret ptr %v } 49
  • 50. Pointer equality does not imply access type equality define ptr @test(ptr %p) { store i32 0, ptr %p %v = load i64, ptr %p ret ptr %v } Need to explicitly check that load type == store type. Not implied by same pointer operand anymore! 50
  • 51. Frontends Need to track pointer element types in their own structures now – can’t rely on LLVM PointerType! 51
  • 52. Frontends Need to track pointer element types in their own structures now – can’t rely on LLVM PointerType! Clang: Address, LValue, RValue store pointer element type now. 52
  • 53. Opaque pointer mode Automatically enabled if you use ptr in IR or bitcode. 53
  • 54. Opaque pointer mode Automatically enabled if you use ptr in IR or bitcode. Manually enabled with -opaque-pointers. define i32* @test(i32* %p) ret i32* %p } 54
  • 55. Opaque pointer mode Automatically enabled if you use ptr in IR or bitcode. Manually enabled with -opaque-pointers. define i32* @test(i32* %p) ret i32* %p } ; RUN: opt -S -opaque-pointers < %s define ptr @test(ptr %p) ret ptr %p } 55
  • 56. Opaque pointer mode Automatically enabled if you use ptr in IR or bitcode. Manually enabled with -opaque-pointers. Upgrading (very old) bitcode to opaque pointers is supported! 56
  • 57. Migration ● Proposed by David Blaikie in 2015 57
  • 58. Migration ● Proposed by David Blaikie in 2015 ● Massive migration scope, with many hundreds of direct and indirect pointer element type uses across the code base. ○ Some trivial to remove. ○ Some require IR changes or full transform rewrites. 58
  • 59. Migration ● Proposed by David Blaikie in 2015 ● Massive migration scope, with many hundreds of direct and indirect pointer element type uses across the code base. ○ Some trivial to remove. ○ Some require IR changes or full transform rewrites. ● Opaque pointer type ptr only introduced in 2021 by Arthur Eubanks 59
  • 60. Migration ● Proposed by David Blaikie in 2015 ● Massive migration scope, with many hundreds of direct and indirect pointer element type uses across the code base. ○ Some trivial to remove. ○ Some require IR changes or full transform rewrites. ● Opaque pointer type ptr only introduced in 2021 by Arthur Eubanks ● Now (end of March 2022): All pointer element type accesses in LLVM and Clang eradicated. 60
  • 61. Migration ● Proposed by David Blaikie in 2015 ● Massive migration scope, with many hundreds of direct and indirect pointer element type uses across the code base. ○ Some trivial to remove. ○ Some require IR changes or full transform rewrites. ● Opaque pointer type ptr only introduced in 2021 by Arthur Eubanks ● Now (end of March 2022): All pointer element type accesses in LLVM and Clang eradicated. ● Next step: Enable opaque pointers by default. ○ Caveat: Requires updating ~7k tests. 61
  • 62. Migration ● Proposed by David Blaikie in 2015 ● Massive migration scope, with many hundreds of direct and indirect pointer element type uses across the code base. ○ Some trivial to remove. ○ Some require IR changes or full transform rewrites. ● Opaque pointer type ptr only introduced in 2021 by Arthur Eubanks ● Now (end of March 2022): All pointer element type accesses in LLVM and Clang eradicated. ● Next step: Enable opaque pointers by default. ○ Caveat: Requires updating ~7k tests. ● Typed pointers expected to be removed after LLVM 15 branch. 62
  • 63. Future: Type-less GEP All of these are equivalent: getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1 63
  • 64. Future: Type-less GEP All of these are equivalent: getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1 getelementptr i32, ptr %p, i64 1 getelementptr i8, ptr %p, i64 4 64
  • 65. Future: Type-less GEP All of these are equivalent: getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 0, i32 0, i64 1 getelementptr { [1 x i32], i32 }, ptr %p, i64 1, i32 0, i64 -1 getelementptr i32, ptr %p, i64 1 getelementptr i8, ptr %p, i64 4 Offset-based algorithms will realize these are equivalent, but… 65
  • 66. Future: Type-less GEP Nothing in the -O3 pipeline realizes that %p1 and %p2 can be CSEd: define void @test(ptr %p) { %p1 = getelementptr i8, ptr %p, i64 4 %p2 = getelementptr i32, ptr %p, i64 1 call void @use(ptr %p1, ptr %p2) ret void } 66
  • 67. Future: Type-less GEP Offset-based GEP makes these trivially equivalent: define void @test(ptr %p) { %p1 = getelementptr ptr %p, i64 4 %p2 = getelementptr ptr %p, i64 4 call void @use(ptr %p1, ptr %p2) ret void } 67
  • 68. Future: Type-less GEP How far should we go? %p.idx = getelementptr ptr %p, 4 * i64 %idx ; or %off = shl i64 %idx, 2 %p.idx = getelementptr ptr %p, i64 %off 68
  • 69. The End ● Docs: https://llvm.org/docs/OpaquePointers.html ● Reach me at: ○ npopov@redhat.com ○ https://twitter.com/nikita_ppv 69