1Samsung Open Source Group
Clang:
Much more than just a C/C++ Compiler
Tilmann Scheller
Principal Compiler Engineer
t.scheller@samsung.com
Samsung Open Source Group
Samsung Research UK
LinuxCon Europe 2016
Berlin, Germany, October 4 – 6, 2016
2Samsung Open Source Group
Overview
● Introduction
● LLVM Overview
● Clang
● Summary
3Samsung Open Source Group
Introduction
4Samsung Open Source Group
What is LLVM?
● Mature, production-quality compiler framework
● Modular architecture
● Heavily optimizing static and dynamic compiler
● Supports all major architectures (x86, ARM, MIPS,
PowerPC, …)
● Powerful link-time optimizations (LTO)
● Permissive license (BSD-like)
5Samsung Open Source Group
LLVM sub-projects
● Clang
C/C++/Objective C frontend and static analyzer
● LLDB
Next generation debugger leveraging the LLVM libraries, e.g. the Clang expression
parser
● lld
Framework for creating linkers, will make Clang independent of the system linker in
the future
● Polly
Polyhedral optimizer for LLVM, e.g. high-level loop optimizations and data-locality
optimizations
6Samsung Open Source Group
Which companies are contributing?
®
7Samsung Open Source Group
Who is using LLVM?
● Rust
● Android (NDK, RenderScript)
● Portable NativeClient (PNaCl)
● Majority of OpenCL implementations based on
Clang/LLVM
● CUDA
● LLVM on Linux: LLVMLinux, LLVMpipe (software
rasterizer in Mesa), AMDGPU drivers in Mesa
8Samsung Open Source Group
Clang users
● Default compiler on macOS
● Default compiler on FreeBSD
● Default compiler for native applications on Tizen
● Default compiler on OpenMandriva Lx 3.0
● Debian experimenting with Clang as an additional
compiler (94.4% of ~24.5k packages successfully built with Clang 3.8.1)
● Android NDK defaults to Clang
9Samsung Open Source Group
LLVM Overview
10Samsung Open Source Group
LLVM
● LLVM IR (Intermediate Representation)
● Scalar optimizations
● Interprocedural optimizations
● Auto-vectorizer (BB, Loop and SLP)
● Profile-guided optimizations
11Samsung Open Source Group
Compiler architecture
C Frontend
C++ Frontend
Fortran Frontend
Optimizer
x86 Backend
ARM Backend
MIPS Backend
12Samsung Open Source Group
Compilation steps
● Many steps involved in the translation from C source code to machine code:
– Frontend:
● Lexing, Parsing, AST (Abstract Syntax Tree) construction
● Translation to LLVM IR
– Middle-end
● Target-independent optimizations (Analyses & Transformations)
– Backend:
●
Translation into a DAG (Directed Acyclic Graph)
●
Instruction selection: Pattern matching on the DAG
● Instruction scheduling: Assigning an order of execution
● Register allocation: Trying to reduce memory traffic
13Samsung Open Source Group
LLVM Intermediate Representation
● The representation of the middle-end
● The majority of optimizations is done at LLVM IR level
● Low-level representation which carries type information
● RISC-like three-address code in static single assignment
form (SSA) with an infinite number of virtual registers
● Three different formats: bitcode (compact on-disk format),
in-memory representation and textual representation
(LLVM assembly language)
14Samsung Open Source Group
LLVM IR Overview
● Arithmetic: add, sub, mul, udiv, sdiv, ...
– %tmp = add i32 %indvar, -512
● Logical operations: shl, lshr, ashr, and, or, xor
– %shr21 = ashr i32 %mul20, 8
● Memory access: load, store, alloca, getelementptr
– %tmp3 = load i64* %tmp2
● Comparison: icmp, select
– %cmp12 = icmp slt i32 %add, 1024
● Control flow: call, ret, br, switch, ...
– call void @foo(i32 %phitmp)
● Types: integer, floating point, vector, structure, array, ...
– i32, i342, double, <4 x float>, {i8, <2 x i16>}, [40 x i32]
15Samsung Open Source Group
Target-independent code generator
● Part of the backend
● Domain specific language to describe the instruction set,
register file, calling conventions (TableGen)
● Pattern matcher is generated automatically
● Backend is a mix of C++ and TableGen
● Usually generates assembly code, direct machine code
emission is also possible
16Samsung Open Source Group
Clang
17Samsung Open Source Group
Clang
● Goals:
– Fast compile time
– Low memory usage
– GCC compatibility
– Expressive diagnostics
● Several tools built on top of Clang:
– Clang static analyzer
– clang-format, clang-tidy
18Samsung Open Source Group
Clang Static Analyzer
● Part of Clang
● Tries to find bugs without executing the program
● Slower than compilation
● False positives
● Source annotations
● Works best on C code
● Runs from the commandline (scan-build), web interface
for results
19Samsung Open Source Group
Clang Static Analyzer
● Core Checkers
● C++ Checkers
● Dead Code Checkers
● Security Checkers
● Unix Checkers
20Samsung Open Source Group
Clang Static Analyzer
21Samsung Open Source Group
Clang Static Analyzer
22Samsung Open Source Group
Clang Static Analyzer - Example
...
23Samsung Open Source Group
Clang Static Analyzer - Example
24Samsung Open Source Group
Clang Static Analyzer - Example
25Samsung Open Source Group
clang-format
● Automatic code formatting
● Consistent coding style is important
● Developers spend a lot of time on code formatting (e.g.
requesting trivial formatting changes in reviews)
● Supports different coding conventions (~80 settings)
● Includes configurations for LLVM, Google, Chromium,
Mozilla and WebKit coding conventions
26Samsung Open Source Group
clang-format
● Once the codebase is "clang-format clean" the coding
conventions can be enforced automatically
● Simplifies reformatting after automated refactorings
● Uses the Clang lexer
● Supports the following programming languages: C/C++,
Java, JavaScript, Objective-C and Protobuf
27Samsung Open Source Group
clang-tidy
● Detect bug prone coding patterns
● Enforce coding conventions
● Advocate modern and maintainable code
● Checks can be more expensive than compilation
● Currently 136 different checks
● Can run static analyzer checks as well
28Samsung Open Source Group
Sanitizers
● LLVM/Clang-based Sanitizer projects:
– AddressSanitizer – Fast memory error detector
– ThreadSanitizer – Detects data races
– LeakSanitizer – Memory leak detector
– MemorySanitizer – Detects reads of uninitialized variables
– UBSanitizer – Detects undefined behavior
29Samsung Open Source Group
AddressSanitizer: Stack Buffer Overflow
int main(int argc, char **argv) {
int stack_array[100];
stack_array[1] = 0;
return stack_array[argc + 100];
}
Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html
$ clang++ -O1 -fsanitize=address a.cc; ./a.out
==10589== ERROR: AddressSanitizer stack-buffer-overflow
READ of size 4 at 0x7f5620d981b4 thread T0
#0 0x4024e8 in main a.cc:4
Address 0x7f5620d981b4 is located at offset 436 in frame
<main> of T0's stack:
This frame has 1 object(s):
[32, 432) 'stack_array'
30Samsung Open Source Group
AddressSanitizer: Use-After-Free
int main(int argc, char **argv) {
int *array = new int[100];
delete [] array;
return array[argc];
}
$ clang++ -O1 -fsanitize=address a.cc && ./a.out
==30226== ERROR: AddressSanitizer heap-use-after-free
READ of size 4 at 0x7faa07fce084 thread T0
#0 0x40433c in main a.cc:4
0x7faa07fce084 is located 4 bytes inside of 400-byte region
freed by thread T0 here:
#0 0x4058fd in operator delete[](void*) _asan_rtl_
#1 0x404303 in main a.cc:3
previously allocated by thread T0 here:
#0 0x405579 in operator new[](unsigned long) _asan_rtl_
#1 0x4042f3 in main a.cc:2
Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html
31Samsung Open Source Group
AddressSanitizer: Stack-Use-After-Return
int main() {
LeakLocal();
return *g;
}
$ clang++ -g -fsanitize=address a.cc
$ ASAN_OPTIONS=detect_stack_use_after_return=1 ./a.out
==19177==ERROR: AddressSanitizer: stack-use-after-return
READ of size 4 at 0x7f473d0000a0 thread T0
#0 0x461ccf in main a.cc:8
Address is located in stack of thread T0 at offset 32 in frame
#0 0x461a5f in LeakLocal() a.cc:2
This frame has 1 object(s):
[32, 36) 'local' <== Memory access at offset 32
int *g;
void LeakLocal() {
int local;
g = &local;
}
Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html
32Samsung Open Source Group
MemorySanitizer: Uninitialized Data
int main(int argc, char **argv) {
int x[10];
x[0] = 1;
return x[argc];
}
$ clang -fsanitize=memory a.c -g; ./a.out
WARNING: Use of uninitialized value
#0 0x7f1c31f16d10 in main a.cc:4
Uninitialized value was created by an
allocation of 'x' in the stack frame of
function 'main'
Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html
33Samsung Open Source Group
UBSanitizer: Integer Overflow
int main(int argc, char **argv) {
int t = argc << 16;
return t * t;
}
$ clang -fsanitize=undefined a.cc -g; ./a.out
a.cc:3:12: runtime error:
signed integer overflow: 65536 * 65536
cannot be represented in type 'int'
Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html
34Samsung Open Source Group
UBSanitizer: Invalid Shift
int main(int argc, char **argv) {
return (1 << (32 * argc)) == 0;
}
$ clang -fsanitize=undefined a.cc -g; ./a.out
a.cc:2:13: runtime error: shift exponent 32 is
too large for 32-bit type 'int'
Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html
35Samsung Open Source Group
LibFuzzer
● Coverage-guided fuzz testing
● Coverage data provided by SanitizerCoverage (very low
overhead, tracking of function-level coverage causes no
measurable overhead)
● Best used in combination with the different Sanitizers
● LLVM project has bots which are fuzzing clang-format
and Clang continuously
36Samsung Open Source Group
IDEs/code browsers using Clang
● Code browsers:
– SourceWeb
– Woboq Code Browser
– Coati
– Doxygen
● Editor plugins:
– YouCompleteMe
– clang_complete
– rtags
● IDEs:
– KDevelop
– Qt Creator
– CodeLite
– Geany
37Samsung Open Source Group
Summary
38Samsung Open Source Group
Summary
● Great compiler infrastructure
● Fast C/C++ compiler with expressive diagnostics
● Bug detection at compile time
● Automated formatting of code
● Detect bugs early with Sanitizers
● Highly accurate source code browsing, code completion
39Samsung Open Source Group
Give it a try!
● Visit llvm.org
● Distributions with Clang/LLVM packages:
– Fedora
– Debian/Ubuntu
– openSUSE
– Arch Linux
– ...and many more
Thank you.
40Samsung Open Source Group
41Samsung Open Source Group
Contact Information:
Tilmann Scheller
t.scheller@samsung.com
Samsung Open Source Group
Samsung Research UK
42Samsung Open Source Group
Example
zx = zy = zx2 = zy2 = 0;
for (; iter < max_iter && zx2 + zy2 < 4; iter++) {
zy = 2 * zx * zy + y;
zx = zx2 - zy2 + x;
zx2 = zx * zx;
zy2 = zy * zy;
}
43Samsung Open Source Group
Example
zx = zy = zx2 = zy2 = 0;
for (; iter < max_iter && zx2 + zy2 < 4; iter++) {
zy = 2 * zx * zy + y;
zx = zx2 - zy2 + x;
zx2 = zx * zx;
zy2 = zy * zy;
}
loop:
%zy2.06 = phi double [ %8, %loop ], [ 0.000000e+00, %preheader ]
%zx2.05 = phi double [ %7, %loop ], [ 0.000000e+00, %preheader ]
%zy.04 = phi double [ %4, %loop ], [ 0.000000e+00, %preheader ]
%zx.03 = phi double [ %6, %loop ], [ 0.000000e+00, %preheader ]
%iter.02 = phi i32 [ %9, %loop ], [ 0, %.lr.ph.preheader ]
%2 = fmul double %zx.03, 2.000000e+00
%3 = fmul double %2, %zy.04
%4 = fadd double %3, %y
%5 = fsub double %zx2.05, %zy2.06
%6 = fadd double %5, %x
%7 = fmul double %6, %6
%8 = fmul double %4, %4
%9 = add i32 %iter.02, 1
%10 = icmp ult i32 %9, %max_iter
%11 = fadd double %7, %8
%12 = fcmp olt double %11, 4.000000e+00
%or.cond = and i1 %10, %12
br i1 %or.cond, label %loop, label %loopexit
44Samsung Open Source Group
Example
loop:
// zx = zy = zx2 = zy2 = 0;
%zy2.06 = phi double [ %8, %loop ], [ 0.000000e+00, %preheader ]
%zx2.05 = phi double [ %7, %loop ], [ 0.000000e+00, %preheader ]
%zy.04 = phi double [ %4, %loop ], [ 0.000000e+00, %preheader ]
%zx.03 = phi double [ %6, %loop ], [ 0.000000e+00, %preheader ]
%iter.02 = phi i32 [ %9, %loop ], [ 0, %preheader ]
// zy = 2 * zx * zy + y;
%2 = fmul double %zx.03, 2.000000e+00
%3 = fmul double %2, %zy.04
%4 = fadd double %3, %y
// zx = zx2 - zy2 + x;
%5 = fsub double %zx2.05, %zy2.06
%6 = fadd double %5, %x
// zx2 = zx * zx;
%7 = fmul double %6, %6
// zy2 = zy * zy;
%8 = fmul double %4, %4
// iter++
%9 = add i32 %iter.02, 1
// iter < max_iter
%10 = icmp ult i32 %9, %max_iter
// zx2 + zy2 < 4
%11 = fadd double %7, %8
%12 = fcmp olt double %11, 4.000000e+00
// &&
%or.cond = and i1 %10, %12
br i1 %or.cond, label %loop, label %loopexit
zx = zy = zx2 = zy2 = 0;
for (;
iter < max_iter
&& zx2 + zy2 < 4;
iter++) {
zy = 2 * zx * zy + y;
zx = zx2 - zy2 + x;
zx2 = zx * zx;
zy2 = zy * zy;
}
45Samsung Open Source Group
Example .LBB0_2:
@ d17 = 2 * zx
vadd.f64 d17, d12, d12
@ iter < max_iter
cmp r1, r0
@ d17 = (2 * zx) * zy
vmul.f64 d17, d17, d11
@ d18 = zx2 - zy2
vsub.f64 d18, d10, d8
@ d12 = (zx2 – zy2) + x
vadd.f64 d12, d18, d0
@ d11 = (2 * zx * zy) + y
vadd.f64 d11, d17, d9
@ zx2 = zx * zx
vmul.f64 d10, d12, d12
@ zy2 = zy * zy
vmul.f64 d8, d11, d11
bhs .LBB0_5
@ BB#3:
@ zx2 + zy2
vadd.f64 d17, d10, d8
@ iter++
adds r1, #1
@ zx2 + zy2 < 4
vcmpe.f64 d17, d16
vmrs APSR_nzcv, fpscr
bmi .LBB0_2
b .LBB0_5
zx = zy = zx2 = zy2 = 0;
for (;
iter < max_iter
&& zx2 + zy2 < 4;
iter++) {
zy = 2 * zx * zy + y;
zx = zx2 - zy2 + x;
zx2 = zx * zx;
zy2 = zy * zy;
}

Clang: More than just a C/C++ Compiler

  • 1.
    1Samsung Open SourceGroup Clang: Much more than just a C/C++ Compiler Tilmann Scheller Principal Compiler Engineer t.scheller@samsung.com Samsung Open Source Group Samsung Research UK LinuxCon Europe 2016 Berlin, Germany, October 4 – 6, 2016
  • 2.
    2Samsung Open SourceGroup Overview ● Introduction ● LLVM Overview ● Clang ● Summary
  • 3.
    3Samsung Open SourceGroup Introduction
  • 4.
    4Samsung Open SourceGroup What is LLVM? ● Mature, production-quality compiler framework ● Modular architecture ● Heavily optimizing static and dynamic compiler ● Supports all major architectures (x86, ARM, MIPS, PowerPC, …) ● Powerful link-time optimizations (LTO) ● Permissive license (BSD-like)
  • 5.
    5Samsung Open SourceGroup LLVM sub-projects ● Clang C/C++/Objective C frontend and static analyzer ● LLDB Next generation debugger leveraging the LLVM libraries, e.g. the Clang expression parser ● lld Framework for creating linkers, will make Clang independent of the system linker in the future ● Polly Polyhedral optimizer for LLVM, e.g. high-level loop optimizations and data-locality optimizations
  • 6.
    6Samsung Open SourceGroup Which companies are contributing? ®
  • 7.
    7Samsung Open SourceGroup Who is using LLVM? ● Rust ● Android (NDK, RenderScript) ● Portable NativeClient (PNaCl) ● Majority of OpenCL implementations based on Clang/LLVM ● CUDA ● LLVM on Linux: LLVMLinux, LLVMpipe (software rasterizer in Mesa), AMDGPU drivers in Mesa
  • 8.
    8Samsung Open SourceGroup Clang users ● Default compiler on macOS ● Default compiler on FreeBSD ● Default compiler for native applications on Tizen ● Default compiler on OpenMandriva Lx 3.0 ● Debian experimenting with Clang as an additional compiler (94.4% of ~24.5k packages successfully built with Clang 3.8.1) ● Android NDK defaults to Clang
  • 9.
    9Samsung Open SourceGroup LLVM Overview
  • 10.
    10Samsung Open SourceGroup LLVM ● LLVM IR (Intermediate Representation) ● Scalar optimizations ● Interprocedural optimizations ● Auto-vectorizer (BB, Loop and SLP) ● Profile-guided optimizations
  • 11.
    11Samsung Open SourceGroup Compiler architecture C Frontend C++ Frontend Fortran Frontend Optimizer x86 Backend ARM Backend MIPS Backend
  • 12.
    12Samsung Open SourceGroup Compilation steps ● Many steps involved in the translation from C source code to machine code: – Frontend: ● Lexing, Parsing, AST (Abstract Syntax Tree) construction ● Translation to LLVM IR – Middle-end ● Target-independent optimizations (Analyses & Transformations) – Backend: ● Translation into a DAG (Directed Acyclic Graph) ● Instruction selection: Pattern matching on the DAG ● Instruction scheduling: Assigning an order of execution ● Register allocation: Trying to reduce memory traffic
  • 13.
    13Samsung Open SourceGroup LLVM Intermediate Representation ● The representation of the middle-end ● The majority of optimizations is done at LLVM IR level ● Low-level representation which carries type information ● RISC-like three-address code in static single assignment form (SSA) with an infinite number of virtual registers ● Three different formats: bitcode (compact on-disk format), in-memory representation and textual representation (LLVM assembly language)
  • 14.
    14Samsung Open SourceGroup LLVM IR Overview ● Arithmetic: add, sub, mul, udiv, sdiv, ... – %tmp = add i32 %indvar, -512 ● Logical operations: shl, lshr, ashr, and, or, xor – %shr21 = ashr i32 %mul20, 8 ● Memory access: load, store, alloca, getelementptr – %tmp3 = load i64* %tmp2 ● Comparison: icmp, select – %cmp12 = icmp slt i32 %add, 1024 ● Control flow: call, ret, br, switch, ... – call void @foo(i32 %phitmp) ● Types: integer, floating point, vector, structure, array, ... – i32, i342, double, <4 x float>, {i8, <2 x i16>}, [40 x i32]
  • 15.
    15Samsung Open SourceGroup Target-independent code generator ● Part of the backend ● Domain specific language to describe the instruction set, register file, calling conventions (TableGen) ● Pattern matcher is generated automatically ● Backend is a mix of C++ and TableGen ● Usually generates assembly code, direct machine code emission is also possible
  • 16.
  • 17.
    17Samsung Open SourceGroup Clang ● Goals: – Fast compile time – Low memory usage – GCC compatibility – Expressive diagnostics ● Several tools built on top of Clang: – Clang static analyzer – clang-format, clang-tidy
  • 18.
    18Samsung Open SourceGroup Clang Static Analyzer ● Part of Clang ● Tries to find bugs without executing the program ● Slower than compilation ● False positives ● Source annotations ● Works best on C code ● Runs from the commandline (scan-build), web interface for results
  • 19.
    19Samsung Open SourceGroup Clang Static Analyzer ● Core Checkers ● C++ Checkers ● Dead Code Checkers ● Security Checkers ● Unix Checkers
  • 20.
    20Samsung Open SourceGroup Clang Static Analyzer
  • 21.
    21Samsung Open SourceGroup Clang Static Analyzer
  • 22.
    22Samsung Open SourceGroup Clang Static Analyzer - Example ...
  • 23.
    23Samsung Open SourceGroup Clang Static Analyzer - Example
  • 24.
    24Samsung Open SourceGroup Clang Static Analyzer - Example
  • 25.
    25Samsung Open SourceGroup clang-format ● Automatic code formatting ● Consistent coding style is important ● Developers spend a lot of time on code formatting (e.g. requesting trivial formatting changes in reviews) ● Supports different coding conventions (~80 settings) ● Includes configurations for LLVM, Google, Chromium, Mozilla and WebKit coding conventions
  • 26.
    26Samsung Open SourceGroup clang-format ● Once the codebase is "clang-format clean" the coding conventions can be enforced automatically ● Simplifies reformatting after automated refactorings ● Uses the Clang lexer ● Supports the following programming languages: C/C++, Java, JavaScript, Objective-C and Protobuf
  • 27.
    27Samsung Open SourceGroup clang-tidy ● Detect bug prone coding patterns ● Enforce coding conventions ● Advocate modern and maintainable code ● Checks can be more expensive than compilation ● Currently 136 different checks ● Can run static analyzer checks as well
  • 28.
    28Samsung Open SourceGroup Sanitizers ● LLVM/Clang-based Sanitizer projects: – AddressSanitizer – Fast memory error detector – ThreadSanitizer – Detects data races – LeakSanitizer – Memory leak detector – MemorySanitizer – Detects reads of uninitialized variables – UBSanitizer – Detects undefined behavior
  • 29.
    29Samsung Open SourceGroup AddressSanitizer: Stack Buffer Overflow int main(int argc, char **argv) { int stack_array[100]; stack_array[1] = 0; return stack_array[argc + 100]; } Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html $ clang++ -O1 -fsanitize=address a.cc; ./a.out ==10589== ERROR: AddressSanitizer stack-buffer-overflow READ of size 4 at 0x7f5620d981b4 thread T0 #0 0x4024e8 in main a.cc:4 Address 0x7f5620d981b4 is located at offset 436 in frame <main> of T0's stack: This frame has 1 object(s): [32, 432) 'stack_array'
  • 30.
    30Samsung Open SourceGroup AddressSanitizer: Use-After-Free int main(int argc, char **argv) { int *array = new int[100]; delete [] array; return array[argc]; } $ clang++ -O1 -fsanitize=address a.cc && ./a.out ==30226== ERROR: AddressSanitizer heap-use-after-free READ of size 4 at 0x7faa07fce084 thread T0 #0 0x40433c in main a.cc:4 0x7faa07fce084 is located 4 bytes inside of 400-byte region freed by thread T0 here: #0 0x4058fd in operator delete[](void*) _asan_rtl_ #1 0x404303 in main a.cc:3 previously allocated by thread T0 here: #0 0x405579 in operator new[](unsigned long) _asan_rtl_ #1 0x4042f3 in main a.cc:2 Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html
  • 31.
    31Samsung Open SourceGroup AddressSanitizer: Stack-Use-After-Return int main() { LeakLocal(); return *g; } $ clang++ -g -fsanitize=address a.cc $ ASAN_OPTIONS=detect_stack_use_after_return=1 ./a.out ==19177==ERROR: AddressSanitizer: stack-use-after-return READ of size 4 at 0x7f473d0000a0 thread T0 #0 0x461ccf in main a.cc:8 Address is located in stack of thread T0 at offset 32 in frame #0 0x461a5f in LeakLocal() a.cc:2 This frame has 1 object(s): [32, 36) 'local' <== Memory access at offset 32 int *g; void LeakLocal() { int local; g = &local; } Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html
  • 32.
    32Samsung Open SourceGroup MemorySanitizer: Uninitialized Data int main(int argc, char **argv) { int x[10]; x[0] = 1; return x[argc]; } $ clang -fsanitize=memory a.c -g; ./a.out WARNING: Use of uninitialized value #0 0x7f1c31f16d10 in main a.cc:4 Uninitialized value was created by an allocation of 'x' in the stack frame of function 'main' Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html
  • 33.
    33Samsung Open SourceGroup UBSanitizer: Integer Overflow int main(int argc, char **argv) { int t = argc << 16; return t * t; } $ clang -fsanitize=undefined a.cc -g; ./a.out a.cc:3:12: runtime error: signed integer overflow: 65536 * 65536 cannot be represented in type 'int' Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html
  • 34.
    34Samsung Open SourceGroup UBSanitizer: Invalid Shift int main(int argc, char **argv) { return (1 << (32 * argc)) == 0; } $ clang -fsanitize=undefined a.cc -g; ./a.out a.cc:2:13: runtime error: shift exponent 32 is too large for 32-bit type 'int' Example from https://linuxplumbersconf.org/2015/ocw/proposals/3261.html
  • 35.
    35Samsung Open SourceGroup LibFuzzer ● Coverage-guided fuzz testing ● Coverage data provided by SanitizerCoverage (very low overhead, tracking of function-level coverage causes no measurable overhead) ● Best used in combination with the different Sanitizers ● LLVM project has bots which are fuzzing clang-format and Clang continuously
  • 36.
    36Samsung Open SourceGroup IDEs/code browsers using Clang ● Code browsers: – SourceWeb – Woboq Code Browser – Coati – Doxygen ● Editor plugins: – YouCompleteMe – clang_complete – rtags ● IDEs: – KDevelop – Qt Creator – CodeLite – Geany
  • 37.
    37Samsung Open SourceGroup Summary
  • 38.
    38Samsung Open SourceGroup Summary ● Great compiler infrastructure ● Fast C/C++ compiler with expressive diagnostics ● Bug detection at compile time ● Automated formatting of code ● Detect bugs early with Sanitizers ● Highly accurate source code browsing, code completion
  • 39.
    39Samsung Open SourceGroup Give it a try! ● Visit llvm.org ● Distributions with Clang/LLVM packages: – Fedora – Debian/Ubuntu – openSUSE – Arch Linux – ...and many more
  • 40.
  • 41.
    41Samsung Open SourceGroup Contact Information: Tilmann Scheller t.scheller@samsung.com Samsung Open Source Group Samsung Research UK
  • 42.
    42Samsung Open SourceGroup Example zx = zy = zx2 = zy2 = 0; for (; iter < max_iter && zx2 + zy2 < 4; iter++) { zy = 2 * zx * zy + y; zx = zx2 - zy2 + x; zx2 = zx * zx; zy2 = zy * zy; }
  • 43.
    43Samsung Open SourceGroup Example zx = zy = zx2 = zy2 = 0; for (; iter < max_iter && zx2 + zy2 < 4; iter++) { zy = 2 * zx * zy + y; zx = zx2 - zy2 + x; zx2 = zx * zx; zy2 = zy * zy; } loop: %zy2.06 = phi double [ %8, %loop ], [ 0.000000e+00, %preheader ] %zx2.05 = phi double [ %7, %loop ], [ 0.000000e+00, %preheader ] %zy.04 = phi double [ %4, %loop ], [ 0.000000e+00, %preheader ] %zx.03 = phi double [ %6, %loop ], [ 0.000000e+00, %preheader ] %iter.02 = phi i32 [ %9, %loop ], [ 0, %.lr.ph.preheader ] %2 = fmul double %zx.03, 2.000000e+00 %3 = fmul double %2, %zy.04 %4 = fadd double %3, %y %5 = fsub double %zx2.05, %zy2.06 %6 = fadd double %5, %x %7 = fmul double %6, %6 %8 = fmul double %4, %4 %9 = add i32 %iter.02, 1 %10 = icmp ult i32 %9, %max_iter %11 = fadd double %7, %8 %12 = fcmp olt double %11, 4.000000e+00 %or.cond = and i1 %10, %12 br i1 %or.cond, label %loop, label %loopexit
  • 44.
    44Samsung Open SourceGroup Example loop: // zx = zy = zx2 = zy2 = 0; %zy2.06 = phi double [ %8, %loop ], [ 0.000000e+00, %preheader ] %zx2.05 = phi double [ %7, %loop ], [ 0.000000e+00, %preheader ] %zy.04 = phi double [ %4, %loop ], [ 0.000000e+00, %preheader ] %zx.03 = phi double [ %6, %loop ], [ 0.000000e+00, %preheader ] %iter.02 = phi i32 [ %9, %loop ], [ 0, %preheader ] // zy = 2 * zx * zy + y; %2 = fmul double %zx.03, 2.000000e+00 %3 = fmul double %2, %zy.04 %4 = fadd double %3, %y // zx = zx2 - zy2 + x; %5 = fsub double %zx2.05, %zy2.06 %6 = fadd double %5, %x // zx2 = zx * zx; %7 = fmul double %6, %6 // zy2 = zy * zy; %8 = fmul double %4, %4 // iter++ %9 = add i32 %iter.02, 1 // iter < max_iter %10 = icmp ult i32 %9, %max_iter // zx2 + zy2 < 4 %11 = fadd double %7, %8 %12 = fcmp olt double %11, 4.000000e+00 // && %or.cond = and i1 %10, %12 br i1 %or.cond, label %loop, label %loopexit zx = zy = zx2 = zy2 = 0; for (; iter < max_iter && zx2 + zy2 < 4; iter++) { zy = 2 * zx * zy + y; zx = zx2 - zy2 + x; zx2 = zx * zx; zy2 = zy * zy; }
  • 45.
    45Samsung Open SourceGroup Example .LBB0_2: @ d17 = 2 * zx vadd.f64 d17, d12, d12 @ iter < max_iter cmp r1, r0 @ d17 = (2 * zx) * zy vmul.f64 d17, d17, d11 @ d18 = zx2 - zy2 vsub.f64 d18, d10, d8 @ d12 = (zx2 – zy2) + x vadd.f64 d12, d18, d0 @ d11 = (2 * zx * zy) + y vadd.f64 d11, d17, d9 @ zx2 = zx * zx vmul.f64 d10, d12, d12 @ zy2 = zy * zy vmul.f64 d8, d11, d11 bhs .LBB0_5 @ BB#3: @ zx2 + zy2 vadd.f64 d17, d10, d8 @ iter++ adds r1, #1 @ zx2 + zy2 < 4 vcmpe.f64 d17, d16 vmrs APSR_nzcv, fpscr bmi .LBB0_2 b .LBB0_5 zx = zy = zx2 = zy2 = 0; for (; iter < max_iter && zx2 + zy2 < 4; iter++) { zy = 2 * zx * zy + y; zx = zx2 - zy2 + x; zx2 = zx * zx; zy2 = zy * zy; }