LCE13: GNU Toolchain - Compiler Performance


Resource: LCE13
Name: GNU Toolchain - Compiler Performance
Date: 11-07-2013
Speaker: Matthew Gretton-Dann
Video: http://youtu.be/gSUWbe71NIs

Published in: Technology


  1. GNU Toolchain: Compiler performance
  2. Team: Christophe, Kugan, Venkat, Yvan, Zhenqiang
  3. Achieved since LCA13
     ● Switched to gcc-4.8
       – Lots of backports from trunk
     ● gcc-4.7 is now in maintenance
     ● Improved epilogues of leaf functions (can now use LR)
     ● Shrink-wrapping
     ● Progress on conditional compare support
     ● Progress on VRP (Value Range Propagation)
     ● Progress on divmod optimisation
     ● Progress on disabling loop peeling
     ● Address sanitizer
  4. ● Shrink-wrapping: move the prologue/epilogue inside the function body

         void test (int a) {
             if (a == 0) return;
             ….
         }

       Without shrink-wrapping:

             push {….}
             if (a == 0) goto Lx;
             …..
         Lx: pop {…}
             return

       With shrink-wrapping:

             if (a == 0) return;
             push {…}
             ….
             pop {…}
             return

     ● Conditional compare support: short-circuit &&/|| if possible:

         if (a == b && c == d)

             cmp   a, b
             cmpeq c, d

     ● VRP: helps remove useless sign/zero extensions

         short foo(unsigned char c) {
             c = c & (unsigned char)0x0F;
             if (c > 7) {
                 return c - 6;
             }
             return c;
         }

       Without VRP:

         foo:
             and    r0, r0, #15
             cmp    r0, #7
             subhi  r0, r0, #6
             uxthhi r0, r0
             sxth   r0, r0
             bx     lr

       With VRP:

         foo:
             and    r0, r0, #15
             cmp    r0, #7
             subhi  r0, r0, #6
             bx     lr

     ● Divmod: the ARM runtime library contains a routine that computes div and mod at the same time

         X = a / b;   // call div()
         Y = a % b;   // call mod()

       becomes

         (x,y) = divmod(a,b)
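The divmod idea can be sketched in portable C. This is only an illustration, not the GCC implementation: it uses the standard `div()` function from `<stdlib.h>`, which returns quotient and remainder from a single call, as a stand-in for the combined ARM runtime routine (in the ARM EABI this is believed to be `__aeabi_idivmod`; treat that name as an assumption here). The names `quot_rem`, `divmod_separate` and `divmod_once` are hypothetical and do not come from the slides.

```c
#include <stdlib.h>

/* Hypothetical result type: quotient and remainder together. */
typedef struct {
    int quot;
    int rem;
} quot_rem;

/* The pattern from the slide: two separate operations. On a core
   without a hardware divide instruction, each may become a separate
   library call. */
static inline quot_rem divmod_separate(int a, int b)
{
    quot_rem out;
    out.quot = a / b;   /* "call div()" on the slide */
    out.rem  = a % b;   /* "call mod()" on the slide */
    return out;
}

/* The combined form: one call yields both results. Standard C div()
   plays the role of the runtime's divmod routine in this sketch. */
static inline quot_rem divmod_once(int a, int b)
{
    div_t r = div(a, b);
    quot_rem out = { r.quot, r.rem };
    return out;
}
```

Both helpers return the same pair for any non-zero divisor, for example quotient 3 and remainder 2 for `divmod_once(17, 5)`. The optimisation described on the slide lets the compiler perform this rewrite itself when it sees `a / b` and `a % b` on the same operands.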
  5. ● Loop peeling: generate iterations outside the loop so that the loop body makes aligned memory accesses for vectorization.
       – Mostly useless on ARM, which supports unaligned memory accesses.
     ● Address sanitizer: new GCC framework to identify NULL pointer accesses, invalid memory references....
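Loop peeling for alignment can be written out by hand to show what the compiler generates. The sketch below is an assumption-laden illustration, not GCC output: the function name `sum_peeled`, the 16-byte alignment target and the unroll factor of 4 are all chosen for the example (16 bytes matches a 128-bit vector register).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: sum n 32-bit integers with a peeled prologue. */
static inline int32_t sum_peeled(const int32_t *p, size_t n)
{
    int32_t total = 0;
    size_t i = 0;

    /* Peeled iterations: run scalar code until p + i sits on a
       16-byte boundary (assumed vector alignment). */
    while (i < n && ((uintptr_t)(p + i) % 16u) != 0) {
        total += p[i];
        i++;
    }

    /* Main body: every group of four elements now starts at an
       aligned address, so a vectorizer could use aligned loads. */
    for (; i + 4 <= n; i += 4) {
        total += p[i] + p[i + 1] + p[i + 2] + p[i + 3];
    }

    /* Remaining elements that do not fill a group of four. */
    for (; i < n; i++) {
        total += p[i];
    }
    return total;
}
```

The peeled prologue and the extra remainder loop add code size and branches. On a target that handles unaligned vector loads well, that cost buys little, which is the motivation for disabling the transformation there.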
  6. Next iteration
     ● Spec2k analysis
       – Comparison with x86
       – Looking for hot spots
       – Identify and prioritise actions
     ● Shrink wrapping improvements
     ● Conditional compares
     ● Finalize loop peeling improvements
     ● Neon intrinsics improvements
     ● GCC trunk backports
     ● Compiler target hooks audit